
Asset Management

Paolo Vanini
University of Basel

October 22, 2016

Contents

1 Introduction and Summary
1.1 Game Changers
1.2 Regulation and Technology
1.3 Fundamental Issues in AM
1.4 Investment Theory Synthesis
1.5 Global Asset Management Industry
1.6 Varia

2 Fundamentals
2.1 Returns
2.1.1 Time Value of Money
2.1.2 Returns and Return Attribution
2.1.3 Returns and Leverage
2.2 Investors
2.2.1 Sovereign Wealth Funds (SWFs)
2.2.2 Pension Funds
2.2.3 Management of Pension Funds
2.2.4 Private Investors
2.2.5 Summary
2.3 The Efficient Market Hypothesis (EMH)
2.3.1 Predictions
2.3.2 Importance of EMH for Asset Management
2.3.3 Evidence for the EMH
2.4 Wealth of Nations
2.5 Who Decides?
2.5.1 MiFID II
2.5.2 Investment Process for Retail Clients
2.5.3 Mandate Solutions for Pension Funds
2.5.4 Conduct Risk
2.6 Risk, Return, and Diversification
2.6.1 Risk Scaling
2.6.2 Long Term Investment and Retirement Risk
2.6.3 Costs and Performance
2.6.4 A First Step toward Passive versus Active Investment
2.7 Foundations of Investment Decisions
2.7.1 Statistical Models
2.7.2 Heuristic Models
2.8 Portfolio Construction
2.8.1 Steps in Portfolio Construction
2.8.2 Static 60/40 Portfolio
2.8.3 Factor Models
2.8.4 Optimal Portfolio Construction: Markowitz
2.8.5 Optimization, SAA, TAA and Benchmarking
2.8.6 Risk-Based Portfolio Construction
2.9 Factor Investing
2.9.1 The CAPM
2.9.2 Fama-French 3- and 5-Factor Models
2.9.3 Factor Investment - Industry Approach
2.10 Views and Portfolio Construction - The Black-Litterman Model
2.10.1 Mixed Models Logic
2.10.2 Black-Litterman Model
2.11 Active Risk-Based Investing
2.11.1 Implicit Views
2.11.2 Active Views
2.12 Entropy Pooling Model
2.12.1 Factor Entropy Pooling
2.13 CIO Investment Process
2.14 Simplicity, Over-simplicity, and Complexity
2.14.1 The Faber Model
2.14.2 Statistical Significance

3 Investment Theory Synthesis
3.1 Modern Asset Pricing and Portfolio Theory
3.1.1 Absolute Pricing
3.1.2 Simple General Equilibrium Model
3.1.3 Relative Pricing
3.2 Absolute Pricing: Optimal Asset Pricing Equation
3.2.1 Equivalence: Discount Factors, Risk Factors, and Mean-Variance Model
3.2.2 Multi-Period Asset Pricing and Multi-Risk-Factors Models
3.2.3 Low Volatility Strategies
3.2.4 What Happens if an Investment Strategy is Known to Everyone?
3.3 Absolute Pricing: Optimal Investment Strategy and Rebalancing
3.3.1 General Rebalancing Facts
3.3.2 Convex and Concave Strategies
3.3.3 Do Investors Rebalance (Enough)?
3.3.4 Rebalancing = Short Volatility Strategy
3.3.5 Rebalancing: A Source for Portfolio Return?
3.4 Short-Term versus Long-Term Investment Horizons
3.4.1 Questions and Observations
3.4.2 Short-Term versus Long-Term Investments in the Great Financial Crisis (GFC)
3.4.3 Time-Varying Investment Opportunities
3.4.4 Practice of Long-Term Investment
3.4.5 Fallacies
3.5 Risk Factors
3.5.1 Returns and Risk Factors Sorting
3.5.2 Sustainability of Risk Factors
3.6 Optimal Investment - The Herding of Pension Funds
3.7 Alternatives to Rational Models - Behavioral Approaches
3.8 Real-Estate Risk
3.8.1 US market: Repeated Sales Index versus Constant Quality Index
3.8.2 Constant Quality Index: Greater London and Zurich Area
3.8.3 Investment
3.9 Relative Pricing - No Arbitrage
3.9.1 Main Idea
3.9.2 Theory
3.9.3 CAPM and No Arbitrage
3.9.4 Arbitrage Pricing Theory (APT)
3.10 Four Asset Pricing Formulae

4 Global Asset Management
4.1 Asset Management Industry
4.1.1 The Demand Side
4.1.2 The Supply Side
4.1.3 Asset Management Industry in the Financial System - the Eurozone
4.1.4 Global Figures 2007-2014
4.1.5 Asset Management vs Trading Characteristics
4.1.6 Dynamics of the Asset Management Industry
4.1.7 Institutional Asset Management versus Wealth Management
4.2 The Fund Industry - An Overview
4.2.1 Types of Funds and Size
4.3 Mutual Funds and SICAVs
4.3.1 US Mutual Funds versus European UCITS
4.3.2 Functions of Mutual Funds
4.3.3 The European Fund Industry - UCITS
4.3.4 Active vs Passive Investments: Methods and Empirical Facts
4.3.5 Fees for Mutual Funds
4.4 Index Funds and ETFs
4.4.1 Capital Weighted Index Funds
4.4.2 Risk Weighted Index Funds
4.4.3 ETFs
4.4.4 Evolution of Expense Ratios for Actively Managed Funds, Index Funds and ETFs
4.5 Alternative Investments (AIs) - Insurance-Linked Investments
4.5.1 Asset Class Transformation
4.5.2 Insurance-Linked Investments
4.6 Hedge Funds
4.6.1 What is a hedge fund (HF)?
4.6.2 Hedge Fund Industry
4.6.3 CTA Strategy
4.6.4 Fees
4.6.5 Leverage
4.6.6 Share Restrictions
4.6.7 Fund Flows and Capital Formation
4.6.8 Biases
4.6.9 Entries and Exits
4.6.10 Investment Performance
4.7 Event-Driven Investment Opportunities
4.7.1 Structured Products
4.7.2 Political Events: Swiss National Bank (SNB) and ECB
4.7.3 Opportunities to Invest in High Dividend Paying EU Stocks
4.7.4 Low-Barrier BRCs
4.7.5 Japan: Abenomics
4.7.6 Market Events
4.7.7 Negative Credit Basis after the Most Recent GFC
4.7.8 Positive Credit Basis 2014
4.8 The Investment Process and Technology
4.8.1 Infrastructure Layer
4.8.2 Asset Management Challenges
4.9 Trends - FinTech
4.9.1 Generic Basis of FinTech
4.9.2 Investment Management
4.9.3 Market Provisioning
4.9.4 Trade Execution - Algo Trading
4.10 Trends - Big Data
4.10.1 Definitions
4.10.2 Risk
4.10.3 Survey
4.11 Trends - Blockchain and Bitcoin
4.11.1 Blockchain
4.11.2 Cryptography
4.11.3 Examples
4.11.4 Different Currencies
4.11.5 Bitcoin
4.11.6 Future of Blockchain and Bitcoins
4.11.7 Alternative Ledgers - Corda
4.12 Trends - Demography and Pension Funds
4.12.1 Demographic Facts
4.12.2 Pension Funds
4.12.3 Role of Asset Management
4.12.4 Investment Consultants
4.13 Trends - Uniformity of Minds
4.13.1 The Great Depression and the Great Recession
4.13.2 Uniformity of Minds

5 Appendix

6 References

Chapter 1

Introduction and Summary


Asset Management (AM) is one of the most fascinating disciplines in the field of financial intermediation. Assets and their management represent a key function of the modern economy. AM is the process of constructing, distributing, and maintaining compliant assets cost-effectively over their life cycle. This process is used throughout the pension fund system, in wealth management for individual investors, in enterprise management, and in public asset management.

A large part of the following chapters is devoted to the study of financial assets. Examples are cash, bonds, stocks, commodities, currencies, (interest) rates, credit, and derivatives and options of all types.

1.1 Game Changers

PwC (2015, 2012) identifies, among others, the following game changers for the asset management field:

Growth of wealth: Global assets under management (AuM) will exceed USD 100 trillion by 2020, up from USD 64 trillion in 2012 (other consulting firms estimate similar figures using their own models).

Regulation: Asset management moves center stage. Historically, banks dominated the financial industry; they were the innovators. Insurance companies also attracted large asset flows. But after the Great Financial Crisis (GFC) of 2008, an avalanche of regulatory initiatives focused on banks and insurers, while AM firms face much less regulatory hindrance.

Longevity and demographics: As parts of the world age, retirement and health care will become critical issues. The old-age dependency ratio - the ratio of retired persons to the working-age population - will reach 25.4 percent in 2050, up from 11.7 percent in 2010. Asset managers will therefore focus on long-term investments as well as on customized asset decumulation offerings for their clients. This change particularly affects the US, Japan, most European countries, South Korea, Singapore, Taiwan, and China. Longevity also continues to rise, which will increase the costs of health care and of care for the elderly. In sum, AM clients will need to save more to pay for that care.

Distribution of AM services will be redrawn: economies of scale push distribution onto global platforms while, on the other side, increasing compliance complexity strengthens regional platforms.

Fees will continue to decrease for most asset management solutions, and regulation requires many existing fee models to be transformed.

Alternative investments transform into traditional ones, and exchange-traded funds (ETFs) continue to proliferate.

Hence, the asset management process is currently undergoing a radical change. The main immediate driving forces are wealth growth, regulation, and technology; demographic change is less immediate. Climate change is missing from the above list of game changers and is also out of scope in these notes. The list also makes no explicit reference to investment performance: How is USD 1 best invested? Yet for investors, the performance of their invested capital has top priority. The method of investing has a priori nothing to do with technology or regulation. Lord Keynes, for example, achieved over 19 years an excess return over the S&P 500 of 17% per annum. But in the 1920s and 1930s, technology and regulation played no role compared to the present; this performance was realized using economic analytical competence alone.

The discussion of investment methods therefore remains a main issue. Different from Keynes's era, today's technology makes new approaches to investing possible. Such potential links between technology and investment methodology - the robo-advisor, for example - are as important as technology seen in the context of process efficiency, changing market infrastructure, and data integration.

1.2 Regulation and Technology

While regulation was the focus of many financial intermediaries following the Great Financial Crisis (GFC), the changes brought about by technology are recognized by a majority of asset managers as being at least as important for the future of the industry. Technological changes in the financial industry are often encapsulated by buzzwords such as FinTech and big data. Traditional asset managers face competition from new entrants - firms with huge technological know-how that could act disruptively, that is, take over parts of the asset manager's value chain. Comparing regulation and technology at a high level of abstraction, we note:


Technology is irreversible while regulation is not: regulators can revoke any regulatory rule, but technology that proves useful to people cannot be stopped - how would one stop the use of iPhones?

Technology has, overall, a positive connotation - it improves living conditions and is creative. Regulation, despite its goals of making the financial system safer and protecting customers, fails to be seen in the same way.
Why are FinTech and big data en vogue right now? Technology has always played an important role in financial intermediation. But this time there are differences. First, financial intermediaries have to adopt a stronger classical industry perspective to improve or maintain the profitability of their business, given increasing costs and decreasing revenue margins for many products and services. Second, technology has matured to a level where abstract banking and asset management products can be understood, explored, and valued by customers in entirely different ways than in the past; present technology is closer to humans than past computers were. It would be interesting to know how many individuals would rank their iPhone among their best friends. Third, technology has the capacity to replace humans even for complex activities in the value chain. Fourth, digital natives are just starting to consider the management of their assets.

1.3 Fundamental Issues in AM

In Chapter 2 several fundamental issues in asset management are discussed. We introduce some investor types - sovereign wealth funds, pension funds, and private investors - and their investment motivations. Then the growth in global wealth is explained, which defines the demand for asset management services.

Who decides? The question regarding different types of decision-making is discussed in Section 2.5. The discussion is structured along the lines of investment suitability and appropriateness in the light of MiFID II regulation. The long remainder of the chapter is about investment methods and portfolio construction. We start with an empirical discussion of the foundations of investment decisions and explore some basic facts about statistical and heuristic models, the risk-return relationship, notions of diversification, concentration and diversity, and risk scaling.
A challenge related to diversification measured by correlation is its time variability. Two risks that are only weakly dependent today may move almost uniformly in a period when markets are under stress. This evaporation of diversification is illustrated by the traditional static 60/40 portfolio. The discussion of how to appropriately account for the dependence of assets is part of the asset selection and asset allocation processes, which define the first two steps in portfolio construction:
Asset selection.
Asset allocation.
Asset implementation.

The asset selection and grouping define the opportunity set of the investor. The asset allocation determines how much capital is invested, today and in the future, in the selected assets. At this stage, investment is still theory. Implementation maps the asset allocation into trades. We focus on the first two aspects of portfolio construction, since writing in detail about asset implementation would amount to writing a handbook of AM that depends on the specific markets, products, product wrappers, tax issues, and legal constraints.
The selection and grouping of assets not only defines the opportunity set; it also defines the level of risk aggregation. We discuss the pros and cons of grouping by asset classes versus risk factors. The conclusion I draw is that asset grouping is switching away from the traditional asset classes to risk factors.
Asset allocation methods can be divided into optimal investment rules and heuristic ones. We call heuristic any approach that is not developed from first economic principles. Besides ad hoc rules, risk parity, risk budgeting, and big-data-based models are also heuristic. The main lesson of the discussion is: there is no single method in asset allocation that dominates all others from an investor's perspective. As a first portfolio construction model, the static 60/40 investment rule in equities/bonds is considered and the weakness of this simple rule in turbulent markets is discussed. This shows that traditional asset classes turn out to be the wrong level of risk aggregation, which motivates the introduction of factor models.
Section 2.8 introduces the classic Markowitz model as a first model where a mathematical optimization is used to select portfolios. I assume that this model is known to the reader. We therefore emphasize the intuitions of the model in the text and delegate the formal aspects to the exercises. We compare the model with several other models, including risk parity, equal weights, and market weights. We discuss some problems of the Markowitz model: the estimation risk of the covariances and returns, and the stability properties of the optimal portfolios. We conclude the section with so-called risk-based portfolio construction - that is to say, portfolios that are not based on an optimization.
Factor investing is presented in Section 2.9. We start with the one-factor capital asset pricing model (CAPM), and continue with the Fama-French three-factor model and their more recent five-factor approach. Then, the factor-investing offering of large asset management firms is discussed.
Section 2.10 introduces portfolio construction in which investment views are incorporated in a theoretical model setup. We discuss the Black-Litterman model and the entropy pooling approach of Meucci. We conclude with some practical issues concerning the role of the Chief Investment Officer (CIO) in the investment process.
The last section in Chapter 2 considers the simple investment model of Faber, originally presented in the most-downloaded paper in the world's largest social science research network. How significant are the results of this model?
Why do we place so much emphasis on the different portfolio construction methods? Portfolio construction methods that are 60 years old (Markowitz) are still being used today. Despite their many weaknesses, these classic models retain their currency, and many new portfolio constructions use the same methodologies as the classic ones. Optimal portfolio construction is a very demanding task in economics, since such a construction has to take into consideration changing investment opportunity sets and the different behavior of investors.
This raises doubts about whether optimal statistical models are appropriate at all and whether one should instead use heuristic methods in AM. The heuristic approach is radically different from the statistical one. Heuristics are methods that solve problems using rules of thumb, practical methods, or experience. Heuristics need not be optimal in a statistical modelling sense.
Heuristics are often seen as a poor man's concept. But when a statistical model approach is flawed - for lack of data, estimation risk, or the complexity of a decision, for example - heuristic approaches are meaningful. The Markowitz model, for example, provides the investor with an investment strategy that is often judged to be too sensitive - that is to say, small variations in the data input parameters lead to large changes in the optimal portfolio output. Heuristic thinking is, then, often imposed on these models to obtain acceptable solutions.
Another reason for the use of heuristics arises if one distinguishes between risk and uncertainty. These different concepts lead to different behaviors, and it is impossible to transform uncertainty-related issues into risk-related ones and vice versa. According to Knight (1921), risk refers to situations of perfect knowledge about the probabilities of all outcomes for all alternatives. This makes it possible to calculate optimal choices. Uncertainty, on the other hand, refers to situations in which the probability distributions are unknown or unknowable - that is to say, risk cannot be calculated at all. Decision-making under uncertainty is what our brain does most of the time; known-risk situations are relatively rare in real-life decision making. We discuss these issues in Section 2.7.

1.4 Investment Theory Synthesis

Chapter 3 explains portfolio construction starting from first economic principles. The reason for reconsidering portfolio construction from an economic theory perspective is the need to understand whether a particular investment approach is sustainable. This means, for example, that for each investor who buys a stock there must be another investor who sells the stock. An investment strategy that does not fit into the demand and supply of investors cannot be sustainable.
The core economic theory for investments is asset pricing and modern portfolio theory.
We first consider the general setup and then - in Section 3.2 - discuss the fundamental
asset pricing equation for absolute pricing, which holds in equilibrium. But the theory
also makes clear predictions about the optimal portfolio and the rebalancing of that portfolio, both of which are discussed in Section 3.3.
This economic theory approach is compared with empirical methods of portfolio construction such as the Fama-French models. While the economic theory approach makes clear predictions, its empirical performance is poor due to the non-observability of key variables in the theory. The empirical methods often perform better, but it is difficult to explain why they work; see the discussion below.
We conclude that there is no such thing as a generally accepted investment theory. In fact, there is a still-increasing number of competing theories and empirical models. We discuss in particular the zoo of risk factors and the difficulty investors face in differentiating between facts and fantasies; see Section 3.5.
We then discuss differences between short-term and long-term investments. The herding behavior of pension funds, as an example of long-term investor behavior, is discussed in Section 3.6.
Section 3.7 considers a further key economic concept for asset management: To what
extent can we predict future asset returns? The answer to this question has far-reaching
consequences for the value and meaningfulness of the different asset management approaches used in the industry. If market returns cannot be predicted in a statistical
sense, then active management adds no value.
The efficiency of markets is applied to real-estate risk in Section 3.8. Besides absolute pricing, relative pricing - or no-arbitrage theory - is a second fundamental pricing model. Relative pricing, which is the theory used to price derivatives, needs much less input data than absolute pricing, where the consumption, opportunity sets, and preferences of investors are needed. In relative pricing, one assumes that investors prefer more money to less and that free lunches are not possible in financial markets. As an example of a relative pricing approach we review arbitrage pricing theory (APT).

We conclude with a proposition that relates the different notions in relative and absolute pricing.

1.5 Global Asset Management Industry

Chapter 4 considers the global asset management industry. Section 4.1 provides an overview of the AM industry from different perspectives. As a summary, the valuation and market capitalization of asset management firms compared to banks and insurers between 2002 and 2015 is as follows (McKinsey (2015)):

Market capitalization indexed to 100 in 2002 increased to 516 for AM firms, to 313 for banks, and to 231 for insurers.

The P/E (price-earnings) ratio is 16.1 for AM firms, 11.3 for banks, and 14.8 for insurers.

Some figures for the AM industry in the same period are:

The global annual AuM growth rate is 5%. The main driver was market performance.

The growth of AuM is 13.1% in Europe, 13.5% in North America, and 226% in emerging markets (money market boom in China).

The absolute value of profits increased in the same period by 5% in Europe, 29% in North America, and 79% in the emerging markets. Profit margins, defined as the difference between the net revenue margin and the operating cost margin, are 13.3 bps in Europe, 12.5 bps in North America, and 20.6 bps in emerging markets. The revenue decline in Europe is due to the shift from active to passive investments, the shift to institutional clients, and the decrease in management fees. The revenue margin in the emerging markets is only slightly lower in 2014 compared to 2007, but the increase in the operating cost margin is significant. The absolute revenues in China, South Korea, and Taiwan are almost at par with the revenues in Japan, Germany, France, and Canada.

Retirement and defined contribution pension plans grew globally at a Compounded Annual Growth Rate (CAGR) of 7.5%.

Some observations on the product level in the same period are:

The growth rate of passive investments is larger than that of active solutions. The cumulated flows are 36% for passive fixed income and 22% for passive equity.

Standard active management is decreasing for standard equity strategies.

Active management of less liquid asset classes, or with more complex strategies, is increasing. An increase of 49% in cumulated flows for active balanced multi-asset is observed.

The increase in alternatives is 23% in cumulated flows.



Clients                   2012, USD tr.   2020, USD tr.   Growth rate p.a.
Pension funds             33.9            56.5            6.5%
Insurance companies       24.1            35.1            4.8%
Sovereign wealth funds    5.2             8.9             6.9%
HNWIs                     52.4            76.9            4.9%
Mass affluent             59.5            100.4           6.7%

Table 1.1: Expected AuM growth until 2020 (PwC [2014]).


While actively managed funds' growth is driven by a growing middle-class client base, institutional investors and HNWIs are the driving forces behind mandate growth.

Table 1.1 summarizes some expectations about AuM growth until 2020. The subsequent sections present mutual funds, SICAVs, index funds, and ETFs. We restrict the discussion of alternative investments to insurance-linked investments in Section 4.5. The definition, role, and properties of hedge funds are discussed in Section 4.6.

We compare US mutual funds with European UCITS in Section 4.3. The main results are that cross-border distribution worldwide has been most successful within the European UCITS format and that both UCITS funds and US mutual funds were originally quite restrictive in their investment guidelines but have started to use derivatives extensively (newCITS). A further difference is that US clients invest in existing funds while European investors are regularly offered new funds, leading to a decreasing number of US mutual funds and a strong increase in European funds. Finally, owing to this tendency to innovate permanently in Europe, European funds are on average around six times smaller than their US counterparts.
We try to give a decisive answer on the importance of luck and skill in active management in Section 4.3. Scaillet et al. (2013) consider 2,076 actively managed US open-end, domestic equity mutual funds between 1975 and 2006. They find that, after costs, only 0.6 percent can be considered to be skillfully managed. Furthermore, the proportion of skilled funds decreases from 14.4% (1990) to 0.6% (2006). Their analysis also considers different fund characteristics and their relation to skill and luck.

The analysis for a large sample of European funds delivers similar results, whereas for the sample of hedge funds some differences compared to mutual funds appear. Some key figures for hedge funds (HF), see Section 4.6, are:

The industry's size in 2014 was USD 2.85 trillion, versus USD 2.6 trillion in 2013.

The average growth in HF assets from 1990 to 2012 was roughly 14 percent per year.

The decrease in AuM after the Great Financial Crisis (GFC) in 2008 was fully recovered six years later.


In the years 2009 to 2012, HF performance was lower than the S&P 500.

Given the survivorship and backfill biases, why do databases not correct for these biases in a transparent and standardized form when publishing their data?

Given the many biases and the high fee structure, why is regulation for HF financial intermediaries much less severe than for banks, asset management firms, or insurance companies?

There are several facts that limit the alpha of the HF industry: the number of HF managers has increased from hundreds to more than 10,000 in the last two decades, markets are becoming more efficient, and one often finds an inverse relationship between the size of a successful hedge fund and its manager's ability to create alpha.

While the alphas of the HF industry have been decreasing steadily over the last two decades, correlation with broad stock market indices such as the S&P 500 shows the opposite evolution.

Summarizing costs for different types of AM products, we note that fees for mutual funds, ETFs, and hedge funds are still decreasing; see Table 1.2.
                   Equity   Bonds
Mutual funds (*)   0.74%    0.61%
Index funds (*)    0.12%    0.11%
ETFs (**)          0.49%    0.25%

Table 1.2: Fees p.a. in 2013 ((*) Investment Company Institute; (**) Lipper, DB Tracker, Barclays, BlackRock).
The whole discussion about how to invest has so far not considered how investors can act when market opportunities are generated by specific events. In this case, opportunistic investment replaces the portfolio investment approach. The wrapper of the asset management solution is then not a mutual fund or an ETF but products that have a very short time-to-market and that precisely match the investor's view when such an event occurs - that is to say, derivatives and structured products are the wrappers in such situations.

We consider the event of January 15, 2015, when the Swiss National Bank removed the floor on the EUR-CHF exchange rate, and the several investment opportunities thus created. Some of these opportunities lasted only a few days. The next two sections focus on the investment process and technology. We try to make clear what FinTech and big data mean and how they will affect the asset management industry.

Investments in FinTech rose from USD 1.4 billion in 2010 to USD 9.1 billion in 2014. The McKinsey (2015) survey of a sample of more than 12,000 FinTech startups states:



Target clients: 62% of the startups target private customers, 28% SMEs, and the rest large enterprises.

Function: Most startups work in the area of payment services (43%), followed by loans (24%), investments (18%), and deposits (15%).

The Future of Financial Services (2015) paper, written under the leadership of the World Economic Forum (WEF), identified 11 so-called clusters of innovation in six functions of financial services. The most important functions for AM are market provisioning and investment management, and the innovation clusters of interest to AM are new market platforms, smarter & faster machines, shifting customer preferences, process externalization, and empowered investors. Besides describing the status of FinTech, we propose a generic basis that allows us to understand the many facets of the innovations by using only a small number of building blocks.

A challenge is to define what the buzzword big data really means. Big data can be seen as a two-step process: first, raw data are transformed into model variables by averaging, integrating, aggregating, conditioning, or creating new classes in the original data set. The second step is to make a prediction based on the first step.
What will be the impact of the FinTech and big data revolution on employees in the asset management industry? Hal Varian, Chief Economist at Google, stated in 2009:

"I keep saying the sexy job in the next ten years will be statisticians. People think I'm joking, but who would've guessed that computer engineers would've been the sexy job of the 1990s?"

The disruptive nature of big data led him to conclude in 2013:

"I believe that these methods have a lot to offer and should be more widely known and used by economists. In fact, my standard advice to graduate students these days is go to the computer science department and take a class in machine learning."

Whatever the realized changes will be, the impact on employees is clear: fewer people will work in the AM industry (and in the financial industry in general), the required skills will shift towards a combination of computer science and mathematics, and even demanding activities will be automated.
The remaining sections consider, or reconsider, two trends: demography and the uniformity of minds. One trend is missing - investments in the light of climate change. This omission has nothing to do with the importance of the trend but with my own ignorance of the issue. Several other asset management issues are not considered, including a detailed description of the custody function, execution, client reporting, structuring of mutual funds, and a detailed discussion of tax issues and cross-border asset management.

1.6 Varia

The target readers of these chapters are students who have completed a Bachelor's degree in Finance or Economics. I have chosen to avoid a too-formal presentation. The proofs of all propositions are discussed in the exercises, and there is a large number of theoretical and empirical exercises. The exercises, the data sheets, and the solutions are available on request.

I am grateful for the assistance of Dave Brooks and Theresia Büsser. I would like to thank Sean Flanagan, Barbara Doebeli, Bruno Gmür, Jacqueline Henn-Overbeck, Tim Jenkinson, Andrew Lo, Helma Klüver-Trahe, Roger Kunz, Tom Leake, Robini Matthias, Attilio Meucci, Tobias Moskowitz, Tarun Ramadorai, Blaise Roduit, Olivier Scaillet, Stephen Schaefer, and Andreas Schlatter for their collaboration, their support, or the possibility to learn from them.


Chapter 2

Fundamentals
The two words in the expression Asset Management (AM) require explanation: What do we understand by an asset, who manages the assets, and how is this done? An asset is any form in which wealth can be held. Asset management is a systematic process of analyzing, trading, lending, and borrowing assets of any type. Since all assets belong to someone, the management of assets results from a decision regarding the investment strategy - a decision made by the owner of the assets or by a third party. McKinsey (2013) estimates that third-party asset managers managed one quarter of global financial assets in 2013. The main outsourcers are pension funds, sovereign wealth funds, family offices, insurers, banks, and private households. Third-party managed portfolios are of two types - investment funds or discretionary mandates. Funds pool assets with a specific risk level, and one can buy and sell shares in them (mutual funds, ETFs, structured notes, or hedge funds). In a mandate, the owner of the assets delegates the investment decision to the asset manager.

The asset management function can be organized on a stand-alone basis in independent firms, or the function is part of a bank or an insurer.
The key expression above is systematic process. By way of a first remark, there is no such thing as a single, accepted systematic asset management process. Different existing approaches compete with one another and new approaches are continually developed. But the different approaches all serve the same basic function of asset management companies: to channel assets from savers to investors. Two functions of the AM process are of particular importance for successful investment: the investment method and technology. The former is related to academic investment theory or, more generally, to the principles of financial economics. The latter attracted much less attention in the past, but the new developments referred to as FinTech, big data, digitalization, etc. are changing this view radically.
The goal of investment is to save today for the benefit of future consumption. The utility of consumption for an investor is expected to be larger after an investment period than the utility derived from immediate consumption of all resources. Investments are mostly implemented by using tradable assets of any kind - that is, money, shares, bonds, ETFs, mutual funds, or derivatives. Securities are initially issued through financial intermediaries (the primary market) and they can often be re-sold on the secondary market. They can differ in their ownership rights, complexity, liquidity, risk and return profile, transaction fees, accessibility, etc. Securities are often also referred to as financial assets.

Regarding investment decisions, the prices and price forecasts of securities are particularly important. There are two methods used to price assets: absolute pricing, as an equilibrium outcome in an economy, and relative pricing, using the concept of no arbitrage. We consider the pricing issue in Chapter 3.
To summarize, the four key questions in AM are:
1. Who decides? - the decision-responsibility question.
2. How do we invest? - the investment method question.
3. Where do we invest? - the asset selection question.
4. How are asset management services produced and distributed in different jurisdictions? - the profitability, process, client segmentation, regulation, and technology question.
In the past, technology was necessary to implement theory, such as in portfolio construction, where one needed statistical programs to estimate a model's parameters. The new technologies - FinTech and big data - not only make known theory work in practice, they also make it possible to define new investment approaches that are entirely different from classic statistical models like that of Markowitz, the capital asset pricing model (CAPM), or Black-Litterman. Furthermore, technology is the key factor for scaling the business and mastering regulatory complexity.
Question 4 above has attracted a large part of asset management resources in the decade after the GFC, due to regulatory and technological changes and also to different client expectations. This question can be considered on a finer scale as the sum of the following strategic business questions (UBS [2015]):

In which countries does an AM firm want to compete? This geographical focus follows from the AM firm's actual strength, its potential, the costs to comply with country-specific regulation, the costs to build up the human capital, and the additional business and technological complexity due to the engagement in the country.

Which clients should be served?

Which products and investment areas should the AM firm focus on? Large AM firms often offer up to several hundred investment strategies. It is necessary to focus on a subset.

How should the clients be served? This question asks for the definition of the services and of the technology used for the services.

What operating model should be used? This question has a distribution dimension (global vs. (multi-)local offering), an operational one (centralized vs. decentralized), a value-chain one (in-house vs. outsourcing), and a legal/tax dimension (on-shore vs. off-shore).

2.1 Returns

Returns are key in asset management for the calculation of risk and performance. The calculation of returns is not as straightforward as one might guess. One needs to calculate returns for arbitrarily complicated cash flow profiles where cash can be injected or withdrawn at different dates. Risk models are needed to value the returns of the risky cash flows at different points in time. Finally, the return for an investor is possibly the result of several money managers: returns should be decomposable to account for the different contributors.

2.1.1 Time Value of Money

Consider two discrete-time cash flow streams C, C′ with fixed or stochastic cash flows in a currency. Which cash flow stream is preferable? Without any reduction of information complexity, the answer to this question is difficult. The reduction of information complexity is achieved by mapping the cash flows into a single number, such as the present value (PV).

Cash flows at different dates cannot simply be added, since CHF 1 today is not in general equal to CHF 1 at any other date - there is a time value of money.
The microeconomic assumption of investor impatience rationalizes why there is a time value of money; it leads to interest rates. Consider the consumption of a good c at time s or at t > s. Typically, investors prefer earlier consumption to later, i.e. they have the preference

u(c_s) ≥ u(c_t),

where u is the utility function used to value consumption. To make the investor indifferent, the consumption good at time t must be larger than at time s, i.e.

u(c_s) = u(c_t + ε) =: u(c_t (1 + R_t)),

with ε the interest and R_t the interest rate or growth factor that compensates for impatience. The time value of money is expressed by the discount factor D(t, T) = D(T − t), T ≥ t, a function of the difference between the two dates only. To understand this property, consider CHF 1 at time T. We discount this cash flow either directly back to time t, or first back to a time s, t < s < T, and then from s back to t. We assume that there is no risk. Then the value at t of the Swiss franc should be the same regardless of the chosen discounting path - otherwise there would be a possibility to make riskless arbitrage profits. Formally,

D(t, s) D(s, T) = D(t, T),  D(t, t) = 1.  (2.1)

Cauchy proved that there is a unique continuous function satisfying (2.1) - the exponential function:

D(t, T) = e^{−a(t,T)(T−t)},  a > 0.

This motivates exponential discounting. Calculating the growth rate of the discount factor, (∂D/∂T)/D = −a(t, T), identifies the function a(t, T) with the interest rate R(t, T).

The no-arbitrage relation (2.1) relates different time-value-of-money systems. Consider for example spot and forward rates. The no-arbitrage condition D_S(t, s) D_F(t, s, T) = D_S(t, T) uniquely relates the spot rate term structure to the forward rate term structure - given one term structure, the other one follows by no arbitrage. The same holds if one considers swap or LIBOR rates, for example.
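As a minimal numerical sketch of this relation, the following Python snippet derives the forward discount factor and forward rate from two toy continuously compounded spot rates (all numbers are illustrative assumptions, not from the text):

    import math

    t, s, T = 0.0, 1.0, 2.0       # value date and two maturities (assumed)
    R_ts, R_tT = 0.01, 0.015      # spot rates to s and to T (assumed)

    D_ts = math.exp(-R_ts * (s - t))   # spot discount factor D_S(t, s)
    D_tT = math.exp(-R_tT * (T - t))   # spot discount factor D_S(t, T)

    # No arbitrage: D_S(t, s) * D_F(t, s, T) = D_S(t, T).
    D_F = D_tT / D_ts                  # implied forward discount factor
    R_F = -math.log(D_F) / (T - s)     # implied forward rate over (s, T)

    print(R_F)  # 0.02: the forward rate follows from the two spot rates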
The specific date t chosen for discounting is not relevant, since only the difference T − t matters: the investor can choose any date to compare different cash flows. The inverse operation of discounting is called compounding, i.e. valuing cash flows in the future. Since

D(t, T) D(t, T)^{−1} = D(t, t) = 1,

the function D(t, T)^{−1} defines compounding from t to T. We denote by PV the present value and by FV the future value of a cash flow stream.

The absence of arbitrage implies that there exists exactly one discount factor for each currency. But there are many different forms of interest rate, profit-and-loss, and performance calculations. The reasons are:

The method of compounding - do investors reinvest their proceeds in future periods or do they consume them (simple compounding)?

The interest rate or term structure chosen - do we use market rates such as spot, swap, or forward rates from a pricing and trading perspective, or synthetic rates from an asset management perspective, such as the yield-to-maturity (YtM), to value and compare different portfolios?

The calendar and day-count convention chosen - a third dimension leading to different interest rate or P&L calculations.
Consider PV_0 and FV_1. Then

PV_0 = D(0, 1) FV_1 = e^{−R(1−0)} FV_1 = e^{−R} FV_1.

Therefore,

R = ln(FV_1 / PV_0).

If we consider short time periods, such as for daily return calculations, then the logarithm reads, up to a first-order expansion, ln(FV_1/PV_0) ≈ FV_1/PV_0 − 1. This implies the simple gross return expression

R = (FV_1 − PV_0) / PV_0,  (2.2)

which then defines the simple discounting function D = 1/(1 + R).

Remarks:

Interest rates are quoted on a p.a. basis.

How are different discount factors related? The continuous discount factor D_c = e^{−R_c(T−t)}, the discrete-time discount factor D_d = (1 + R_d)^{−(T−t)}, and the simple discount factor D_s = (1 + R_s(T − t))^{−1} all have to attribute, at any chosen time t, the same present value to a future CHF 1 cash flow. The different rates R_c, R_d, R_s therefore cannot be chosen independently: given one rate, the others follow (see the sketch after these remarks). Simple discounting is used for LIBOR rates and products with maturities of less than a year, discrete compounding for bonds, and continuous compounding for derivatives or Treasury bills.

The discount function is a simple function of the interest rate. But the interest rate itself is a complicated function of a risk-free rate, the creditworthiness of counterparties, market liquidity, etc. The discount function is the key object in financial engineering, and a whole industry has developed, and continues to develop, methods to construct the discount function D(t, T) for different maturities T - the so-called term structure.
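A small sketch of the rate equivalence mentioned above, with an illustrative continuous rate and horizon of my own choosing:

    import math

    T_minus_t = 2.0   # horizon in years (assumed for illustration)
    R_c = 0.03        # continuously compounded rate (assumed)

    # All three quoting conventions must yield the same discount factor D.
    D = math.exp(-R_c * T_minus_t)

    # Solve D = (1 + R_d)^-(T-t) for the discrete rate R_d.
    R_d = D ** (-1.0 / T_minus_t) - 1.0

    # Solve D = 1 / (1 + R_s * (T-t)) for the simple rate R_s.
    R_s = (1.0 / D - 1.0) / T_minus_t

    print(D, R_d, R_s)  # one discount factor, three equivalent rate quotes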
Example
Let p(t, T) be the price at time t of a zero-coupon bond (ZCB) paying USD 100 at maturity T if there is no default. Apart from counterparty risk, a ZCB is the same as a discount factor. ZCBs are the simplest interest rate products; more complex products such as coupon-paying bonds can be written as a linear combination of ZCBs. Consider a coupon bond with yield R, i.e. R is the rate needed such that the PV of the bond is equal to its present price. The price-yield graph has a negative slope, since a bond issued today will have a lower price tomorrow if interest rates increase (opportunity loss).
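As a sketch of this linear combination, the following prices a hypothetical three-year annual 5% coupon bond off a toy discount curve (discount factors and bond terms are assumptions for illustration):

    # Toy discount factors D(0, T) for T = 1, 2, 3 years (assumed values).
    discount = {1: 0.99, 2: 0.97, 3: 0.94}

    face, coupon_rate, maturity = 100.0, 0.05, 3

    # A coupon bond is a linear combination of ZCBs: each coupon and the
    # final face value is a zero-coupon cash flow valued with D(0, T).
    price = sum(face * coupon_rate * discount[t] for t in range(1, maturity + 1))
    price += face * discount[maturity]

    print(round(price, 2))  # 108.5 = 5*(0.99 + 0.97 + 0.94) + 100*0.94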
A key issue in the calculation of performance in asset management is to distinguish whether future cash flows are reinvested (compounding) or whether the cash flows generated in the future subperiods are consumed (simple compounding). In n years it follows, in the case of compounding and simple compounding respectively:

FV_n = PV_0 (1 + R)^n,  FV_n = PV_0 (1 + nR).  (2.3)

Hence, the future value with compounding is never less than the value with simple compounding. This formula can be generalized to the case of several sub-annual interest rate periods and to the case where R is not constant but a function of the periods. The limiting future value is achieved if interest is compounded instantaneously, which results in the exponential compounding formula as the limit of how fast capital can grow.
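A quick numerical check of (2.3) and its instantaneous limit, with assumed inputs:

    import math

    PV0, R, n = 100.0, 0.04, 10   # initial value, rate, years (assumed)

    fv_compound = PV0 * (1 + R) ** n        # proceeds reinvested
    fv_simple = PV0 * (1 + n * R)           # proceeds consumed
    fv_continuous = PV0 * math.exp(R * n)   # instantaneous-compounding limit

    # Ordering: simple <= discrete compounding <= continuous compounding.
    print(fv_simple, fv_compound, fv_continuous)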
Example: Effective rate of return and Yield-to-Maturity (YtM)
The simple effective rate of return R_e is used to compare cash flows with different maturities and lengths of sub-annual interest rate periods. Consider first an n-year investment with compounding. The equations

FV_n = PV_0 (1 + R)^n,  FV_n =: PV_0 (1 + R_e)

imply R_e = (1 + R)^n − 1. Consider R_e for an n-year investment in a stock S. We then have

1 + R_e = S_n/S_0 = (S_n/S_{n−1}) (S_{n−1}/S_{n−2}) ... (S_1/S_0) = Π_{k=1}^{n} (1 + R_{k,k−1}),

where R_{k,k−1} is the sub-period return. The effective simple gross return is equal to the product of the period returns with discrete compounding. If compounding is continuous, the effective return is equal to the arithmetic sum of the period returns:

1 + R_e = e^{Σ_{k=1}^{n} R_{c,k,k−1}}.

This is one reason why continuous compounding is preferred.
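A sketch of both identities with made-up sub-period returns:

    import math

    sub_returns = [0.02, -0.01, 0.03]   # illustrative sub-period simple returns

    # Discrete compounding: the gross effective return is the product
    # of the gross sub-period returns.
    gross = 1.0
    for r in sub_returns:
        gross *= 1 + r
    R_e = gross - 1

    # Continuous compounding: the log (continuous) returns simply add up.
    log_sum = sum(math.log(1 + r) for r in sub_returns)

    print(R_e, math.exp(log_sum) - 1)   # both give the same effective return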


A particular decision problem for an investor is to choose between two bonds:
Bond 1: Price 102, coupon 5%, maturity 5 years.
Bond 2: Price 98, coupon 3%, maturity 5 years.
Bond 1 has more attractive future cash flows due to the higher coupons but bond 2
is cheaper. Which one to prefer? Intuitively if the maturity would increase, all other
parameter unchanged, then bond 1 should become more profitable and the opposite holds
if the price of the bond 2 become more and more cheaper compared to the bond 1. The
yield-to-maturity (YtM) is decision criterium which assumes:
Products are kept until maturity.

2.1. RETURNS

27

Interest term structure is flat, i.e. YtM is not a market rate.


For both bonds, the YtM y then solves the equation

Price = Σ_{j=1}^{n} c/(1 + y)^j + N/(1 + y)^n,

with c the coupon and N the face value. The bond with the higher resulting y is the preferred one. Already for low values of n there is no explicit analytical solution of this equation, but numerical solutions are available. The YtM is the most important example of a Money-Weighted Rate of Return (MWR); see below.
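A numerical sketch for the two bonds above, solving the YtM equation by bisection; the face value N = 100 is an assumption, since the text does not state it:

    def ytm(price, coupon, face, n, lo=-0.5, hi=1.0, tol=1e-10):
        # Solve price = sum_j coupon/(1+y)^j + face/(1+y)^n for y by bisection.
        def pv(y):
            return sum(coupon / (1 + y) ** j for j in range(1, n + 1)) \
                   + face / (1 + y) ** n
        while hi - lo > tol:
            mid = (lo + hi) / 2
            # pv is decreasing in y: if the PV is still above the price,
            # the yield must be higher.
            if pv(mid) > price:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Bond 1: price 102, coupon 5%, 5 years. Bond 2: price 98, coupon 3%, 5 years.
    y1 = ytm(102.0, 5.0, 100.0, 5)
    y2 = ytm(98.0, 3.0, 100.0, 5)
    print(y1, y2)   # the bond with the higher y is the preferred one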

2.1.2 Returns and Return Attribution

Consider a finite discrete-time model 0, 1, 2, ..., T with a risk-less asset S_0 (the savings account), normalized to S_0(0) = 1, and N risky investment opportunities with known prices S_j(0) ≥ 0, j = 1, ..., N, at time 0. The S_j(T), j = 1, ..., N, at time T are non-negative random variables. The value process is

V(t) = θ_0 S_0(t) + Σ_{j=1}^{N} θ_j S_j(t) =: ⟨θ(t), S(t)⟩,  (2.4)

with θ_0 the amount of CHF invested in the savings account, θ_j the number of units of the risky security j held in the period, and ⟨·, ·⟩ the inner product notation.¹

Definition 2.1.2. The vector θ = (θ_0, θ_1, ..., θ_N) is called the portfolio (or [investment] strategy). A normalized portfolio at time t is defined by

φ(t) = (φ_0, ..., φ_N)(t),  φ_0(t) = θ_0(t) S_0(t) / V(t),  φ_k(t) = θ_k(t) S_k(t) / V(t),  k = 1, ..., N.  (2.5)
The following properties (portfolio accounting) are immediate to check:

Proposition 2.1.3. 1. The normalized portfolio components without leverage add up to 1.

2. The return of a portfolio is equal to the weighted sum of the portfolio constituents' returns:

R = φ_0 R_f + Σ_{j=1}^{N} φ_j R_j =: ⟨φ, R⟩,  (2.6)

with R_f the simple risk-free return.

3. If the portfolio is self-financing, then the change of portfolio value in a given period is equal to the sum of the portfolio's asset value changes from the initiation time to the present time.

¹ We recall the definition of an inner product:
Definition 2.1.1. Let X be a vector space. A map ⟨·, ·⟩ : X × X → R is an inner product if it is linear in both arguments, symmetric, and positive definite.

The last property requires some explanation. In stylized form V = \phi S is the portfolio value. Writing \Delta X_t = X_t - X_{t-1} for the difference operator, the change in portfolio value reads, with the product rule,

\Delta V = \Delta\phi \, S + \phi \, \Delta S .

The first term on the RHS means that the change in portfolio value between two dates is due to a change in the strategy vector - external money is added to the portfolio or withdrawn. Self-financing means that one rules out strategies where additional funds are needed to create value. The change in value should arise only from changes in asset prices - the second term on the RHS.
Let V_t be the value of a portfolio at time t. Then the simple return

R = \frac{V_T - V_0}{V_0}

is invariant to the size of the portfolio: multiplying the portfolio by a scaling factor, the factor cancels out in the return calculation. Hence, we can set without loss of generality V_0 = 1.
Armed with this notation, we consider as a first application the Arithmetical Relative Return (ARR), defined as the difference between a portfolio return R_V and a benchmark return R_b:

ARR = R_V - R_b = \sum_j (\phi_j R_j^V - b_j R_j^b) .   (2.7)

Figure 2.1 shows how this return can be split into three different parts for each j:

ARR_j = \phi_j R_j^V - b_j R_j^b = 1 + 2 + 3

with 1, 2, 3 the rectangles. Using elementary geometry,

ARR_j = 1 + 2 + 3 = \underbrace{(\phi_j - b_j) R_j^b}_{=:A} + \underbrace{(R_j^V - R_j^b) b_j}_{=:S} + \underbrace{(\phi_j - b_j)(R_j^V - R_j^b)}_{=:I} .   (2.8)

Figure 2.1: Arithmetic return decomposition. Source: adapted from Marty (2015).

A represents the tactical asset allocation (TAA) effect, which is also called the Brinson-Hood-Beebower (BHB) effect, S the stock selection effect, and I the interaction effect. The tactical component is the chosen weight difference between the portfolio and the benchmark valued with the benchmark return; similarly, stock selection is given by the return difference between the two portfolios weighted with the benchmark portfolio weights. This decomposition is called the management effect, and it also holds at the whole portfolio level. This methodology of return decomposition was introduced by BHB in 1986, and many asset managers use it as the starting point for their performance attribution.
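A minimal sketch of the decomposition (2.8) with hypothetical weights and returns for three asset classes; the numbers are made up and only serve to verify that the three effects add up to the ARR.

```python
import numpy as np

# Hypothetical weights and returns for three asset classes.
w_p = np.array([0.50, 0.30, 0.20])   # portfolio weights phi_j
w_b = np.array([0.40, 0.40, 0.20])   # benchmark weights b_j
r_p = np.array([0.08, 0.03, 0.01])   # portfolio returns R_j^V
r_b = np.array([0.07, 0.02, 0.01])   # benchmark returns R_j^b

allocation = (w_p - w_b) * r_b               # A: TAA / BHB effect
selection = (r_p - r_b) * w_b                # S: stock selection effect
interaction = (w_p - w_b) * (r_p - r_b)      # I: interaction effect

arr = w_p @ r_p - w_b @ r_b                  # total ARR
# The three effects add up to the ARR, per asset class and in total.
assert np.isclose(arr, (allocation + selection + interaction).sum())
print(arr, allocation.sum(), selection.sum(), interaction.sum())
```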
Figure 2.2 shows the performance attribution tree for the MSCI World ESG Quality Index. The total return R_T can be written in the form

R_T = (R_T - R_b) + R_b = ARR + R_b .

Since fees are not available, the total return is a gross return. The ARR has several levels: first, the ARR is decomposed into asset classes, where the asset class equity is then further decomposed into three parts: sector and geographical diversification G, the selection part S, and a part F which invests in a portfolio of factor risk premia. Formally:

ARR = ARR_{non-EQ} + ARR_{EQ}
    = \sum_{j \neq EQ} \phi_j R_j^b + \phi_{EQ} R^{b,EQ}
    = \sum_{j \neq EQ} \phi_j R_j^b + G + F + S
    = \sum_{j \neq EQ} \phi_j R_j^b + \sum_{k \in \text{Sectors}} \phi_k^{EQ} R_k^{b,EQ} + \sum_{k \in \text{Risk Factors}} \phi_k R_k + S .


Figure 2.2: Performance attribution tree for the MSCI World ESG Quality Index, decomposing the total return R_T = R_B + ARR into fees (net-of-fee return), asset classes, TAA & selection, and risk premia; the information written in red is added by the author (adapted from MSCI [2016]).
There are two methods to calculate the investment return: the Time-Weighted Rate of Return (TWR) and the Money-Weighted Rate of Return (MWR). We only provide some basic results and refer to Marty (2015) for a detailed discussion. The TWR should measure the return of an investment where possible in- or outflows do not affect the return of the investment. The TWR should therefore reflect the return due to the asset manager's past decisions. The MWR should reflect the return from an investor's perspective: cash in- and outflows as well as the profit and loss matter in this perspective. The MWR method is based on the no-arbitrage principle. Both the MWR and the TWR can be applied on an absolute or relative return basis.
The TWR R_{0,T}^{TWR} of an investment starting in 0 and ending in T, with T - 1 time points in between (not necessarily equidistant), where the portfolio V subperiod return R_{i,i+1} is calculated in each subperiod, is defined by:

1 + R_{0,T}^{TWR} = \prod_{i=0}^{T-1} (1 + R_{i,i+1}) = \prod_{i=0}^{T-1} \left( 1 + \frac{V_{i+1} - V_i}{V_i} \right) = \prod_{i=0}^{T-1} \frac{\langle \phi_i, S_{i+1} \rangle}{\langle \phi_i, S_i \rangle}   (2.9)

where \langle \phi_i, S_{i+1} \rangle := \sum_{j=1}^{N} \phi_{i,j} S_{i+1,j} is the value of the N assets with the corresponding portfolio \phi. The following properties hold for the TWR:


Proposition 2.1.4. 1. Adding or subtracting any cash flow c_t at any time t does not change the TWR.

2. If \phi_i(j) = \lambda_i \phi_{i-1}(j) for all assets j and all time points i, then the TWR equals the return of the final portfolio value relative to its initial value. Hence, all intermediate returns cancel in (2.9).
The TWR method is used by most index providers since cash in- or outflows do not impact the return of the index. To prove the first property, fix a time t and let c_t be an arbitrary cash flow. The relevant terms in the TWR with this additional cash flow are

\left( 1 + \frac{\tilde V_t - V_{t-1}}{V_{t-1}} \right) \left( 1 + \frac{V_{t+1} - \tilde V_t}{\tilde V_t} \right) .

Assuming that \tilde V_t = V_t + c_t, i.e. the additional cash flow is added, and inserting this in the last expression implies after simplification

\frac{V_{t+1}}{V_{t-1}} ,

which is the same result as simplifying the two terms in the TWR without any additional cash flow.
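A minimal sketch of the chain-linking in (2.9) with hypothetical values and cash flows; the convention that each subperiod return is computed on the post-flow value at the start of the subperiod is one possible choice, not prescribed by the text.

```python
def twr(values, flows):
    """Time-weighted rate of return: chain-link the subperiod returns,
    each computed on the value right after the cash flow at the
    subperiod's start date (a minimal sketch)."""
    growth = 1.0
    for i in range(len(values) - 1):
        start = values[i] + flows[i]      # value after the flow at date i
        growth *= values[i + 1] / start
    return growth - 1

# Hypothetical portfolio values at dates 0..3 and cash flows at these dates.
values = [100.0, 110.0, 104.0, 117.0]
flows = [0.0, 5.0, -3.0, 0.0]            # inflow at date 1, outflow at date 2
print(twr(values, flows))
```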
In the MWR, cash flows are reinvested at the internal rate of return (IRR). This means that R^{MWR} solves

PV(C, R^{MWR}) = \sum_{j=1}^{T} D(0, j; R^{MWR}) c_j   (2.10)
where the discount factor D depends explicitly on R^{MWR}. Since R^{MWR} enters the denominator of the discount factor, (2.10) is solved numerically. Using the first-order approximation D \approx \frac{1}{1+R} transforms (2.10) into a linear equation for R - the so-called Dietz return (with AIC the Average Investment Capital):

R^{Dietz} = \frac{P\&L}{AIC} := \frac{S_T - S_0 - \sum_{j=1}^{T-1} c_j}{S_0 + \frac{1}{2} \sum_{j=1}^{T-1} c_j} .   (2.11)
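A minimal sketch comparing the Dietz approximation (2.11) with the exact IRR of (2.10) for a hypothetical two-period investment with one intermediate contribution; all numbers are made up.

```python
from scipy.optimize import brentq

# Hypothetical investment: start value 100, a contribution of 10
# after the first period, end value 118 after two periods.
s0, c1, sT = 100.0, 10.0, 118.0

# Simple Dietz return: P&L over average invested capital, as in (2.11).
dietz = (sT - s0 - c1) / (s0 + 0.5 * c1)

# MWR / internal rate of return: the per-period rate at which the
# compounded cash flows reproduce the end value, solved numerically.
irr = brentq(lambda r: s0 * (1 + r) ** 2 + c1 * (1 + r) - sT, -0.9, 1.0)

# Compare the two-period Dietz return with the compounded IRR.
print(dietz, (1 + irr) ** 2 - 1)   # close for small flows and rates
```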

2.1.3 Returns and Leverage

What can be said about returns if investments are levered? We consider two assets, see Anderson et al. (2014). The return of this portfolio without leverage, R_0, in a single period reads

R_0 = \langle \phi, R \rangle   (2.12)


with , 1 the invested amounts in asset 1 and 2, respectively. The total invested
amount adds up to 1 - there is no leverage. Consider a leveraged position with leverage
ratio 1. The portfolio value in absolute terms reads at any date
V = (1 S1 + 2 S2 ) ( 1)3 B

(2.13)

where the first part represents the levered portfolio V 0 and the last term represent the
borrowing costs for the leveraged position which is an investment in the borrowed asset
B. Calculating the return of such a portfolio we get:
R = h, Ri ( 1)3 RB

(2.14)

with = (1 , 2 ) and 3 = 1 2 . If there is no leverage, 3 = 0 follows. Calculating


the excess return relative to a risk free rate Rf and to the borrowing rate RB we get:
R Rf

= h, R Rf i ( 1)3 (RB Rf )

R RB = h, Ri .

(2.15)

Hence, the excess return relative to the borrowing rate scales linearly in the leverage ratio. But for the excess return relative to the risk free rate, if the return of the borrowing
portfolio is larger than the risk free one, increasing of the leverage ratio reduces the gains
in the original portfolio.
In many investment strategy applications the leverage ratio is not constant over time and is a random variable at future dates. Rewriting the second equation in (2.15) and taking expectations we get

E(R^\lambda) = \langle \phi, E(R) \rangle + E(\lambda - 1) E(R - R_B) + cov(\lambda, R - R_B) .   (2.16)

Anderson et al. (2014) call the first two terms on the right-hand side the magnified source terms due to leveraging. The last term shows that there is a covariance correction to a levered portfolio's return. How important is this correction? To quantify it we need to consider a further correction in multi-period levered investments - the volatility drag, see Section 3.3.5.
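A small simulation sketch of (2.16) with a hypothetical random leverage ratio, constructed to correlate with the excess return over the borrowing rate so that the covariance term is non-zero; all distribution parameters are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical source-portfolio return <phi, R> and borrowing rate.
r = rng.normal(0.06, 0.15, n)
r_b = 0.02
# A random leverage ratio that is high after good returns, so that
# the covariance term in (2.16) is non-zero by construction.
lam = 1.5 + 2.0 * (r - r.mean())

r_lev = lam * r - (lam - 1) * r_b              # levered return, as in (2.14)

lhs = r_lev.mean()
rhs = r.mean() + (lam - 1).mean() * (r - r_b).mean() \
      + np.cov(lam, r - r_b)[0, 1]
print(lhs, rhs)   # the two sides of (2.16) agree up to sampling noise
```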

2.2 Investors

There are different types of investors: investors can be retail clients, wealthy private
clients, pension funds, family offices, or state-owned investment funds. The unifying
characteristic of all investors, however, is that they hold a portfolio of securities. But
different types of investors have different investment preferences. We consider sovereign
wealth funds (SWFs), pension funds, and private investors.

2.2.1 Sovereign Wealth Funds (SWFs)

SWFs are among the largest asset owners in the world. The largest SWF in 2014 was the
Norwegian Government Pension Fund with USD 893 billion in assets. The next largest
by size are all from countries in the Near or Far East: Abu Dhabi, UAE, Saudi Arabia,
China, Kuwait, Hong Kong, Singapore, and Qatar. All manage funds with assets of
between USD 250 billion and USD 800 billion.
Why are there so many SWFs in emerging markets? First, more than 50 percent of
all large SWFs originate in oil. Second, governments in Asia are much more active in
managing their economies than some of their western counterparts. According to Ang
(2014), another reason is that - following the many state bankruptcies of the 80s and
90s - the United States told emerging-market countries to save more. In recent years,
a debate has begun over whether it is productive to accumulate capital in SWFs. Since
many SWFs are domiciled in emerging markets, the question arises whether it would not
be more productive to directly invest the capital in the respective local economies.
Many SWFs accumulate liquid assets as adequate reserves for meeting large, unexpected future shocks to a country's economy. This defines a long-term precautionary saving motive for future generations. This motivation is crucial for the acceptance of an SWF, because an SWF can only exist if it has public support. Public support is a delicate issue and needs careful treatment. Any scandals caused by the incompetent management of a fund, or a lack of governance caused by an inappropriate integration of the fund into overall government and economic strategies, should be avoided. The SWF should also be protected against any kind of political mismanagement and against criminal acts. Finally, any changes in the asset management risk policy should be documented and communicated to the fund's owners. The aforementioned Norwegian SWF, for example, started by investing only in bonds. Only after an extensive public debate did a diversification of investments into other asset classes take place.

2.2.2 Pension Funds

This section introduces the topic; a more detailed discussion is given in Chapter 4. Large pension funds can be managed at the sovereign level, but most are - in contrast to SWFs - privately owned. Assets managed by pension funds are huge: they vary from 70 percent of GDP in the US to more than 130 percent in the Netherlands.
Why do pension funds exist? Pension funds can offer individuals risk sharing mechanisms that would not be feasible for those individuals operating on a stand-alone basis.
Consider people who are now 30 years old. Assume that they want to protect the capital
which they will receive when they are 70 years old. The financial markets do not offer
capital-protected products with a 40-year maturity - thus, the markets are referred to as incomplete. An appropriate pension plan can smooth the risks of today's generation by buffer stock formation over the next 40 years. A buffer stock formation across different

generations means that risk sharing between the generations is defined by the pension
fund scheme. In this sense, pension fund schemes can be seen as complementing the market by adding an infinitely lived market participant, which allows individuals to share
their life cycle-specific investment risk.
Pension funds are only one part of a country's whole retirement provision system. These systems are often segmented into three pillars:
Pillar I - This pillar should cover the subsistence level and it is often organized
on a pay-as-you-go basis: Each month, members of the working population pay a
fraction of their salary, which is redistributed immediately to the retired population.
Pillar II - This is the pillar of private or public pension funds. It should suffice to
cover the cost of living for members of the current working population after their
retirement. The owners of the assets have only restricted access to those assets.
There are two types of funds: defined benefit (DB) and defined contribution (DC)
plans. DB plans are based on predetermined future benefits to beneficiaries but
keep the contributions flexible. DC plans predefine the contributions but do not
fix the future benefits. Summarizing, contributions define benefits in DC plans and
benefits define contributions in DB plans.
Pillar III - Privately managed investments, which often exhibit tax advantages.
Access to the asset before retirement is mostly restricted.
Figure 2.3 shows the importance of the different pillars in different countries. Retirement systems are under pressure in most developed countries due to demographic
shifts and increasing longevity. For the first pillar, demographic changes mean that, on
average, a working individual has to pay more, and for an increased number of retired
individuals. This jeopardizes the concept of intergenerational risk sharing.
A further main threat to the first pillar, besides longevity and fertility, is the huge impact on state budgets. These threats will destabilize the first pillar system in many countries in the coming years, and hence many believe that more emphasis will be put on the second and third pillars in the future. In Spain, the first pillar accounts for more than 90 percent of retirement income, following ABP (2014). For Germany, France, and Italy the value is between 75 percent and 82 percent. Given the extremely low fertility rates in Spain and Italy and the huge unemployment rate among young people, the focus on the first pillar is not sustainable. Shifts to the second or third pillar are necessary, which defines an opportunity for asset management. In Switzerland, the first pillar accounts for 42 percent of retirement income. The second pillar contributes around 20 percent in the UK and 40 percent in the Netherlands and Switzerland.
Figure 2.3: Left panel: the importance of the three pillars as a percentage of retirement income (ABP [2014]). Right panel: basic form of DB and DC pension plans.

In the DB plans of pension funds, the pension is set in relation to the most recent averaged salaries, see Figure 2.3. The contributions are calculated in such a way that they generate a predefined capital stock at the end of working life. Therefore, an increase in salary requires additional funds in order to maintain the full pension. On the other hand, a
year with very low income can have dramatic effects for the contributor in the retirement period. Since, in a DB system, the financing amount can change on an annual basis, DB plans are considered to be not very transparent.

In DC plans, the fixed contributions are invested in several asset classes and the pension is only weakly related to the most recent salary of the contributor. The growth of the invested capital, including interest payments, implies a final capital value at the end of working life. The conversion rate applied to that final capital level then defines the annual pension. Contributors to DC plans - contrary to those who contribute to DB plans - bear the investment risk. This makes this form of pension system cheaper for employers to offer. Unlike in DB plans, the contributors can at least partially influence investment decisions - that is, choose the risk and return levels of the investments. This is one reason why DC plans have become more attractive to contributors than their DB counterparts. Finally, in some jurisdictions, DC plans are portable from one job to the next, while DB plans often are not.
Underfunding differs strongly between private and public pension funds. In 2009, states and municipalities in the US had accrued USD 5.7 trillion of retirement liabilities to workers. Novy-Marx (2011) estimates that the underfunding of public pensions is as high as USD 3 trillion - that is, the value of the assets is USD 2.7 trillion. In Switzerland, the average funding ratio of private pension funds in 2013 was 107.9 percent (Kunz [2014]). The
ratio for public funds was 87.8 percent, showing strong underfunding. Private and public pension funds differ even more severely when comparing the overfunding and underfunding gaps: for the Swiss private sector, there is CHF 16.2 billion of overfunding capital and CHF 6.4 billion of underfunding. In the public domain, the situation is the opposite: CHF 1.4 billion of overfunding versus a CHF 49.5 billion funding gap. There has been a rapid shift from the DB plans common in the 80s to DC systems. In the US, almost 70 percent of pension funds are of the DC type. This is a reversal in percentage terms compared to the situation 30 years ago. This system change took place more quickly in the private sector than in the public sector, because the state can rely on taxpayers.
2.2.2.1 DB versus DC Plans

What are the causes of the change from DB to DC plans? One reason is regulation
which requires a certain level of cover ratio under the Basel Committee proposals and
also under Solvency II for the insurance industry. Furthermore, the accounting standards under IFRS state since 2006 that a shortfall in funding must be accounted for on companies' balance sheets. For DB schemes, shortfalls are financed by the employer and hence the guarantees sit on the balance sheets of the companies. By switching to DC plans, where there are no guarantees, the balance sheet burden for the companies vanishes.
Another perspective on the DC and DB issue is financial literacy - the ability of decision makers to understand their investments. By definition, the employees make the investment decisions in DC plans. But several studies document that a majority of employees want to delegate their investment decision. One reason is their low level of financial literacy. Gale and Levine (2011) test four traditional approaches to financial literacy - employer-based, school-based, credit counseling, and community-based. They find that none of the financial literacy efforts have had positive and substantial impacts. As a conclusion, improving financial literacy should be a concern for policy-makers. Holden and VanDerhei (2005) test for the ability of employees to diversify their equity investments. They find that roughly 50 percent diversify their investments, while the other half either does not invest in equity at all or is fully invested in stocks.
A further disappointing observation related to the transition to DC-based plans is the average undersaving in such plans. Munnell et al. (2014) report that in 2013 around 50 percent of all US households were financially unprepared for retirement in the sense that retirement income will be 10 percent or more below their pre-retirement standard of living: the average DC portfolio at retirement holds USD 110'000 while over USD 200'000 is needed.
A final view on the DB versus DC debate concerns costs. CEM Benchmarking (2011), which considers 360 global DB plans with USD 7 trillion in assets, finds a fee range between 36 and 46 bps. Munnell and Soto (2007) estimate the fees for DC plans at between 60 and 170 bps.


The above transformation of pension funds from DB to DC plans will be challenged in many countries in the near future by demographic changes and longevity issues:

An AM trend in many countries will be the increasing importance of asset consumption by the baby boomer generation when they retire in the near future, compared to previous generations whose main objective was to accumulate assets. This change from the previous asset accumulation regime to one of asset consumption by future generations will have a deep impact on how AM solutions are provided. The process of asset consumption is inherently an asset liability management issue and personal to all customers: people have different plans for how they want to consume their assets during their retirement. The process of asset accumulation, though, means generating wealth for all contributors in a similar way irrespective of their liabilities. Therefore, the future customers of pension fund schemes will demand tailor-made asset liability management solutions.
Private savings become more important due to the tensions in the first pillar. The
shift from public first pillar to private second pillar savings will impact the demand
from pension fund customers. They will be responsible for a larger fraction of their
wealth and they will bear the investment risk. Given the impossibility of covering
losses when they are retired, customers of pension fund schemes will ask for less
risky investments.
Financing and redistribution risks between the actively insured and the retired persons are likely to grow in the future. There are several risk sources. Many countries define a legal minimum fixed interest rate which has to be applied to the minimum benefit pension plan. In Switzerland, this rate is 1.75% for 2015 and will be 1.25% in 2016. Given that the 10-year CHF swap rate is close to zero in 2015, it is at the time of writing not possible for a pension fund to generate the legally fixed rate using risk-free investments. This defines the financing risk for the active (or contributing) population of a pension plan.
To understand redistribution risk, we consider the technical interest rate, which is by definition the discount rate for pensions. Since pensions - in Switzerland, for example - cannot be changed after the day of retirement, a reduction of the technical interest rate leads to higher capital requirements for the retired population in order to keep their pensions unchanged. In 2016, the technical rates in most low-interest countries are significantly higher than the market interest rates: the pensions paid out are simply too high, see Figure 2.4. Axa Winterthur (2015) estimates that in Switzerland CHF 3.4 billion is redistributed from the actively insured to retired persons every year. If the ratio between the active and retired populations changes in the future due to demographics and longevity, the annually redistributed amounts will increase sharply in future low-interest periods.


Figure 2.4: The return of the 10-year Swiss government bond, the minimum legal rate for Swiss pension plans, and the technical rate for privately insured retired individuals. If this status remains unchanged in the coming years, underfunding becomes a serious issue, and no significant return can be expected from investments in the fixed income asset class. The technical rates are even higher than the minimum rates, which indicates the extent to which actual pensions are too high. (Swisscanto [2015], SNB [2015], OAK [2014]).
If interest rates remain low in the future, pension funds will be forced to consider alternative investment opportunities. Most pension funds already invest more than previously in equity markets or in credit-linked notes as substitutes for bond investments. Another possible solution is the use of new investment strategies defined on the traditional asset classes; smart beta strategies and factor investing are two such approaches. Additionally, pension funds are searching for new investment opportunities such as private equity, insurance-linked investments, or senior unsecured loans. Pension funds could and should also reduce their costs. This would help, but it would by no means solve any of the above challenges, which are due to demography, the low interest rate environment, and longevity.
We have used the expression asset class several times.

Definition 2.2.1. An asset class is a group of securities which possess the same characteristics and which are governed by the same rules and regulations.
Traditional asset classes are equity securities (stocks), fixed-income securities (bonds),
cash equivalents (money market instruments) and currencies. Alternative asset classes


include among others real estate, commodities and private equity. Hedge funds are not
an asset class but an investment strategy defined on liquid asset classes.
Diversification across asset classes can evaporate in specific market situations - dollar diversification is not the same as risk diversification. We define risk and risk factors.

Definition 2.2.2. The variability of the future value of a position due to uncertain events defines risk (Artzner et al. [1999]). Risk is modelled by random variables. Risk factors are risks which affect the return of an investment.

2.2.3 Management of Pension Funds
The obvious approach to managing pension funds would consist in managing the assets such that they meet the liabilities. This means optimizing the difference between the asset and liability values (the surplus). Although this seems a trivial mathematical extension of asset-only optimization to asset liability management, this is not the case.

The mismatch of risk, growth, and maturity between assets and liabilities is one reason for this. Asset values and risks are market driven. But the value and risk of liabilities are primarily defined by the pension fund contributors' characteristics and by demographic changes and policy interventions - all non-market-driven factors. Also, the growth rate of the liabilities turns out to be more stable than the assets' growth rate. In general, liabilities carry less short-term but more long-term risk.

A second reason is the implicit or explicit return guarantees on the liability side. Guarantees typically cap the linear payoffs of the liabilities. Non-linear payoffs for the liabilities follow. But non-linear payoffs define derivatives. Contrary to standard financial derivatives on, say, an equity index, the pricing of these derivatives is much more complex due to the risk-sharing mechanism between the generations in a pension fund. Hence, these derivatives are often neither priced nor hedged. Thus, turbulent market conditions can put a pension fund's objectives at risk.
Ang (2014) describes the management of pension funds by comparing different pyramids:
The know-how and compensation pyramid with the asset managers at the top and
the pension board on the bottom layer of the pyramid.
The return contribution pyramid with strategic asset allocation (SAA) at the top,
tactical asset allocation (TAA) in the middle, and title selection at the bottom.
SAA is asset allocation over a long-term horizon of 5-10 years. Asset allocation that
attempts to exploit return predictability over a short- to medium-term horizon is referred
to as TAA.


The SAA forms unconditional expectations about future returns using average historical returns. TAA is defined by a conditional expectation where today's information is used to forecast asset returns. The bets of a CIO form the TAA. The key question is: are asset returns predictable? If they turn out not to be predictable, why then do CIOs permanently make bets? Kandel and Stambaugh (1996) provide a possible answer. Consider investors who must allocate funds between stock and cash (the risk-free Treasury bill). Despite the weak and non-significant statistical evidence about the predictability of monthly stock returns, the evidence is used to update their beliefs about the parameters in a regression of stock returns. The predictive variables can then have a substantial impact on the investors' portfolio decision, that is, on the TAA. We consider predictability in the next section.
Since SAA primarily aims to create an asset mix that optimally balances between
expected risk and return for a long-term investment horizon, the SAA weights vary only
slowly. SAA divides wealth into different asset classes, geographical regions, sectors,
currencies, and the different states of creditworthiness of the counter-parties. Risk factors driving SAA include structural economic factors such as population growth rates,
technological changes, and changes in productivity and the political environment. The
dynamic weights in TAA may deviate from long-term SAA.
Although the concept of a TAA has existed for more than 40 years, practitioners and academics attribute different meanings to a TAA.

Fact 2.2.3. Practitioners use a one-period setup to define a TAA. Academics use intertemporal portfolio theory to derive the optimal investment rules. This defines a theoretically optimal TAA which has a myopic one-period component and a long-term component.

The myopic part of the optimal TAA corresponds to the practitioners' TAA. The long-term component is missing in practice. We consider the myopic view in Section 2.8.5 and the optimal TAA in Section 3.3.
Example - Historical background of TAA

We refer the reader to Lee (2000) for a detailed discussion. The first investment firm to consider TAA was Wells Fargo in the 1970s. The drop in many assets during the oil crisis of 1973-1974 raised the demand from investors to consider alternatives to shifts within a given asset class. Wells Fargo then proposed shifting across asset classes - stocks and bonds. Using this approach one was able to obtain positive returns in a period where stock markets fell by more than 40%. In the 1980s, portfolio insurance based on option pricing theory became popular. These dynamic strategies attempt to achieve a guaranteed minimum portfolio return. The so-called Constant Proportion Portfolio Insurance (CPPI) approach of the mid-80s largely simplified the option approach, which made portfolio insurance even more attractive for investors. The global stock crash of 1987 shifted the interest of investors away from portfolio insurance back to TAA, since the portfolio insurance strategies mostly failed to deliver the guaranteed floor value, while TAA strategies suffered before the crash but outperformed shortly after it.

In the following years, the interest rate increase, the growing stock markets, and the decline of volatility until 1995 made it more and more difficult for TAA managers to add value. This short description indicates that the returns of TAA are episodic.
Returning to the management of pension funds, we note that the people at the top
sometimes have little investment knowledge. Their decisions about investment strategy,
however, are the most influential since they define SAA. At the bottom of the fund
hierarchy are the sophisticated asset managers. Their success is measured relative to
TAA and they try to generate excess returns (generally known as alpha) over the TAA
benchmark. Many empirical studies find that SAA is the most important determinant
of the total return and risk of a broadly diversified portfolio.
40 to 90 percent of returns are due to the SAA and therefore come from the top of the pyramid. Brinson et al. (1986) show that around 90 percent of the return variance stems from the passive investment part. Subsequent papers have refined these findings and estimate the importance of these returns at between 40 percent and 90 percent (see, for example, Ibbotson and Kaplan [2000]). Schaefer (2015), one author of the so-called professors' report on Norway's Government Pension Fund Global, states that the amount of active risk in the fund was very small: 99.1% of the variance was attributed to the benchmark return and only 0.9% to the active return.

Between 5 and 25 percent are due to the TAA and related to the Chief Investment Officer (CIO) function.

Between 1 and 5 percent are due to security selection, which takes place at the bottom of the pyramid.
2.2.3.1 Predictability
Definition 2.2.4. A return R_{t+1} is predictable by some other variable F_t if the expected return E[R_{t+1} | F_t] conditional on F_t is different from the unconditional expected return E[R_{t+1}].

We use the notation E_t(R_{t+1}) := E(R_{t+1} | F_t). When returns are not predictable, prices are said to follow a random walk.

Definition 2.2.5. Let S_t be the price of an asset in discrete time with the dynamics

S_t = S_{t-1} + m + W_t , \quad m \in \mathbb{R} , \; S_0 = s .   (2.17)


If the sequence of random variables (W_t) is identically distributed with mean zero, variance \sigma^2, and zero covariance cov(W_t, W_{t-1}) = 0, then S_t is a random walk with drift m.

A fair coin toss gamble defines a random walk with zero drift. It follows for a driftless random walk that E(S_t) = s for any t and var(S_t) = t\sigma^2. The fair coin toss is not predictable since at each date, knowing the history of the realized coin tosses does not help to predict the outcome of the next toss. The best guess of the future cumulated gain is the currently known gain.
The information F_t can be generated at each date by past returns, market information, or even private information. This allows us to state that a return is not predictable if, at any date, conditioning on the information set does not add any value in predicting the next return. If a return is not predictable, then the expected return is constant over time. But this does not mean that the return itself is constant over time! Therefore, predicting a return does not mean being able to tell today what the return value or level will be tomorrow. It means being able to state that there are periods in stock returns where the conditional expected return will be above or below the long-term or unconditional expected return.
Example - Martingales, skill, and luck

Assume that the random sequence R_t satisfies at each date

E^Q[R_{t+1} | F_t] = R_t   (2.18)

where the information set F_t is generated by all past returns from time 0 up to time t and the expectation is under a probability measure Q. If the random variables R are integrable, then the process R_t is called an F_t-Q-martingale. If R is a martingale, then whatever the information at time t, the conditional expectation of future returns equals the present value. This follows from the tower property of conditional expectation, that is,

E^Q[R_{t+s} | F_t] = R_t , \quad s \geq 0 ,   (2.19)

holds for a martingale. Taking unconditional expectations in the last equation,

E^Q[R_{t+1}] = E^Q[R_t] ,   (2.20)

shows that martingales are not predictable.


Martingales are key in asset pricing since the First Fundamental Theorem of Finance
states that the absence of arbitrage is equivalent to the existence of a probability
measure such that discounted prices are a martingale, see Section 3.10.1.


We consider the impact of fair games on long-term wealth growth. Consider an investor with initial capital v_0. In each period she invests 1 unit of her capital according to a strategy. The outcome of the strategy in each period is a gain of +1 with probability p or a loss of -1 with probability q = 1 - p. She does not change her strategy over time and the strategy is not backward looking, that is, we can describe the outcome with an IID sequence (X_k) of random variables. Her wealth after n periods reads

v_0 + \sum_{k=1}^{n} X_k .

What is the probability that the investor attains a wealth level v_f > v_0? We define the event

A_{v_0,n} = \left\{ v_0 + \sum_{k=1}^{n} X_k = v_f , \; 0 < v_0 + \sum_{k=1}^{m} X_k < v_f , \; m < n \right\} .

This event expresses that the investor reaches the desired wealth level in a finite number n of plays and that the investor does not go bankrupt before time n. The events (A_{v_0,n})_n are disjoint, since the wealth level v_f is reached for the first time at exactly one date. Therefore the probability p(v_0, v_f) that the investor reaches the desired wealth level v_f at some time is given by

p(v_0, v_f) = P\left( \bigcup_{n=1}^{\infty} A_{v_0,n} \right) = \sum_{n=1}^{\infty} P(A_{v_0,n}) .
A mathematical proposition states that p satisfies the following dynamics:

p(v_0, v_f) = p \, p(v_0 + 1, v_f) + q \, p(v_0 - 1, v_f) .   (2.21)

The probability of reaching the target wealth level is therefore equal to a weighted sum where the weights (probabilities) are given by the fairness of the game. The solution of this second-order difference equation follows by using the guess

p(v_0, v_f) = A + B r^{v_0} , \quad r = q/p ,   (2.22)

if q \neq p, with A, B two constants. The two constants are determined by the two boundary conditions p(0, v_f) = 0 and p(v_f, v_f) = 1. Summarizing, we get for v_f > v_0:

p(v_0, v_f) = \frac{r^{v_0} - 1}{r^{v_f} - 1} \;\; \text{if } p \neq q ; \qquad p(v_0, v_f) = \frac{v_0}{v_f} \;\; \text{if } p = q .   (2.23)

If the game is fair (a martingale), then the probability of reaching a 50 percent higher wealth level than the starting value of 100 units is 66%. If the investor's strategy has a small skill component such that q = 0.49 and p = 0.51, then the probability of reaching the

desired level is 98%!


If the same investors initial wealth level is ten times smaller, then the probability
in the fair game remains the same but in the skillful case the probability drops to 73%.
This shows that already a little amount of skill together with a high initial capital value
makes a big difference about the probability to reach a desired final wealth level.
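A minimal sketch of the closed-form solution (2.23), reproducing the three probabilities quoted above.

```python
def p_reach(v0, vf, p):
    """Probability of reaching wealth vf before ruin, starting at v0,
    with unit bets won with probability p; see equation (2.23)."""
    q = 1 - p
    if p == q:
        return v0 / vf
    r = q / p
    return (r ** v0 - 1) / (r ** vf - 1)

print(p_reach(100, 150, 0.50))   # fair game: 2/3, about 66%
print(p_reach(100, 150, 0.51))   # small skill: about 98%
print(p_reach(10, 15, 0.51))     # small skill, small capital: about 73%
```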
Predictability from a forecast point of view uses regressions of returns R on a variable x_t of the form

R_{t+1} = a + b x_t + \epsilon_{t+1}   (2.24)

with a, b constants and \epsilon_{t+1} a sequence of IID standard normal random variables. The variable x_t can be the return itself or a market price variable such as the price-dividend ratio. The regression (2.24) becomes a random walk, and therefore a non-predictable variable, if b = 0 or if a = 0, b = 1, and x_t = R_t.

The regression

R_{t+1} = R_t + \epsilon_{t+1}   (2.25)

is a random walk. Hence,

R_{t+1} = R_0 + \sum_{j=1}^{t+1} \epsilon_j , \quad E(R_{t+1}) = R_0 , \quad \sigma^2(R_t) = t \sigma^2 .

This shows that R is a martingale and that the variance increases over time. Therefore, even if the conditional expected return is equal to the constant expected return, the variance in the time series is not constant.
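A small simulation sketch of the random walk (2.25): the sample mean of R_t stays at R_0 while the sample variance grows linearly in t (standard normal shocks are an arbitrary choice).

```python
import numpy as np

rng = np.random.default_rng(1)
n_paths, T = 50_000, 100

# Simulate driftless random walks R_t = R_0 + cumulative IID shocks,
# with R_0 = 0 and sigma = 1.
shocks = rng.normal(0.0, 1.0, (n_paths, T))
paths = shocks.cumsum(axis=1)

print(paths[:, -1].mean())            # close to R_0 = 0
print(paths[:, -1].var(), T * 1.0)    # close to T * sigma^2
```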
Assume a = 0 and that b > 0 is high in (2.24), i.e. the stock return is predictable. If the signal x_t is large, then

E(R_{t+1} | F_t) = b x_t

is also large. Hence, you should buy the stock. But this is observed by many others, who also want to buy. This drives today's price up, which is the same as a decrease in the future return. Competition will therefore drive out any predictability in stock prices. In other words, the view of a single trader who observes an investment opportunity is different from how a market looks in equilibrium.
Example - Return predictability

Cochrane (2013) tests for lagged-return predictability by considering

R_{t+1} = a + b R_t + \epsilon_{t+1}   (2.26)

for US stocks and T-bills using annual data, see Table 2.1.

Object    b      t(b)   R^2     E(R)    \sigma(E_t(R_{t+1}))
Stock     0.04   0.33   0.002   11.4    0.77
T bill    0.91   19.5   0.83    4.1     3.12
Excess    0.04   0.39   0.00    7.25    0.91

Table 2.1: Regression of returns on lagged returns, annual data 1927-2008. t(b) is the t-statistic value and \sigma(E_t(R_{t+1})) represents the standard deviation of the fitted value b R_t (Cochrane [2013]).

The results show that stocks are almost not predictable while T-bill returns are. A value of b = 0.04 for stocks means that if returns increase by 10% this year, the expectation is that they will increase by 0.4% next year. Also, the R^2 is tiny and the t-statistic is below its standard threshold value of 2. For T-bill returns the story is different - high interest rates last year imply that rates this year will again be high with a high probability. Can this foreseeability of T-bills be exploited by a trader? Suppose first that stocks were highly predictable. Then one could borrow today and invest in the stock market. But this logic does not work for T-bills, since borrowing would mean paying the same high rate as one receives. To exploit T-bill predictability the investor has to change his behavior - save more and consume less today - which is totally different from the stock case. This is a main reason why one considers excess returns - the return on stocks minus the return on bonds - in forecasting:

R_{e,t} = R_{s,t} - R_{b,t} .   (2.27)

By analysing the excess return one separates the different motivations to consume less and to save from the willingness to bear risk. Table 2.1 shows that when considering excess returns we are back in the almost non-predictable stock case.
This example is the starting point for the topic of market efficiency. We will consider:

What can be said about predictability if we consider longer time horizons?

What happens if we replace lagged returns in the forecasts by market prices?

2.2.4 Private Investors

Private investors differ in many respects from SWFs and pension funds. First, the biggest
wealth generator for them is neither natural resources nor contribution payments, but
their human capital. Second, individuals traverse a particular life cycle. While they are
young, they only have human capital and very little financial capital. During their lives,
that human capital generates income, which is transformed into financial capital. At

retirement, most individuals stop using their human capital to generate financial capital
and start to consume the accumulated financial capital stock. SWFs and pension funds
do not have a particular date on which their regular income terminates - in a broader
sense, they are ageless.
A further characteristic of individuals is their strong dependence on real-estate risk with regard to their individual asset liability management. Since they do not have enough capital to buy real estate, they need mortgage financing. This leads to high leverage - that is to say, the ratio of the assets (real estate) to the existing capital is large. Even small changes in the asset's - the real estate's - value can eliminate the capital in the form of the residual value between the assets (real estate) and the liabilities (a mortgage). The risk of changes in asset values has two main drivers: interest rate risk and real-estate market price risk. While increasing interest rates impact the budgets of individuals during the whole financing period, a sharp decline in real-estate values leads to a sudden drop in asset values, possibly below the liability's value. The management or mismanagement of the asset real estate has been found to be one of the causes of many financial crises in the past.

Example Leverage of private investors


Consider a private investor who wants to buy a house with a current price of CHF 1 million. The golden rule of affordability in Swiss banking states that the investor needs to cover 20% of the house price with his own capital and that the interest rate charge for the mortgage should not exceed 1/3 of the regular income, where a prudent bank does not use the possibly low current interest rates but high possible interest rate levels to calculate the charge. We assume 5%, which means that the regular income of the investor must not be lower than CHF 3 x 0.05 x 800'000 = 120'000, which is lower than the assumed income of CHF 150'000. Suppose that the investor gets a mortgage with a fixed 5-year rate of 1%, which is a plausible number in 2016 due to the even negative CHF interest rates. He therefore has to pay, for the next 5 years and without any amortization payments, CHF 8'000 per annum for the mortgage of CHF 800'000 - a much lower price to pay for living compared to an individual renting the same object. Assume finally that the remaining liquid capital of the investor is CHF 100'000.

Then the leverage ratio of the investor is \lambda = 1'000'000 / 100'000 = 10, i.e. asset value over capital value. Consider two scenarios. First, interest rates are up in five years such that the investor then has to pay 3% for the interest rate charge. Second, house prices fall by 15% in the next five years. The first scenario implies that the investor has to pay CHF 24'000 per annum for the interest rate charge - up CHF 16'000 from the present level. Although this is three times more than at present, the new numbers should not force a default of the investor. In the second scenario, the house is only worth CHF 850'000. Since the investor must always cover 20% of the house price, the 80% mortgage cap means a mortgage value of CHF 680'000. This means, unless the bank decides otherwise, that the investor has to pay the difference between the old and new mortgage values of CHF 120'000. In present terms this is more than the remaining capital value. This indicates that real estate investment means large leverage ratios for private investors and that house price risk should be considered a more severe risk than interest rate risk.

2.2.5 Summary

SWFs' liabilities are (unexpected) future shocks which adversely affect the economy of a nation.

Pension funds' liabilities are the minimally defined pension plan amounts of the contributors.

Individuals' liabilities are defined by their consumption plans and planned bequests. The financing of liabilities is a mixture of individual investments and retirement provisions (the three pillars).
The three investor types also differ with regard to the investment products used to finance their liabilities. As a rule, the more professional investors are, the more they invest in cash products such as bonds and stocks. If necessary, they use the cash products to replicate more complex payoff profiles on their own. They do not make use of mutual funds or structured products, which wrap a possibly complex strategy into a single security. The resulting cost efficiency is the primary motivation for SWFs or large pension funds to invest in cash products. Individuals and smaller pension funds prefer mutual funds and structured products.
There are three main reasons for this. First, to gain reasonable diversification in their investments, investors need a capital amount that often encompasses their entire wealth. We show below that a Swiss investor needs about CHF 1.5 million in order to achieve reasonable diversification by investing in primary assets such as stocks and bonds. The second reason is that individuals may not have direct access to some markets. They cannot enter into short positions and are not allowed to trade derivatives under the International Swaps and Derivatives Association (ISDA) agreement. If they want to invest in such a profile, trading activities are needed; they are forced to buy derivatives in the packaged form of a mutual fund or a structured product. Finally, the unbundling of investment strategies to the level of cash products requires know-how and technology. For private clients and smaller institutional investors it is more profitable to outsource these trading activities and to invest instead in wrappers of the investment strategies - funds, derivatives, ETFs, etc.


2.3 The Efficient Market Hypothesis (EMH)

Predictability, see Section 2.2.3.1, is part of the broader concept of the Efficient Market Hypothesis (EMH).
Malkiel (2003):
Revolutions often spawn counterrevolutions and the efficient market hypothesis [EMH]
in finance is no exception. The intellectual dominance of the efficient-market revolution
has more been challenged [sic] by economists who stress psychological and behavioral elements of stock-price determination and by econometricians who argue that stock returns
are, to a considerable extent, predictable.
Lo (2007) describes the situation as follows:
The efficient market[s] hypothesis (EMH) maintains that market prices fully reflect
all available information. [...] It is disarmingly simple to state, has far-reaching consequences for academic theories and business practice, and yet is surprisingly resilient to
empirical proof or refutation. Even after several decades of research and literally thousands of published studies, economists have not yet reached a consensus about whether
markets - particularly financial markets - are, in fact, efficient.
Asness and Liew (2015) state:
The concept of market efficiency has been confused with everything from the reason
that you should hold stocks for the long run to predictions that stock returns should be
normally distributed to even simply a belief in free enterprise.
These statements lead us to ask: What is the EMH? Why does the EMH define a revolution? What does it mean for asset management? What do we know and what do we not know?^2

Definition 2.3.1. A financial market is efficient when market prices reflect all available information about value.

All available information includes past prices, public information, and private information. These different information sets lead to different EMHs (see below). Reflecting all available information means that financial transactions are zero-NPV activities. Financial market efficiency roughly means that markets work perfectly: investors form expectations rationally, markets aggregate information efficiently, and equilibrium prices incorporate all available information instantaneously.

^2 This section is based on Fama (1965, 1970, 1991), Cochrane (2011, 2013), Malkiel (2003), Asness (2014), Lo (2007), Nieuwerburgh and Koijen (2007), and Shiller (2014).


Unfortunately, while intuitively meaningful, the statement regarding reflecting all available information does not say what reflecting this information means. Suppose that a company announces it expects to earn twice as much as its earnings targets. Do stock prices double, triple, or fall by 20 percent? An equilibrium model of how security prices are set is needed. Efficiency testing means testing whether the properties of expected returns implied by the model of market equilibrium are observed in actual returns. This is referred to as the joint hypothesis problem (Fama [1970]). The EMH therefore has two pillars:

Pillar 1: Do prices reflect all available information - that is, are prices market efficient? Prices can only change if new information arrives. (The information content.)

Pillar 2: Developing and testing asset pricing models. The price formation mechanism. (The asset pricing model.)
Let R_{t+1} be an asset's return, F_M the information assumed to be used in the market to set the equilibrium price of the asset, and F the real information used in the market to form asset prices. Market efficiency means that the expected returns at t + 1 given the two information sets at time t are the same:

E(R_{t+1} | F_{M,t}) = E(R_{t+1} | F_t) .   (2.28)

The standard asset pricing equilibrium model of the 1960s assumed that the equilibrium expected returns are constant: E(R_{t+1} | F_{M,t}) = constant. If the EMH (2.28) holds, then

E(R_{t+1} | F_t) = constant

follows. To test the EMH, the regression of the future returns R_{t+1} on the known information F_t should have a zero slope. If this is not the case, the market equilibrium model could be wrong, or the definition of F_{M,t} overlooks information used in price setting - F_{M,t} and F_t are not equal.
Remarks

The EMH does not hold if there are market frictions: trading costs and the cost of obtaining information must, hence, be zero. In the US, reliable information about firms can be obtained relatively cheaply and trading securities is cheap too. For these reasons, US security markets are thought to be relatively efficient. Grossman and Stiglitz (1980) show that perfect market efficiency is internally inconsistent - if markets were perfectly efficient, there would be no compensation for gathering information, and hence no traders making markets efficient. Therefore, the level of efficiency differs across markets.

The EMH does not make any explicit statements about the rationality of investors. But to operationalize the EMH one often assumes rationality. Expressing the EMH by using conditional expectations means assuming that investors form expectations rationally. The rational form of the EMH is related to the random walk hypothesis (see below).

The EMH is applicable to all asset classes.

The EMH does not assume that all investors have to be informed, skilled, and able to constantly analyze the information flow. One can show that market efficiency is possible even if only a small number of market participants are informed and skilled. If prices aggregate all available information, then investors are not able to make risk-adjusted profits based on this information set (Jensen [1978]).

If the EMH holds true, then prices react quickly to the disclosure of information. The most efficient market of all is one in which price changes are completely random and unpredictable. In such markets Shiller (2014) states: "[...] there is never a good time or bad time to enter the market [...]"

The EMH is associated with the idea of a random walk with zero drift, which is then a martingale. Necessary for a price process to be a random walk is that the information flow is unimpeded and immediately absorbed in prices. The martingale property

E(S_{t+1} | F_t) = S_t , \quad \forall t ,   (2.29)

operationalizes the EMH's assertion that market prices fully reflect all available information.

For different types of information F, different forms of the EMH follow. Fama (1970) defines three different forms of market efficiency. In the weak-form EMH, the information used in the EMH is all available price information at a given date. Hence, future returns cannot be predicted from past returns or any other market-based indicator. This precludes technical analysis from being profitable. In the semi-strong EMH, the information used in the EMH is all available public information at a given date. In addition to price information, other data sources - including financial reports, economic forecasts, company announcements, and so on - matter. Technical and fundamental analyses are not profitable. In the strong-form EMH, the information used in the EMH is all available public and private information at a given date. This extreme form serves mainly as a limiting case - no type of investor can obtain an excess return, even with insider information.

The rational expectation EMH can be rewritten in the form

E(R_{t+1} - E(R_{t+1} | F_t) | F_t) = 0 .   (2.30)

Hence, the expected return equals the realized return on average. There are no systematic errors in predicting future returns that could be used to make extraordinary profits: the forecast errors are not predictable (compare Definition 2.2.4).


Since tomorrow's stock price S plus its dividend D equals the present stock price multiplied by the gross return, we can rewrite (2.30) as

S_t = \frac{E(S_{t+1} | F_t) + E(D_{t+1} | F_t)}{1 + E(R_{t+1} | F_t)} .   (2.31)

The expected return in the denominator has to be determined in a separate model. If the random walk hypothesis holds, (2.31) simplifies, and if expected dividends are assumed to be constant, the basic value equation follows:

S_t = \frac{D}{R} .   (2.32)

Empirical evidence shows that expected returns and dividends are both not constant over time. Therefore, (2.32) is too naive. More precisely, if the P/D ratio

\frac{S_t}{D_t} = \text{constant} ,

then the volatilities of the growth rates are the same:

\text{volatility}\left( \frac{dS_t}{S_t} \right) = \text{volatility}\left( \frac{dD_t}{D_t} \right) .

But the return volatility is around 16% while the dividend volatility is only about half this value (around 7%). Therefore something else must be time varying. Furthermore - a further volatility puzzle - the return volatility is itself time varying: monthly market return volatility fluctuated between values of 20% and more in market stress periods (the Great Depression, the Great Financial Crisis) and 2% in the 60s and mid-90s of the last century. We reconsider this issue after some examples in the next section.
Example

Even in efficient markets, investors can by chance alone outperform the market for a very long time. Assume that an investor has a 50 percent chance of beating the market in a given year. If one assumes that this investor's performance is IID, the chance of beating the market in each of the next 10 years is 0.5^{10}, about 0.1 percent. But if we consider 10,000 investors with the same performance rate, the probability that at least one of them will outperform the market in the next 10 years is 99.99 percent. This is similar to a lottery: the individual winning probability is virtually zero, but someone will almost certainly win.
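A short check of these probabilities (the manager count of 10,000 follows the text).

```python
# Probability that a skill-less manager beats the market 10 years in
# a row, and that at least one of 10,000 such managers does.
p_single = 0.5 ** 10                        # about 0.1%
p_at_least_one = 1 - (1 - p_single) ** 10_000
print(p_single, p_at_least_one)             # ~0.000977, ~0.9999
```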

Example

A well-known story tells of a finance professor and a student who come across a hundred-dollar bill lying on the ground. As the student stops to pick it up, the professor says, "Don't bother - if it were really a hundred-dollar bill, it wouldn't be there." The story illustrates well what financial economists usually mean when they say markets are efficient. Suppose that the student assumes that nobody tested whether the bill is indeed real, but that all individuals assumed that someone else checked whether the bill was real. If his or her belief is true, no efforts were made to generate the information needed to value the bill. But if nobody faced the costs of generating that information - checking whether the bill is real or not - then F_t is the empty set. But then the EMH cannot hold. This shows that a reasonable assumption about human behavior - replacing the belief "all assume that the bill is not real, otherwise someone would already have taken it" by the belief "all predecessors assumed that their predecessors verified whether the bill was real, but no one knows whether anybody checked it" - leads to a violation of the EMH.

Example
Suppose that firm X announces a new drug that could cure a virulent form of cancer. Figure 2.5 shows three possible reactions of the price path. The solid path is the EMH path: prices jump to the new equilibrium value instantaneously and in an unbiased fashion. The stock price should neither underreact nor overreact to the announcement. The dotted line represents a path where market participants overreact and the dashed one a path where they underreact. The dash-dotted line, where the new price is reached several days before the announcement is made, reflects insider trading, front running, or some other form of illegal trading.

Example
Assume that all investors follow a passive buy-and-hold investment strategy. Then prices would fail to reflect new information, which would generate profit opportunities for active investors, who would in turn improve market efficiency. If the EMH holds, should an investor then randomly pick stocks? No: the EMH does not state that investors' preferences are irrelevant in making investment decisions. There is still a need to optimize the portfolio. Since diversification results from an optimization program, randomly picking stocks does not, for example, provide a well-diversified portfolio.

Example
Consider the joint hypothesis that the EMH holds and that the CAPM is the
equilibrium price model. The CAPM states that the expected return for any security is


Figure 2.5: Possible price reactions as a function of the day relative to the announcement
of a new drug.

proportional to the risk - beta - of that security. The joint hypothesis is rejected in many studies, but which of the two (or both) is rejected? Either the EMH is true but the CAPM fails to accurately model how investors set prices; rational asset pricing academics favor this possibility that the CAPM is the wrong asset pricing model and that there are other risk sources not reflected by the market beta. Or the CAPM is correct but investors fail to apply it because of their behavioral biases or errors. Finally, both the EMH and the CAPM may be wrong.
Behaviorists think that markets are not efficient. Behavioral biases cause mis-pricings - that is to say, pricings not solely based on risk. Biases cause prices to move too strongly in both directions. For instance, investors over-extrapolate both good and bad news and thus pay too much or too little for some stocks, and simple price multiples may capture these discrepancies.

2.3.1 Predictions

The forecasts in Section (2.24) considered short time horizons and used past returns to predict future returns. We now consider longer time horizons and use market prices or yields to forecast returns. This section is based on Cochrane (2005).


Following the dividend/price (D/P) issue of the last section, we consider the return-forecasting regressions of Cochrane (2013) in Table 2.2. The regression equation reads

R^e_{t \to t+k} = a + b \, \frac{D_t}{S_t} + \epsilon_{t+k}    (2.33)

with R^e the excess return, defined as the CRSP³ value-weighted return less the three-month Treasury bill return. The return-forecasting coefficient estimate b is large and it grows with the horizon. Hence, high dividend yields (low prices) mean high subsequent returns and low dividend yields (high prices) mean low subsequent returns. The R² of 0.28 is large when we compare it with the R² of predicting stock returns on, say, a weekly basis, where returns are seen to be not predictable. Therefore, excess returns are predictable by D/P ratios.

Horizon   b      t(b)    R²     σ(E_t(R^e_{t+1}))   σ(E_t(R^e_{t+1})) / E(R^e_{t+1})
1 year    3.8    (2.6)   0.09   5.46                0.76
5 years   20.6   (3.4)   0.28   29.3                0.62

Table 2.2: Return-forecasting regressions, 1947-2009, annual data. t(b) is the t-statistic and σ(E_t(R^e_{t+1})) represents the standard deviation of the fitted value b D_t/S_t, i.e. σ(E_t(R^e_{t+1})) = σ(b D_t/S_t) (Cochrane [2013]).

³Center for Research in Security Prices at Chicago Booth business school.
The above tests are not stable in the following sense. First, the statistics depend on the sample size: the point estimate of the return-forecasting coefficient and its associated t-statistic vary significantly if different sample periods are considered. Second, the definition used for dividends impacts the results: if one, for example, adds repurchases to the definition of the variable D, then the statistics change.
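A minimal sketch of how the regression (2.33) is estimated by OLS (our own illustration: the series below are simulated placeholders, not the CRSP data behind Table 2.2):

import numpy as np

rng = np.random.default_rng(0)
n = 60                                           # annual observations, illustrative
dp = 0.04 + 0.01 * rng.standard_normal(n)        # placeholder dividend/price ratio D_t/S_t
re = 3.8 * dp + 0.15 * rng.standard_normal(n)    # placeholder excess returns R^e_{t+1}

X = np.column_stack([np.ones(n), dp])            # regressors [1, D/P]
coef, _, _, _ = np.linalg.lstsq(X, re, rcond=None)
a_hat, b_hat = coef

resid = re - X @ coef
r2 = 1 - resid.var() / re.var()
print(f"a = {a_hat:.3f}, b = {b_hat:.2f}, R^2 = {r2:.2f}")

With real data, the same algebra delivers the point estimates reported in Table 2.2; the sample dependence discussed above shows up as instability of the estimated coefficients across subsamples.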
If we take conditional expectations in equation (2.33), then

E_t(R^e_{t+1}) = a + b \, \frac{D_t}{S_t} .    (2.34)

Since the dividend/price ratio varies over time between 1 and 7 percent, saying that returns are predictable is the same as saying that expected returns vary over time. Using b = 3.8, a variation of D/P by 6 percentage points turns into a long-term variation of expected returns of 3.8 × 6 ≈ 22.8 percentage points. Given that the long-term average expected return is 7 percentage points, this variation is huge.
When we consider longer time horizons, the R² gets bigger but the t-statistics do not improve: long-term forecasts possess the same information as short-run forecasts - this explains why the statistics do not get any better for longer time horizons. The basic observation is that D/P is persistent, like interest rates. In the regression

\frac{D_t}{S_t} = a + 0.94 \, \frac{D_{t-1}}{S_{t-1}} + \epsilon_t    (2.35)


the value 0.94 shows the persistence of D/P. This persistence impacts the returns in (2.33): long-run return coefficients rise with the horizon, and dividend yields forecast returns more than one year ahead. This follows by iterating the equations: the long-term results follow mechanically from the short-term ones if the forecasting variable is persistent.
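A back-of-the-envelope check of this mechanism (our own arithmetic, assuming D/P follows an AR(1) with the persistence 0.94 from (2.35)): iterating the one-year regression forward implies a k-year coefficient of roughly b(1 - ρ^k)/(1 - ρ).

b, rho, k = 3.8, 0.94, 5                   # one-year slope and D/P persistence
b_k = b * (1 - rho ** k) / (1 - rho)       # implied k-year slope from iterating the AR(1)
print(f"implied {k}-year coefficient: {b_k:.1f}")   # ~16.9, the same order as 20.6 in Table 2.2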
When we analyze the regression of dividend growth, D_{t+k}/D_t replaces the return in (2.33). Cochrane (2013) states:

Returns, which should not be predictable, are predictable [see Table 2.2]. Dividend growth, which should be predictable, is not predictable. The point estimate of the dividend prediction is slightly positive, which is the wrong sign. The t-statistics and the R² are miserable, though, meaning this coefficient is zero for all practical purposes.
To provide an interpretation, assume that expected returns are constant, which is the traditional efficiency view. Assume that prices are falling relative to current dividends. Then, in this view, future dividends should also decline; that is, dividends have to be predictable, since they have to approach the low price levels. The above observation states that on average we observe a different pattern:
When prices decline relative to dividends, we see a higher return as prices slowly rebound, and there seems to be no expectation of changing future dividends. (Cochrane (2013))
Hence, returns are predictable because dividends are not.
Given this discussion about predictability - are the markets inefficient? Not necessarily. If the equilibrium asset pricing model implies time-varying expected returns, then predictability does not mean market inefficiency. We consider a model at this point and start with the Fundamental Asset Pricing equation (3.15),

S_t = E_t\left( \sum_{j=1}^{\infty} \left( \prod_{k=1}^{j} \frac{1}{R_{t+k}} \right) D_{t+j} \right) .    (2.36)
This states that the price of the asset equals the sum of all expected discounted dividends, where the discount factors are given by the time-varying expected returns in the future periods. Without much loss of generality, we consider a one-period model and use log variables, which turn the different ratios into differences. Using lower-case symbols for log variables, the one-period formula reads:

s_t - d_t = E_t(\Delta d_{t+1}) - E_t(r_{t+1}) .    (2.37)

If expected future dividends are higher, prices go up, and if expected returns rise, the price goes down. Therefore, a higher expected return in equilibrium corresponds to a lower price.


Predictability is related to the volatility of prices. Let S_t be the actual stock price and S*_t the ex post realized rational stock price. Shiller states that if prices are expected discounted dividends - S_t = E_t(S*_t) - then prices should vary less than their expected variables: \sigma^2(S*_t) > \sigma^2(S_t) holds for any random variable \epsilon_t with S*_t = S_t + \epsilon_t. But prices vary wildly more than they should even if we knew future dividends perfectly. This is the so-called excess volatility of stock returns pointed out by Shiller.
We claim that return predictability and excess volatility have the same cause: price-dividend volatility is in a one-to-one relationship with the return predictability observed in the above regressions. Consider equation (2.37). If expected dividend growth or expected returns were constant, then price-dividend ratios would also be constant. But since price-dividend ratios vary, investors' expectations of dividend growth or returns must vary through time. To obtain an equation for the variance, we first write regressions of returns and dividend growth on d_t - p_t, with b_r, b_d the respective coefficients. Plugging the regressions into (2.37) we get:
1 = b_r - b_d , \qquad 0 = \epsilon_{t+1,r} - \epsilon_{t+1,d}    (2.38)

where the residuals come from the two regressions. Therefore, the expected return can be higher if the expected dividend is higher or the initial price is lower. The only way the unexpected return can be higher is if the unexpected dividend is higher, since the initial price cannot be unexpected. Since a regression coefficient is a covariance over a variance, 1 = b_r - b_d reads:
\sigma^2(p_t - d_t) = \text{cov}(p_t - d_t, \Delta d_{t+1}) - \text{cov}(p_t - d_t, r_{t+1}) .    (2.39)

This shows that D/P ratios can only vary if they forecast dividend growth or forecast
returns in regressions. Since the difference between the two coefficients must be one
(2.38), if one coefficient is small in the regression then the other one has to be large.
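The identities (2.38) and (2.39) can be checked numerically. The following sketch (our own simulation; all processes are illustrative assumptions) imposes the one-period accounting relation r_{t+1} = \Delta d_{t+1} + (d_t - p_t) and recovers both identities:

import numpy as np

rng = np.random.default_rng(1)
T = 100_000
x = np.zeros(T)                                     # x_t = d_t - p_t, modeled as an AR(1)
for t in range(1, T):
    x[t] = 0.94 * x[t - 1] + 0.1 * rng.standard_normal()

dgrowth = 0.02 + 0.05 * rng.standard_normal(T - 1)  # unpredictable dividend growth
returns = dgrowth + x[:-1]                          # one-period accounting identity

slope = lambda y, z: np.cov(y, z)[0, 1] / np.var(z, ddof=1)  # OLS slope = cov/var
b_r = slope(returns, x[:-1])
b_d = slope(dgrowth, x[:-1])
print(f"b_r - b_d = {b_r - b_d:.3f}")               # ~1, as in (2.38)

pd = -x[:-1]                                        # p_t - d_t
gap = np.var(pd, ddof=1) - (np.cov(pd, dgrowth)[0, 1] - np.cov(pd, returns)[0, 1])
print(f"gap in (2.39): {gap:.2e}")                  # ~0: the variance identity holds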
To estimate the size of the predictability and the return variance, one has to extend the above model to many periods. Essentially, (2.37) is replaced by

s_t - d_t \approx E_t\left( \sum_{j=1}^{\infty} \rho^{j-1} \left( \Delta d_{t+j} - r_{t+j} \right) \right) , \qquad \rho = \frac{1}{1 + D/P} .    (2.40)

The more persistent r and d are, the stronger is their effect on the D/P ratio, since more terms in the summation matter. If dividend growth and returns are not predictable - meaning their conditional expectations are constant over time - then the D/P ratio is constant, which is not what we observe. This extension to many periods for the D/P ratio also holds for the variance equation (2.39), where the discounted summations enter in the return and dividend growth variables. As in the one-period model, the long-run return and long-run dividend growth regression coefficients must add up to one. By regressing the long-term return and dividend growth one finds [Cochrane (2013)]:


Return forecasts - time-varying discount rates - explain virtually all the variance of
market dividend yields, and dividend growth forecasts or bubbles - prices that keep rising
forever - explain essentially none of the variance of price.
This changes the classic view of the EMH. Traditionally, expected returns were assumed to be constant (the asset pricing model) and stocks were martingales with zero drift (random walks). In this reasoning, low D/P ratios occur when people expect declines in dividend growth, and variations in D/P are due entirely to cash flow news (dividend predictability). The above result states that the opposite is true: the variance of D/P is due to return news and not to cash flow news.
2.3.1.1 Bubbles

Often, bubbles are referred to as matters of fact - the housing bubble, for example. On this level of argumentation it is difficult to find an operational definition of a bubble, or even a measurable procedure; it is consequently impossible to know what we are talking about. One approach is the so-called rational asset price bubble. For Eugene Fama, a bubble is a situation in a speculative market where prices grow exponentially until they crash. This assumes in some sense that people buy just because they think they can sell to a greater fool. Such a rational bubble represents a violation of the transversality condition in the optimal investment program of an investor. Under this view, expected returns are always the same, so higher valuations do not make it more likely to see a low return.
Formally, one can add linearly to the many-period solution (3.1) of the fundamental asset pricing equation a second function of a particular type - the bubble function - such that the combination of the two, the expected values in (3.1) and the bubble, still solves the fundamental asset pricing equation. Bubble functions possess the property that their expected value explodes to plus or minus infinity as the forecast time tends to infinity. Summarizing, in an infinite horizon model, rational asset price bubbles are possible, but additional aspects of the economic environment can often rule them out.
Data speak strongly against this form of bubble (see Cochrane (2013)): higher valuations do correspond to lower returns. To understand the difference from behavioral finance bubbles, in which economic ideas are connected to psychology or sociology to define the phenomenon, consider how Robert Shiller (2014) defines a bubble:
A situation in which news of price increases spurs investor enthusiasm which spreads by psychological contagion from person to person, in the process amplifying stories that might justify the price increases and bringing in a larger and larger class of investors, who, despite doubts about the real value of an investment, are drawn to it partly through envy of others' successes and partly through a gambler's excitement.
While Fama makes no reference to any science other than the statistics of asset prices, Shiller uses the emotions of investors, the news flow, and the type of information media to define bubbles. The definition is not about irrationality but about how investors are buffeted en masse from one superficially plausible theory about conventional valuation to another (Shiller [2014]).

2.3.2 Importance of EMH for Asset Management

Passive Investing
Eugene Fama's work on market efficiency (1965, 1970) triggered passive investing, with the first index fund launched in 1971.
Active Management
If markets are efficient, buying and selling securities is a game of chance rather than one of skill, and active management is a zero-sum game. If the EMH holds, the variation of active managers' returns around the average is driven by luck alone. Often, strong past performers underperform in subsequent periods. Many studies found little or no correlation between strong performers in one period and those in the next. This lack of persistence supports the EMH. Figure 2.6 illustrates this issue.
Suppose that one is able to pick in advance those managers who will outperform the others. As per the EMH, investors would give them all their money; no one would select the managers doomed to underperform. But since not all active managers can outperform the market, this process would be self-defeating.
Technical and Fundamental Analyses
The same conclusion as for active management holds for technical and fundamental analysis - they are useless for predicting asset prices. Neither technical analysis, the study of past stock prices to predict future prices, nor fundamental analysis, the analysis of financial information such as company earnings to select undervalued stocks, generates higher returns than a randomly selected portfolio of individual stocks with comparable risk. The value of financial analysts lies not in predicting asset values but in analysing incoming information fast, such that the information is rapidly reflected in asset prices. In this sense, analysts support the EMH.
Benchmarks
In an efficient market, there is no method which results in persistently outperforming an appropriate benchmark.
Role of Investment Professionals
If markets are efficient, what role can investment professionals play? Their role is to find optimal portfolios for investors. This means understanding the preferences of the investor and his or her living circumstances, which include, for example, the tax environment. This is a many-dimensional problem where, for example, income, age, and the asset and liability structure matter.


Figure 2.6: Performance ranking of the top 20 equity funds in the US in the 1970s and in
the following decade. The average annual rate of return was 19 percent compared to 10.4
percent for all funds. In the following decade, the former 20 top funds had an average
rate of return of 11.1 percent compared to 11.7 percent for all funds (Malkiel [2003]).

2.3.3 Evidence for the EMH

At present, many scientists believe that stock prices are at least partially predictable. A
reason for this is the increasing importance of psychology and behavioral sciences in economics. Behavioral finance economists believe in the predictability of future stock prices
by using past stock price patterns and certain fundamental valuation metrics. But are
these patterns persistent? Schwert [2001] documents that many predictable patterns
seem to disappear once they are published. There is also a publication bias - significant
effects are published while negative results or boring confirmations of previous findings
are not published.
One explanation for the non-persistence of the patterns is researchers' data-mining activities. The power of analytical tools to analyse a data set from many different angles without huge effort makes it quite likely to find some seemingly significant but spurious correlation in the data: it is possible to generate almost any pattern out of most data sets.
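A small simulation makes the point concrete (our own sketch; all series are pure noise): testing 1,000 random "signals" against a random return series at the 5 percent level produces around 50 spuriously "significant" predictors.

import numpy as np

rng = np.random.default_rng(42)
T, n_signals = 250, 1_000
returns = rng.standard_normal(T)                  # pure noise - nothing is predictable

significant = 0
for _ in range(n_signals):
    signal = rng.standard_normal(T)
    r = np.corrcoef(signal, returns)[0, 1]
    t_stat = r * np.sqrt((T - 2) / (1 - r ** 2))  # t-statistic of the sample correlation
    if abs(t_stat) > 1.96:
        significant += 1

print(f"{significant} of {n_signals} random signals are 'significant' at the 5% level")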

2.3.3.1 Short-Term and Long-Term Momentum

Lo and MacKinlay (1999) find that short-run serial correlations are not zero and that the existence of too many successive moves in the same direction enables them to reject the hypothesis that stock prices behave as random walks. There is some momentum in short-run stock prices.
Even if the stock market is not a perfect random walk, statistical and economic significance have to be distinguished. The statistical dependencies are very small and difficult to transform into excess returns. Considering transaction costs, for example, will annihilate the small advantage due to the momentum structure (see Lesmond et al. [2001]).
The situation is different for long-term investment. If the small serial correlation is persistent over time, then it can accumulate to large long-term figures. This is confirmed by many studies. Fama and French (1988) document that 25 to 40 percent of the variation in long-holding-period returns can be predicted in terms of a negative correlation with past returns. Some behaviorists attribute this forecastability to stock market price overreaction: investors face periods of optimism and pessimism which cause deviations from the fundamental asset values (DeBondt and Thaler (1995)).
The results about long-run negative serial correlation in stock returns differ across studies and time periods. Return reversals for the market as a whole seem consistent with the behavior of many market participants: since interest rates are mean reverting, stock prices must rise or fall for stocks to remain competitive with bonds.
2.3.3.2 Accruals Anomaly

Sloan's (1996) accruals anomaly is one of the most closely studied topics in accounting. Academics are still discussing whether the anomaly really represents market mis-pricing, what causes it, and whether an investor can earn rents by trading on it.
Accruals are the piece of earnings made up by accountants; the other part of earnings is cash flows from operations. Sloan (1996) shows that one should trust the cash flows more, and he analyzes whether investors have figured this out. The answer is no - they focus, instead, on earnings.
Sloan performs several tests to answer the following questions. Are accrual-driven earnings of lower quality than cash flow-driven ones? Sloan considers whether high earnings are less persistent if they are driven by accruals. He confirms that high earnings driven by accruals are more likely to drop than earnings driven by cash flows. Do investors use information regarding accruals and cash flows to forecast the persistence of earnings? To answer this question, Sloan considers the subsequent stock returns earned by portfolios of firms with extreme earnings - driven by accruals and cash flows, respectively. Sloan (1996):


If investors understood that firms with high accruals were likely to have lower future earnings, then we shouldn't expect to see abnormal future returns for a portfolio of high-accrual firms. But if investors failed to heed the warnings offered by the high accruals, we would expect to see unusually low future returns to a portfolio of high-accrual firms.
The tests indicate that the highest-accrual portfolio has the lowest future return in the two following years. This is in line with the expectation that investors did not anticipate that future earnings would more likely be low for high-accrual firms.
Sloan (2011), Leippold and Lohre (2011), and Green et al. (2009) either reconsider the anomaly using more recent data or test whether it is indeed an anomaly at all. All these authors find that the strength of the anomaly in the US has decreased since 1996. Green et al. (2009): In this paper, we bring evidence to bear on these questions by studying the anomaly's current demise - namely, the observation that the hedge returns to Sloan's (1996) accruals anomaly have decayed in US stock markets to the point that they are no longer positive.
One explanation for this decrease of the accruals anomaly is hedge funds trading on it.
2.3.3.3 Value Stocks

Stocks with low P/E or P/B multiples (value stocks) provide higher rates of return than stocks with high multiples. Asness and Liew (2014) discuss what drives the value factor - risk (rationality) or behavior?
Consider an HML (high minus low) trading strategy: being long a diversified portfolio of cheap US stocks (low P/B ratios) and short a portfolio of expensive stocks (high P/B ratios) defines the value strategy. This strategy has done well over long time horizons: for the last 85 years the return has been around 400 percent. But the strategy also suffered heavy losses in the Great Depression of the 30s and in the 90s, and the return has remained almost unchanged in the last 10 years.
If we consider value to be a sustainable risk factor, then the risk must not be diversifiable. In the tech bubble, both cheap and expensive stocks got cheaper at the same time, independent of the idiosyncratic risk type. It seems to be the norm that cheap assets and expensive assets tend to move in parallel. This observation is not a proof that value is a risk factor, but it is consistent with a rational, risk-based explanation. To challenge the risk perspective, consider the many offerings that are long value stocks and short growth stocks. If value has a rational basis, then there has to be a market for the opposite trade. But such an offering is by and large not seen in the markets.
Asness and Liew (2014) believe that a significant amount of the efficacy of value strategies is behavioral. They point to the coincidence of the time horizon investors choose for performance evaluation and the asset cycle over which assets become cheap and expensive - three to five years. Hence, investors act like momentum traders over this horizon. This behavior leads to mis-pricing, or inefficiency, in the direction of value.

2.3.3.4 The Performance of Professional Investors

Direct tests of professional investors' performance represent compelling evidence on market efficiency. A large part of the evidence suggests that they are not able to outperform index funds. Jensen (1969) found that active mutual fund managers were unable to add value and underperformed the market by roughly the fees charged; see Section 4.3.4.3 for a discussion of luck and skill in mutual fund and hedge fund management.

2.4 Wealth of Nations

Will the generation of wealth be scarce or abundant? The generation of wealth is the raw material of asset management. Figure 2.7 shows the relative distribution of wealth worldwide over the last 2000 years in absolute and relative terms.

Figure 2.7: The territory size shows the proportion of worldwide GDP that was produced in that territory in the years in question. GDP is measured in USD equalized for
purchasing power parity. In each chart the total wealth level in USD is shown. 1 AD
means the year 1 anno Domini in the Julian calendar (worldmapper.org).
In the period from 1 AD to 1500 AD, the distribution of wealth was largely proportional to the population. This reflects the importance of the factor labor


and the relatively minor differences in technology across territories. The picture changed in the following centuries up to 1900: Europe and northern America dominated the rest of the world. This picture changed only moderately until the end of colonialism in the 1960s: China, India, and Africa lost - in proportional terms - size compared to the combination of the Old World and northern America, due to the latter grouping's economic and technological dominance. The last 20 years show that Japan has reached a turning point, that China and India have increased their size, and that Europe is losing ground relative to the other regions. In absolute values, it took 400 years to double worldwide GDP from USD 1 trillion to USD 2 trillion (1500-1900), but it took only 30 years to more than triple that value from USD 8 trillion in 1960 to USD 27 trillion in 1990.
GDP is not assets, nor is it assets under management (AuM). But the growth rate of GDP is a main generator of assets and asset growth. Assets under management is the market value of the assets that an investment company manages on behalf of investors. The AuM figure is often used as a growth indicator for comparing asset managers. Since profitability varies heavily across different types of assets, AuM should be used only with caution to draw any conclusion about asset managers' profitability. There are also widely differing views regarding what AuM means. AuM is, for example, reported to consultants and clients in Global Investment Performance Standards (GIPS) compliant performance presentations, company profiles, and a variety of industry surveys.
PwC (2015) estimates that global AuM will exceed USD 100 trillion by 2020, up from USD 64 trillion in 2012. Other consulting firms estimate similar figures using their own models. The PwC figures imply an annual compounded growth rate of about 6 percent (a quick check follows below). The growth rate will differ across geographic regions. The projections of the growth rate until 2016 were (Boston Consulting Group [2012]):
Old World, northern America, Japan: 1.7% p.a.
South America, BRIC states, Middle East, Eastern Europe: 9.7% p.a.
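As a quick check of the 6 percent figure (our own arithmetic, assuming eight years of compounding between 2012 and 2020):

g = \left( \frac{100}{64} \right)^{1/8} - 1 \approx 5.7\% \text{ p.a.}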
The growth of wealth shows that the raw material for asset management services exists. The different growth rates define opportunities for asset managers in developed countries to offer solutions in faster-growing markets. Therefore, market access will play a prominent role in the evolution of asset management. Considering data published by consulting firms always means facing data risk, since the data are not public and the conclusions of the consultants cannot be verified or replicated by a third party.
The growth of wealth is per se not beneficial to societies if other characteristics like
inequality of wealth distribution are also growing. Growth of wealth in a nation accompanied by a parallel growth of wealth inequality generates social and political instability.
But economic growth is an important, even necessary, condition for overcoming societal
risks: the growth of wealth in recent decades was the major reason why poverty diminished globally at a rate that has never been observed before in history.


The global wealth projections of PwC (2015) result in AuM projections for different
types of investors, as shown in Table 2.3.
Clients                  2012, USD tr.   2020, USD tr.   Growth rate p.a.
Pension funds            33.9            56.5            6.5%
Insurance companies      24.1            35.1            4.8%
Sovereign wealth funds   5.2             8.9             6.9%
HNWIs                    52.4            76.9            4.9%
Mass affluent            59.5            100.4           6.7%

Table 2.3: There is double counting in these figures. Assets of the mass affluent and high net worth individuals (HNWIs) will be invested with insurance companies and pension funds. Mass affluent refers to individuals with liquid assets of between USD 100,000 and USD 3M. HNWIs possess liquid assets of USD 3-20M. The categorization is not unique (PwC [2015]).
According to the PwC report, mass affluent clients and HNWIs in emerging markets are the main drivers of AuM growth. The global middle class is projected to grow by 180 percent between 2010 and 2040, with Asia replacing Europe as home to the highest proportion of the middle class as early as 2015 (OECD, European Environment Agency, PwC [2014]). The growth of pension funds will be large in countries that have fast-growing GDPs and weak demographics and that use defined contribution schemes.

2.5 Who Decides?

Investors can decide about their investments themselves or delegate the decision to a third party. This decision has, in any case, far-reaching consequences, since it automatically brings an extensive regulatory framework into play. Subsequently, we will focus on some issues of the MiFID II regulatory framework and on that framework's impact on client and intermediary channel segmentation.
Decision-making today has to comply with many more regulatory standards than in
the past. Regulation defines constraints and rules for decision-making, but it never sets
the goals for business. Even several years after the outbreak of the GFC, many banks and
their asset management divisions did not have a fully strategic response to the ensuing
regulatory changes but rather adopted a stand-alone approach to each new regulatory
element - that is, mastering capital requirements, balance sheet, consumer protection,
and market regulation. Leading international banks were the first to integrate the regulatory program into their strategic planning and to deploy resources rapidly to meet
emerging activities. Figure 2.8 illustrates the avalanche of regulations and their time line
of implementation.4


Figure 2.8: Regulatory initiatives and their implementation time line. See the footnote
for the description (UBS [2015]).

Regulation has both a strategic and an operational impact on asset management, with different strengths for the different regulatory initiatives. UCITS, EMIR, and MiFID II have a high operational impact. The product information documents (PRIIPs) and the benchmark regulation MAD II have a low strategic impact. MiFID II, FIDLEG, the Volcker Rule, and the Dodd-Frank Act have a high strategic impact.

⁴PRIIPs are the Packaged Retail Investment and Insurance-based investment Products documents, and UCITS is the Undertakings for Collective Investment in Transferable Securities Directive for collective investments by the European Union. Obligations for central clearing and reporting (EMIR, Dodd-Frank) and higher capital requirements for non-centrally cleared contracts (CRR), as well as the obligation to trade on exchanges or electronic trading platforms, are considered by revising MiFID, the so-called Markets in Financial Instruments Regulation (MiFIR). US T+2 means the realization of a T+2 settlement cycle in the US financial markets for trades in cash products and unit investment trusts (UITs). FIDLEG is part of the new Swiss financial architecture, which should be equivalent to MiFID II of the euro zone. In 2013, following the LIBOR and EURIBOR market-rigging scandals, the EU Commission published a legislative proposal for a new regulation on benchmarks (Benchmark Regulation). The Asia Derivative Reform mainly focuses on the regulation of OTC derivatives and should therefore be compared with EMIR and the Dodd-Frank Act. The Market Abuse Directive (MAD) of 2005 and its update MAD II resulted in an EU-wide market abuse regime and a framework for establishing a proper flow of information to the market. BCBS considers principles of risk data aggregation and reporting by the Basel Committee on Banking Supervision. The Comprehensive Capital Analysis and Review (CCAR) is a regulatory framework introduced by the Federal Reserve in order to assess, regulate, and supervise large banks and financial institutions. EU FTT means the EU Financial Transaction Tax. IRS 871(m) are regulations of the IRS about dividend equivalent payment withholding rules for equity derivatives. CRS are the Common Reporting Standards of the OECD for the automatic exchange of bank account information.


This ability of the most severely hit banks in the crisis to comply with regulation faster and in a more profitable way than smaller banks leads to competitive advantages. First, the large banks could start to focus on business earlier, due to their strategic responses to regulation. Second, the large pools of know-how in internationally active banks allow them to participate actively in the technological change known as FinTech - a second game changer in asset management in the coming years. Finally, the size of their balance sheets and revenue opportunities make the big banks almost invulnerable, despite the many large fines imposed on them due to a number of scandals in recent years.

Example - Impact of Regulation on the Swiss banking sector
The absence of the above-mentioned advantages for smaller intermediaries impacts the structure of regional banking sectors. It is estimated that of the approximately 300 Swiss banks in 2014, about one-third will stop operating as an independent brand. A KPMG study from 2013 (KPMG [2013]) summarizes:
A total of 23 percent of Swiss banks faced losses in 2012. All of them with AuM
of less than CHF 25 billion.
Non-profitable banks in 2012 were mostly not profitable in previous years too.
Dispersion between successful banks (large and small ones) and non-performing
banks (small ones) is increasing.
The performance of small banks is much more volatile than that of larger ones.
Changes of business model in large banks seem to be successful.
A total of 53 percent of the banks reported negative net new money (NNM).

Many of the regulatory initiatives launched in recent years relate to asset management and trading. We consider the eurozone. The Alternative Investment Fund Managers Directive (AIFMD) mainly acts in the hedge fund sector, whereas the Undertakings for Collective Investments in Transferable Securities (UCITS) are the main approach for the fund industry. The European Market Infrastructure Regulation (EMIR) regulates the OTC derivative markets, and the Packaged Retail and Insurance-Based Investment Products (PRIIPs) initiative is responsible for the key information for retail investors in the eurozone. The EU's Markets in Financial Instruments Directive (MiFID II) provides harmonized regulation for investment services across the member states of the EU, with one of the main objectives being to increase competition and consumer protection in investment services. In the US, the Dodd-Frank Act is the counterpart of many of these European initiatives.


The regulatory initiatives place greater demands on asset managers and their service
providers. They enforce changes in customer protection, service provider arrangements,
regulatory and investor disclosure, distribution channels, trade transparency and compliance and risk management functions (PwC [2015]).

2.5.1 MiFID II

The directive MiFID II implements the agreement reached by the G20 at the 2009 Pittsburgh summit, in the eurozone and for all non-EU financial intermediaries offering investment products in the eurozone. MiFID II has the following goals:
The creation of a robust framework for all financial market players and financial
instruments.
Improving the supervision of the various market segments and market practices, in
particular OTC financial instruments.
Strengthening market integrity and competition through greater market transparency.
Harmonization and strengthening of regulation.
Improving investor protection.
Limiting the risks of market abuse in relation to derivatives on commodities, in
particular for futures of essential goods.
The main elements of these investor protection themes are:
Inducements. The need to disclose independent versus non-independent status of
advice and the prohibition for discretionary managers and independent advisers to
be involved in inducements.
Product governance. The manufacturers product approval process has to include
the target market definition which has to be taken into account by the distributors
and which has to be tracked by the asset managers.
Suitability and appropriateness. All investment firms operating in EU countries are required to provide clients with adequate information for assessing the suitability and appropriateness of their products and services, and to comply with best execution obligations. We note the expanded definition of so-called complex products, which affects the possibility to distribute such products to retail and execution-only customers.
Client information. Enhanced requirements related to information to be shared
with clients, both regarding content and method such as in particular costs and
charges for services or advice.


The regulation involves enormous administrative and political work: it requires passing
of 32 acts of law by the European Commission, 47 regulatory standards, 14 performance
standards, and 10 policy packages.
In the eurozone, suitability and appropriateness have to follow client segmentation
and intermediation segmentation (see Figure 2.9). This segmentation applies to all EU
and all non-EU banks offering investment products in the zone.

Figure 2.9: Client segmentation and intermediation segmentation as per MiFID II.
Intermediation Channel Segmentation
Execution only: Investors decide themselves and investment firms only execute
orders. To find out which services are appropriate for an investor using technology
from the investment firm, an appropriateness test is needed.
Advisory: Investors and investment firm staff interact. While relationship managers
or specialists advise the investor, the investment decision is finally made or approved
by the investors themselves. Advisory was the traditional intermediation channel
before the financial crisis of 2007.
Mandate: The investor delegates the investment decision in a mandate. The mandate contract reflects the investor's preferences. The portfolio manager chooses investments within the contracted limits. Many banks and asset managers motivate their clients to switch from the advisory to the mandate channel. The main reasons for this are lower business conduct risk and better opportunities for automation. These reduce production costs and enhance economies of scale.


Investors' preferences, skills, and financial situation are the same in all three channels. But the investment firm's knowledge of which products are suitable and appropriate varies across the different channels. Also, transparency and profitability differ across the three types of intermediation. Many investors will not act in only one single intermediation channel but will, for example, choose a mixture of mandate and execution only. This defines a challenge for the financial intermediary, since its duties and risks differ across channels. Intermediaries have, for example, to make sure that they do not advise a client when that client is deciding in an execution-only manner.
Client Segmentation
Investment firms must define written policies and procedures according to the following categorization:
Eligible counterparties such as banks, large corporates, and governments.
Professional clients. A professional client possesses experience, knowledge, and
expertise with which to make his or her own investment decisions and properly
assess the risks thus incurred.
Retail clients (all other clients).
The old-style approach that uses wealth as a single variable for the classification of clients
is no longer applicable. Clients can both opt up and opt down - that is, choose a less
or more severe protection category than the bank itself would define. Suitability and
appropriateness requirements are defined in each cell of the 3x3 segmentation matrix
(Figure 2.9). Client suitability addresses the following six points:
1. Information on clients
2. Information provided to clients
3. Client knowledge and experience
4. Financial circumstances of the client
5. Investment objective
6. Risk awareness and risk appetite
These six points reflect the parameters that define the optimization problem of a rational economic investor. To determine the preferences of an investor, one needs general information about the investor (1) and the specific risk attitudes (6), which both enter into the objective function (5). The optimization of the objective function leading to the optimal investment rule is carried out under various restrictions: the budget restriction (4) and restrictions on admissible securities due to their complexity or the experience of the investor (3). Tax issues, legal constraints, and compliance issues also enter into the restriction set and require information to be provided to the client


(2). These six points are therefore sufficient for the investor to determine his or her optimal investment strategy. The implementation of the six points from an economic perspective is a challenging task.
Example
Consider an investor with the following initial portfolio:
One actively managed mutual fund on equities and bonds.
Apple stocks and Swiss Government bonds.
Call options on Novartis stock.
How do we measure the risk of this portfolio, and how do we evaluate whether the risk is suitable and appropriate for the investor? Suppose that the investor intends to invest additionally in a barrier reverse convertible structured product on the S&P 500, SMI, and Eurostoxx 50. How would this addition change the original portfolio risk profile objectively, are the risks within the risk acceptance set of the investor, and how are the risk figures perceived by the investor for decision making? Objective portfolio risk is given by the combination of many different financial products. Products can differ in their economic risk profile (linear versus non-linear payoffs), in their risk sources or factors and dynamics, and in their transparency (a single stock versus a mutual fund consisting of many constituents). The calculation of objective risks over time as an input for portfolio suitability needs economics and technology; see Chapter 4 for the latter. The idea is to represent each financial product as a linear combination of a set of factors, such as international equity, equity factors for different sectors, interest rate factors for different maturities and currencies, etc. Summarizing, the portfolio value V of the different security positions is expanded in the basis of the list of factors F:
V of the different security positions is expanded in the basis of the list of factors F :
V =

Positions
X
i=1

i Si =

Positions
X Factors
X
i=1

i i,j Fj + j

j=1

with S the price of the assets, the weights and  the idiosyncratic decomposition risk.
Two questions are immediate. First, how can one appropriately expand the non-linear payoff of an option linearly into a set of factors? To achieve this, one uses a delta approximation for any non-linear product. But then the weights change from day to day as the underlyings move. Therefore, the weights are calculated on a regular basis - by, say, a stock exchange - such that changes in the values of the derivatives are captured adequately. Second, how can this decomposition be used for risk management? Using the above portfolio value difference between two consecutive dates allows one to calculate any risk figure, such as value-at-risk or expected shortfall. To achieve this, the return of all factors needs to be modeled for any investment horizon. Not only total portfolio risk can be calculated; the individual risk contribution of each position is also attainable. This shows the investor the marginal contribution of each position to the total risk figure. We consider this in more detail in the exercises.
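A compact sketch of how such a factor decomposition feeds into risk figures (our own illustration; the factor structure, exposures, and volatilities are invented for the example, and non-linear products would enter through their delta-approximated exposures):

import numpy as np

rng = np.random.default_rng(7)
n_days, n_factors = 1_000, 3                      # e.g. equity, interest rate, FX factors
factor_returns = rng.multivariate_normal(
    mean=np.zeros(n_factors),
    cov=np.diag([0.010, 0.004, 0.006]) ** 2,      # daily factor volatilities, squared
    size=n_days,
)

# positions x factors exposure matrix (the weights alpha are folded into beta here)
exposures = np.array([
    [0.8, 0.1, 0.0],   # mutual fund
    [1.2, 0.0, 0.0],   # single stock
    [0.0, 0.9, 0.2],   # government bond
])

position_pnl = factor_returns @ exposures.T       # daily P&L per position
portfolio_pnl = position_pnl.sum(axis=1)

var_99 = -np.quantile(portfolio_pnl, 0.01)        # one-day 99% value-at-risk
print(f"portfolio 99% VaR: {var_99:.4f}")

# marginal (Euler) contributions: cov(position, portfolio) / sigma(portfolio);
# they add up to the portfolio volatility, giving each position's share of total risk
cov_pf = np.array([np.cov(position_pnl[:, i], portfolio_pnl)[0, 1] for i in range(3)])
contrib = cov_pf / portfolio_pnl.std(ddof=1)
print("volatility contribution per position:", np.round(contrib, 5))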


Product suitability consists of requirements that ensure that the product or service
is suitable for the client:
1. Specific service-/product-related restrictions
2. Adverse tax impact
3. Requirements for prospectuses
4. Disclaimer
These requirements become less demanding the more experienced the client is. Summarizing, suitability means that a pure ad hoc type of advisory - one that does not consider the investor's preferences, the investor's match to the products and services, and the investor's circumstances - is no longer feasible. Suitability in advisory services requires qualified staff and an appropriate incentive structure in the asset management firm.

2.5.2 Investment Process for Retail Clients

How are the investor's preferences elicited, transformed into investment guidelines, and managed over time for retail clients? Figure 2.10 illustrates an investment process. Given the client's needs, his or her preferences are compared with the CIO view and its transformation into CIO portfolios. This comparison defines the theoretical client portfolio. Using the securities from the producers, the theoretical portfolio is transformed into the (real) client portfolio. Life-cycle management controls the evolution of the client portfolio over its life cycle and compares the risk and return properties with the initially defined client profile. If necessary, this process sends warnings or required-action messages to the client and/or advisor. A CIO view typically consists of several inputs, such as a quantitative model, a research macro view, and a market view. Smaller institutions do not have the resources to provide all these inputs; they then buy the CIO view from another bank.
Traditionally, intermediaries use questionnaires to reveal investors' preferences. This approach has several drawbacks:
Reliability. It is difficult to test to what extent the investor understands the questions.
Zero emotion. Questions are by definition drawn up in a laboratory environment.
Restricted offering. Due to the low degree of automation, the solutions offered cannot consider individual preferences on a fine level.
Missing scenarios and consequences.
Life-cycle management, when investment circumstances are changing, is difficult to handle.
Time and place dependence.


Figure 2.10: An investment process. The three channels from left to right are the client
- advisor channel, the investment office, and the producers of the assets or portfolios
(trading and asset management).
Missing economies of scale for intermediaries; lack of control standards.
Current technologies make it possible to use scenario engines to obtain a more reliable client profile, to generate a more refined portfolio offering, to set up more comprehensive and real-time life-cycle management (portfolio surveillance; reporting), and to make some steps in the whole investment process scalable for the intermediary.
New trends in technology allow the process outlined in Figure 2.10 to be reshaped. In extremis, there will be no need for an investor to disclose his or her investment preferences, since the data already exist in the virtual world. If, furthermore, the investment views are formed in a fully automated manner using publicly available data, then the functions both of advisors and of the CIO will become superfluous.
These approaches fall under the labels big data and FinTech. We will discuss the meaning of these expressions below. For the time being, whatever big data and FinTech mean exactly, we will discuss two main scenarios.

Example - FinTech and Big Data


Disruption scenario: Big data defines a threat to traditional financial intermediaries, since new entrants have access to comprehensive data regarding investors, which allows them to capture the preferences of investors more accurately. We show in Chapter 3 that optimal investment advice requires knowing how investors value their present and future consumption. Therefore, financial intermediaries face the risk of losing the point-of-sale part of their value chains to firms such as Google, Alibaba, or one of the many new firms that have become established in this domain - firms that can integrate investor consumption data into the investment process. In this scenario, which is referred to as digital disruption, services and products for front-end consumers will be generated using new technologies, replacing old technologies completely. Accenture (2015) states that investments in FinTech firms active in this scenario tripled from USD 930 million in 2008 to USD 3 billion in 2013; McKinsey (2015) adds the 2014 figure: USD 9 billion.
Redesign scenario: The second scenario is called digital re-imagination. Here, banks or asset managers use new technologies to redesign their work flows: the ownership of the front end remains within the banking firm. The level of current investment in the re-imagination channel matches that in the disruption scenario, implying that both scenarios are still feasible. But the prevailing belief about how technology can be used to redefine asset management value chains has significantly changed in recent years. The very possibility that the disruptive approach could be successful was still being denied only a few years ago: banking in general, and asset management in particular, were considered to be too complex, too risky, and too controlled in regulatory terms for new entrants to succeed.

2.5.3 Mandate Solutions for Pension Funds

This section follows Lanter (2015). Figure 2.11 illustrates the investment decision process for a pension fund.
Asset liability management (ALM) is the first step, where a pension fund typically uses external support from consultants. The outcome of this analysis is a transparent picture of the present assets and liabilities and how they might change in the future due to the various risk factors. The fulfilment of the pension fund's long-term goals, based on this analysis, defines the strategic asset allocation - the allocation which should be stable through the possible future economic and financial market cycles. The bets, that is, the tactical asset allocation (TAA), are the next step to define. Here the pension fund needs to decide whether to delegate the TAA to external portfolio managers in the form of a mandate or whether to keep the asset management inside the fund. Further issues are the selection of the benchmark and the fixing of risk-based ranges for the tactical asset allocation. Having decided whether the investment decisions are outsourced via a mandate or not, one next has to decide whether the functions of reporting, administration, and risk controlling of the investment portfolios should also be outsourced. As in the


Figure 2.11: Process for a mandate in a pension fund (Lanter [2015]).


case of the investment decision, requests for proposal are used to select the best-suited outsourcing partners. The whole process of investment decision outsourcing is done with the involvement of external consultants. Goyal and Wahal (2008) estimate that 82 percent of US public pension plans use investment consultants.
We discuss in Section 4.12 that the extensive use of investment consultants is by no means free of conflicts, both for the performance of the delegated investments and for the selected asset managers. Critics, for example, often accuse consultants of being drivers of new investment strategies that turn out to be more complex (hence more difficult to handle and understand, and also more expensive) than the ones actually in use, while it is not clear whether they lead to a larger performance.
The other steps in the process, as illustrated and described in the last figure, are
evident.

2.5.4 Conduct Risk

The largest risk for investment firms is conduct risk in the investment process. Conduct
risk comprises a wide variety of activities and types of behavior that fall outside the other
main risk categories. It refers to risks attached to the way in which all employees conduct themselves. A key source of this risk is the difficulty of managing information flows,
their impact, their perception, and responsibilities in an unambiguous way. Consider
an execution-only investor who does not understand a particular statement in a given


research report. Can the relationship manager help the execution-only investor without entering into conflict with his or her execution-only status - that is, help without advising? To hedge their conduct risk sources, investment firms are forced to work out detailed and well-documented processes concerning the information flow between themselves and the customer. While this paperwork may be effective as a hedge against conduct risk, its efficiency is questionable.
Example
The Financial Stability Board (FSB) stated in 2013: One of the key lessons from
the crisis was that reputational risk was severely underestimated; hence, there is more
focus on business conduct and the suitability of products, e.g., the type of products sold
and to whom they are sold. As the crisis showed, consumer products such as residential
mortgage loans could become a source of financial instability. The FSB considers the
following issues key for a strong risk culture:
Tone from the top: the board of directors and senior managers set the institution's core values and risk culture, and their behaviour must reflect the values being espoused.
Accountability: successful risk management requires employees at all levels to understand the core values of the institution's risk culture. They are held accountable for their actions in relation to the institution's risk-taking behaviour.
Effective challenge: a sound risk culture promotes an environment of effective challenge in which decision-making processes promote a range of views, allow for testing of current practices, and stimulate a positive, critical attitude among employees and an environment of open and constructive engagement.
Incentives: financial and non-financial incentives should support the core values and risk culture at all levels of the financial institution.

Conduct risk is a real source of risk for investment firms: fines worldwide amounted to more than USD 100 billion for the period 2009-2014. These fines and the new regulatory requirements raise serious profitability concerns for investment firms and banks (see Figure 2.8). But there is more than just financial cost at play for the intermediaries. A loss of trust in large asset managers and banks can prove disastrous, in particular if new entrants without any reputational damage can offer better services thanks to FinTech.

Example - Fines in the UK


Figure 2.12 shows the evolution of the fines imposed by the British regulatory authorities.


Figure 2.12: Table of fines imposed in the UK (FSA and FCA web pages).
In the US, enforcement statistics from the Securities and Exchange Commission (SEC) show an increase in enforcement actions in the categories investment advisor and investment company of roughly 50 percent following the financial crisis of 2007. Compared to the pre-crisis figures of 76 and 97 cases per year, respectively, the years 2011-2014 returned respective figures of 130 and 147 cases.

Example - Hedge fund disclosure


Patton et al. (2013) show that disclosure requirements for hedge funds are not sufficient to protect investors. The SEC, for example, requires US-based hedge funds managing over USD 1.5 billion to provide quarterly reports on their performance, trading positions, and counterparties. The rules for smaller hedge funds are less detailed. One therefore has to care seriously about the quality of the information disclosed.
We consider monthly self-reporting of investment performance, where thousands of individual hedge funds provide data to one or more publicly available databases which are then widely used by researchers, investors, and the media.
Are these voluntary disclosures by hedge funds reliable guides to their past performance? The authors state:


... track changes to statements of performance in vintages of these databases


recorded at different points in time between 2007 and 2011. In each such vintage, hedge
funds provide information on their performance from the time they began reporting to
the database until the most recent period.
Vintage analysis refers to the process of monitoring groups and comparing performance across past groups. These comparisons allow deviation from past performance to
be detected. The authors find
that in successive vintages of these databases, older performance records (as far back
as 15 years) of hedge funds are routinely revised: nearly 40 percent of the 18, 382 hedge
funds in the sample have revised their previous returns by at least 0.01 percent at least
once, and over 15 percent of funds have revised a previous monthly return by at least
1 percent. These are very substantial changes, given the average monthly return in the
sample period is 0.64 percent.
Less than 8 percent of the revisions are attributable to data entry errors. About
25 percent of the changes were based on differences between estimated values at the
reporting dates for illiquid investments and true prices at later dates. Such revisions
can be reasonably expected. In total, 25 percent (50%) of the revisions relate to returns
that are less than three months old (more than 12 months old). They find that negative
revisions are more common, and larger when they do occur than positive ones. They
conclude that on average initially provided returns signal a better performance compared
to the final, revised performance. These signals can therefore mislead potential investors. Moreover, the dangerous revision patterns are significantly more likely revised for
funds-of-funds and hedge funds in the emerging-markets style than for other hedge funds.
Can any predictive content be gained from knowing that a fund has revised its history
of returns? Comparing the out-of-sample performance of revising and non-revising funds,
Patton et al. (2013) find that non-revising funds significantly outperform revising funds
by around 25 basis points a month.

2.6

Risk, Return, and Diversification

The first step toward investment theory is to gain insights into the interplay between
risk, return, and diversification without relying on a particular investment model. We:
show on an ad hoc basis when a portfolio is more than the sum of the parts - that
is, more return and less risk;
analyze the long-term performance of investments before and after costs;
consider risk scaling;

78

CHAPTER 2. FUNDAMENTALS
discuss two proposition from statistics concerning diversification;
introduce to diversity and concentration risk;
show how fees impact long-term returns;
introduce the debate between active and passive management.

We start with a fact:


Fact 2.6.1. For every rational investor, which is not risk neutral, risk and returns cannot
be optimally considered as unrelated topics.
Why is this true? If the investor is not risk neutral, then his preferences for wealth or
money are a non-linear function. In the optimality condition, marginal utility of wealth
enters which is also not a linear function of wealth. Approximation this expression using
a Taylor series, not not only the first moment (returns) matters for the investors but also
higher moments (risk or loss aversion for example). Therefore, an optimal investment
strategy will link returns and risks.
2.6.0.1

Long-term Risk and Return Distribution

Table 2.4 shows the risk and return distribution and the wealth growth for the long
period 1925-2013 for certain asset classes (Kunz [2014]).

Stocks USA
Stocks CHF
Stocks DEU
Stocks GBR
Stocks FRA
Stocks JPN
Stocks ITA
Bonds CHF
Bonds GBR
Bonds USA
Bonds FRA
Bonds ITA
Bonds JPN
Deposit CHF
Gold

Investment of CHF
100 after 88 years gives
71,239
70,085
44,669
34,619
18,939
5,367
2,552
3,611
1,880
1,196
212
195
57
1,070
1,052

Return
Average annual return
7.75%
7.73%
7.18%
6.87%
6.14%
4.63%
3.75%
4.16%
3.39%
2.86%
0.86%
0.76%
-0.64%
2.73%
2.71%

Risk
Standard deviation
23.50%
19.30%
41.30%
25.30%
29.20%
29.80%
28.30%
3.70%
12.70%
12.50%
15.00%
20.40%
21.20%
1.20%
15.80%

Table 2.4: Average annual returns and standard deviations of the asset classes and growth
of capital after 88 years. The calculation logic being 71, 239 = 100(1 + 0.075)88 .

2.6. RISK, RETURN, AND DIVERSIFICATION

79

The Figure 2.13 shows the distribution of return and risk, measured by the standard
deviation, over 88 years of investments.

Return and standard deviation, 19252013


9.00%
8.00%

Averrage annual returns

7.00%
6.00%

5.00%
4.00%
3.00%
2.00%
1.00%
0.00%
-1.00%
-2.00%
0.00%

5.00%

10.00%

15.00%

20.00% 25.00% 30.00%


Standard deviation

35.00%

40.00%

45.00%

Figure 2.13: The distribution of return and risk, measured by the standard deviation,
over 88 years of investments. The square marks represent equity, the diamonds bonds,
the triangle is cash, and the circle is gold (data from Kunz [2014]).
We observe that in the long run equity had in most economies higher returns and
risks than its bond counterparts. We discuss in Chapter 3 why nevertheless an advice
to invest in stocks only if the investor has a long-term horizon is not an optimal straegy.
Furthermore, a small difference in the average return creates a large difference in the
wealth accumulation. This shows the compounding effect. Finally, gold has in this long
period a large risk component but only a small average return.
This analysis allows us to consider diversification, risk scaling (how risk for a given
time horizon is transformed into a risk figure for a different time horizon) and cost and
performance issues.
2.6.0.2

Diversification of Assets - Portfolios

Can we combine different investment classes to form a portfolio with higher return and
lower risk than the individual asset classes? This is the diversification question. If
there is a positive answer, the next question will be whether there is an optimal way of
diversifying the investment: How are the portfolio weights best chosen?

80

CHAPTER 2. FUNDAMENTALS

The drawback of considering diversification is the significant increase in complexity


in portfolio management since dependence between the assets matters: How do events
impact both asset 1 and asset 2? If dependence is considered by using statistical correlation, several problems arise. First, the strength of the dependence varies over time both
within asset classes and across asset classes. Second, if one has to estimate dependence
as model input, it can be hard to obtain estimates that are robust and they can be
mis-specified. We apply diversification to the data in Table 2.4 using an ad hoc portfolio
construction approach: the weights are not optimally chosen using a statistical model
but are fixed based on heuristics (experience).
We form four portfolio strategies - so-called conservative, balanced, dynamic, and
growth - in Table 2.5.

Equity
CH
Rest of world total (six countries)*
Rest of the world per country
Bonds
CH
Rest of world total (six countries)*
Rest of the world per country

Conservative
25%
10%
15%
2.5%
75%
66%
9%
1.50%

Strategy
Balanced Dynamic
50%
75%
20%
30%
30%
45%
5%
7.50 %
50%
25%
44%
22%
6%
3%
1%
0.50%

Growth
100%
40%
60 %
10%
0%
0%
0%
0%

Table 2.5: Investment weights in four investment strategies (data from Kunz [2014]).
*Investment in DEU, FRA, USA, GBR, ITA, and JPN
Using data from Figure 2.13 for the different asset classes, we then get the returns in
Table 2.6.

Conservative
Balanced
Dynamic
Growth

Investment
100 CHF after 88 years gives
143,131
76,949
33,318
11,702

Return
Average annual return
8.61%
7.84%
6.82%
5.56%

Risk
Standard deviation
19.80%
15%
10.40%
6.30%

Table 2.6: Average annual return, risk, and wealth growth for the four investment strategies.
Figure 2.14 shows that a combination of risk and return figures of basic asset classes
can lead to a portfolio from which more return can be expected for the same risk or less

2.6. RISK, RETURN, AND DIVERSIFICATION

81

risk for the same return. The green marks for the investment strategies form a virtual
boundary line. In fact, the Markowitz model implies that there is a so-called efficient
frontier such that there can be no portfolio construction with more return and lower risk
than any portfolio on the efficient frontier.

Figure 2.14: Distribution of return and risk, measured by the standard deviation, over
88 years of investments. The square marks represent equity, the diamonds bonds, the
triangle is cash, and the circle is gold. The dots represent the four investment strategies
- conservative, balanced, dynamic, and growth (data from Kunz [2014]).
Two conceptual questions regarding diversification are:
What are the risks of not diversifying?
When does diversification make little sense?
Consider the first question. Often employees own many stocks of their employer directly
or indirectly in their pension scheme. Such stock concentration can be disastrous. Enron
employees for example had over 60 percent of their retirement assets in company stock.
They then faced heavy losses when Enron went bankrupt. Diversification reduces these
idiosyncratic risks. Concentration risk does not depend on the size of the firm: Of the
500 companies in the S&P 500 index in 1990, only half remained in the index in 2000
(J.P. Morgan [2012]).
Institutional investors also fail to diversify sufficiently. The University of Rochesters
endowment in 1971 was USD 580 million, placing it fourth in the respective ranking of
private universities. In 1992, it ranked twentieth and by 2011 had dropped to thirtieth

82

CHAPTER 2. FUNDAMENTALS

place. One of the main reasons for this underperformance was the excessive concentration
held in Eastman Kodak, which filed for bankruptcy in February 2012. Boston University
invested USD 107 million in a privately held local biotech company in the 1980s. The
firm went public and suffered a setback. In 1997, the universitys stake was worth only
USD 4 million.
The Norwegian sovereign wealth fund, in contrast, was created precisely to reap the
gains from diversification. The fund swapped the highly concentrated oil revenues into
a diversified financial portfolio.
If an investor is confident about a specific investment, then diversification is of little
value to that investor:
Diversification is protection against ignorance. It makes little sense if you know what
you are doing. Warren Buffet
In these cases diversification unnecessarily reduces the return potential of an investment.
Why does diversification exert an undeniable attraction to investors? The returns on
a stock depend on anticipated and unanticipated events. While anticipated events are
incorporated into market prices, most of the return ultimately realized will be the result
of unanticipated events. Investors do not know their timing, direction, or magnitude.
Hence, they hope to reduce the risk by diversifying their investment. Investors which
diversify therefore consider asset returns to be not predictable to a large extend.
But diversification does not only reduce the risk of portfolios; the hope is that it
also reduces the complexity of risk management. To understand this, we consider an
investment in many assets. Events can either affect the risks on a firm-specific level (idiosyncratic risk) or the whole portfolio of assets (systematic or market risk). If the many
idiosyncratic risks compensate for each other, they leave portfolio risk equal to market
risk. Then, only investment in the systematic risk component should be rewarded. and
the investor has only to consider a single market risk factor, which is much simpler to
manage than the many idiosyncratic risk sources of corporate firms assets.
Summarizing, while diversification increases complexity by introducing the need to
quantify dependence between assets, the consideration of a diversified portfolio also decreases complexity by reducing the many idiosyncratic risk source to a small number of
systematic risks.
2.6.0.3

Two Mathematical Facts About Diversification

The following two statistical facts describe how asset diversification can impact portfolio
risk characteristics.

2.6. RISK, RETURN, AND DIVERSIFICATION

83

Proposition 2.6.2. Assume that the N asset returns in a portfolio are not correlated
and that investment is equally weighted (EW) - that is, k = 1/N for all assets k are the
relative weights. Increasing the number of assets N reduces portfolio risk p2 arbitrarily
and monotonically.
The assumption of an EW investment is not necessary but facilitates the proof - the
statement holds for any portfolio. This shows that to eliminate portfolio risk completely
in an portfolio with uncorrelated returns, one only has to increase the number of assets
in the portfolio. The proof follows from the fact that the variance of the sum is equal to
the sum of the variances since there is no covariance:

N
N
X
X
1
1
Nc
p2 = var
Rj = 2 var
Rj 2
N
N
N
j=1

j=1

with c the largest variance of all N assets. If assets are correlated to each other, which is
the case in reality, the above result changes as follows.
Proposition 2.6.3. Consider an equally distributed portfolio strategy 1/N . The portfolio
variance is equal to the sum of market risk and idiosyncratic risk. The latter can be fully
diversified away by increasing the number N of assets. The market risk can only be
reduced to the level of the average portfolio covariance cov.
The proof is only slightly more complicated than the former proof, and leads to the
result:
var
1
p2 =
+ (1 )cov .
N
N
By increasing the number N of assets, the average portfolio variance var can be made
arbitrarily small - the portfolio variance is determined by the average covariance. But
the average portfolio covariance approaches a non-zero value. Hence, covariances prove
more important than single asset variances in determining the portfolio variance. Taking
the derivative of the portfolio variance w.r.t. the number of assets N , the sensitivity
becomes proportional to N12 . Adding to N = 4 a further asset reduces portfolio risk
1
1
by 25
, adding another asset to 9 assets the reduction is only 100
. Therefore, reducing
portfolio risk by adding new assets becomes less and less effective the larger the portfolio
is.
2.6.0.4

Time Varying Dependence

Covariance impacts portfolio risk p2 measured by the variance for two random variables
(two asset case) as follows:
p2 (R1 + R2 ) = 2 (R1 ) + 2 (R2 ) + 2cov(R1 , R2 ).
This shows that co-movements do matter for portfolio risk and that risk is not additive,
contrary to return. Since the covariance is not a bounded number, one often prefers

84

CHAPTER 2. FUNDAMENTALS

to work with the correlation , which is the covariance of two risks normalized by the
standard deviation of two risks. But assets are not only correlated, their correlation is
also not stable over time. Different causes of market turmoil lead to a different correlation
pattern between the asset classes (see Figure 2.15).

Figure 2.15: Pair-wise correlations over time for different asset classes (Goldman Sachs
[2011]).
The main question is: Can we predict the time variations of the different correlations?

Example Two Asset Case


For the fraction of wealth invested in asset 1 and the remainder in asset 2, we get
p2 = 12 2 + 22 (1 )2 + 22 1 (1 ).

(2.41)

This shows that portfolio risk becomes additive, however, if at least one asset is risk free
or if the assets are not correlated. A negative correlation value reduces portfolio risk and
the opposite holds for a positive correlation. This motivates the search for negatively
correlated risks.
If correlation takes the extreme value 1, portfolio risk becomes a complete square
and can be eliminated completely even if the two assets are risky by solving the equation
p2 = 0. Contrary, if correlation is maximal, +1, portfolio risk is maximal.

2.6. RISK, RETURN, AND DIVERSIFICATION


2.6.0.5

85

Reasonable Diversification and Needed Investment Amount for Diversification

If one wishes to invest in a diversified portfolio Elton and Gruber (1977) show that the
individual risk of stocks could be reduced from 49 percent to 20 percent by considering
20 stocks. Adding another 980 stocks only reduces risk further to 19.2 percent. This
show that diversification indeed can lower risk but that the effect of adding more and
more assets has a diminishing impact on risk.
How much wealth is needed to achieve a diversification in 20 securities? Given the
average price of stocks and bonds in Swiss francs - similar calculations apply to other
currencies - the amount invested in one security should be around CHF 10,000. Lower
investments are not efficient. Therefore, one needs CHF 200,000 for a pure equity portfolio of Swiss stocks. Diversifying this portfolio, say to US, European, and Asia-Pacific
stocks requires an investment of CHF 0.8 million. If the portfolio should be a mixture
of bonds and equities, say 50/50, then the amount needed for diversified single security
investments is CHF 1.6 million.
Hence, only wealthy individuals can invest directly in cash products such as stocks
and bonds to generate a sufficiently diversified portfolio. This is a rationale for the
existence of ETFs, mutual funds, or certificates, which offer a similar diversification level
to less wealthy clients as well.
2.6.0.6

Concentration and Diversity

The attentive reader has remarked that we have not defined the notion of diversification. The reason for this is that a single, precise, and widely accepted definition does
not exist. Among existing concepts are the diversification index of Tasche (2008), the
concentration indices of Herfindahl (1950) and Gini (1921), and the Shannon entropy
(Roncalli [2014]), which measures diversity; see Roncalli (2014) for a detailed discussion.
Tasches diversification index
The diversification index of Tasche (2008) is the ratio between the risk measurement of
a portfolio and the weighted risk measurement of the assets. If one specifies the risk
measure to be the volatility, the diversification index reads
0
C
Diversification Index =
,
(2.42)
0
where is the vector of volatilities. The numerator is equal to the portfolio risk term in
the Markowitz model (2.55). The index takes values not larger than 1. It is equal to 1 if
all assets are perfectly correlated.
Herfindahls concentration index
Consider the relative weight vector of a long-only portfolio - that is, the positive weights

86

CHAPTER 2. FUNDAMENTALS

add up to one. Therefore, the weights are probabilities. Maximum concentration occurs
if one weight has the value one and all other weights are zero. Risk concentration is
minimal if the portfolio weights are equally weighted. The Herfindahl index is defined by
Herfindahl Index =

N
X

2k .

(2.43)

k=1

It then takes the value 1 in the case of maximum concentration and 1/N in the equal
weight portfolio case.
Shannon entropy diversity measurement
The Shannon entropy S for a relative weight long-only portfolio vector is defined by
S() =

N
X

k ln k .

(2.44)

k=1

To understand the motivation of the entropy measurement, consider two dies - one symmetric and the other distorted. The outcome for the symmetric die is more uncertain
than that of the other die. Shannon formalized this notion of uncertainty in the 1940s
in the context of information theory. He proved that there exists only the function S()
above, which satisfies eight axioms describing uncertainty. One axiom is, for example,
that the function S has to assume the maximum value if all probabilities k are the same
- this is the case of maximum uncertainty.
Reconsider tossing an arbitrary coin. The entropy of the unknown result of the next
toss is maximized if the coin is fair. This reflects that for a fair coin the most uncertain
situation to predict the outcome of the next toss follows: The more unfair a coin, the
less uncertainty is.
Example Entropy
To get a feeling regarding entropy, consider first the natural sciences, more precisely
the law of thermodynamics. The following observation would be possible if one considers
only the energy of physical systems:
The air in your office could contract into one small area of the room spontaneously.
A dissolved sugar cube in your coffee might spontaneously pull back together in a
part of the coffee.
A dropped stone might spontaneously transform its own thermic energy into kinetic energy and climb again; such a spontaneous cooling of the stone followed by
climbing would not violate the conservation of energy law.

2.6. RISK, RETURN, AND DIVERSIFICATION

87

Entropy makes these events impossible since each of them would mean a reduction in
disorder, and nature minimizes energy and maximizes entropy (a measure of disorder).
In finance one often needs to measure how close different probability laws are to each
other - say, for example, the prior distribution and the posterior distribution in the BlackLitterman model. But the space of probability laws is just a set and it is not trivial to
find a reasonable measuring stick - consider the three following normal distributions:
Distribution 1 has mean 0.1 and variance 0.2.
Distribution 2 has mean 0.05 and variance 0.3
Distribution 3 has mean 0.2 and variance 0.1.
How close are these distributions? The relative entropy S(p, q), also called the KullbackLeibler divergence, for two discrete distributions p and q, which is defined by
X
pk
S(p, q) =
pk ln ( ),
(2.45)
qk
k

measures the similarity of two probability distributions. This is not a metric since it
is not symmetric - that is to say, interchanging the roles of p and q implies a different
entropy value. The relative entropy has the following properties:
S is never negative.
The divergence can be used as a measure of the information gained in moving from
a prior distribution to a posterior distribution.
If p and q are the same, then S is zero.

Roncalli (2014) illustrates the different notions of diversification. There are 6 assets
with volatilities of 25%, 22%, 14%, 30%, 40%, and 30%, respectively, with asset 3 having
the lowest volatility. The correlation coefficient is equal to 60% between all assets, except
between the fifth and sixth where it is 20 percent - that is, the correlation matrix reads

100%
60% 100%
60% 60% 100%
60% 60% 60% 100%
60% 60% 60% 60% 100%
60% 60% 60% 60% 20% 100%

Since the correlations are symmetric by definition, one only needs to display half of
the outer-diagonal elements. Therefore, if one considers dependence using the second statistical moments - covariance and correlation - there is no direction between the causes

88

CHAPTER 2. FUNDAMENTALS

and effects of the dependence. This simplifies the analysis for investment essentially.
But in other aspects where financial risk matters, such as the Great Financial Crisis,
the causes and effects of risk dependencies are essential. Phenomena such as financial
contagion cannot considered appropriately by using correlations.
The following are calculated: the global minimum variance (GMV), the equal risk contribution (ERC), the most diversified (MDP), and the equal weights (EW) portfolios.
The GMV portfolio is the Markowitz optimal solution in (2.55) with minimal risk.
The EW portfolio assumes the same dollar weight of 61 percent for each asset. The MDP
portfolio minimizes the diversification index of Tasche. ERC is the portfolio in which
the risk contribution of all six assets is set equal to 16.67 percent - the same risk weight.
The risk contribution of asset j to the portfolio risk is by definition the sensitivity of
portfolio risk w.r.t. to j times the weight j . The sensitivity term is referred to as the
marginal risk contribution M RC. The so-called Euler Allocation Principle states when
the sum of the risk contributions for all assets equals the portfolio risk.
Proposition 2.6.4. Let f be a continuously differentiable function on a open subset of
Rn . If f is positive homogeneous of degree 1, this means tf (u) = f (tu) for t > 0, then
f (u) =

n
X

uk

k=1

f (u)
, u Rn .
uk

(2.46)

Volatility and VaR risk measures are homogeneous of degree 1. Applying the Euler
Theorem to risk measures means:
X R() X
R() =
j
=
RCj () .
(2.47)
j
j

For the volatility risk measure this means:


R() = p () =

X
j

(C)j
R() X
=
j 0
j
C

(2.48)

where (C)j denotes the j-th component of the vector C.


The MDP portfolio minimizes the diversification index of Tasche. The weights are
determined by maximizing the ratio of portfolio of volatility to volatility of portfolio. It
is sensitive to the covariance matrix, leads to high concentrated positions and risks and
therefore, often constraints are inserted. It is an optimal strategy when all assets have
the same Sharpe ratio, where:
Definition 2.6.5. The Sharpe ratio is defined as the excess return of the risky investment
over risk free divided by the volatility of the investment.
Roncalli provides us with the results in Table 2.14 where j , RCj are expressed in
percentage values.

2.6. RISK, RETURN, AND DIVERSIFICATION


Asset
1
2
3
4
5
6
Portfolio
Tasche index
Gini index
Herfindahl index

GMV
j
RCj
0
0
3.61
3.61
96.39 96.39
0
0
0
0
0
0
13.99
0.98
0.82
0.82
0.92
0.92

ERC
j
RCj
15.7 16.67
17.84 16.67
38.03 16.67
13.08 16.67
10.86 16.67
14.49 16.67
19.53
0.8
0.17
0
0.02
0

89
MDP
j RCj
0
0
0
0
0
0
0
0
42.86
50
57.14
50
26.56
0.77
0.69 0.67
0.41
0.4

EW
j
RCj
16.67 16.18
16.67 14.08
16.67
8.68
16.67 19.78
16.67 24.43
16.67 16.86
21.39
0.8
0
0.16
0
0.02

Table 2.7: Comparison of the global minimum variance (GMV), equal risk contribution (ERC), most diversified (MDP), and equal weights (EW) portfolios. All values are
percentages (Roncalli [2014]).
Since correlation is uniform, but for one asset, it does not matter in the GMV allocation. Therefore, the GMV optimal portfolio picks asset 3 with the lowest volatility.
The difference in the correlation between asset five and six does not has a measurable
impact on the portfolio selection. The GMV portfolio is heavily concentrated, which is
not acceptable to many investors. Portfolio risk measured by GMV is the smallest, which
comes as no surprise.
The MDP, on the other hand, focuses on assets 5 and 6, which are the only ones that
do not possess the same correlation structure as the others. Contrary to GMV, MDP
is attracted by local differences in the correlation structure. The diversification index is
lowest for the MDP. If we consider this index as the right diversification measurement,
the MDP portfolio should be chosen. If we consider the concentration measures of Gini
and Herfindahl, the EW should be considered if the investor wishes to have the broadest
weight diversity and the ERC if risk concentration is the appropriate diversification risk
measurement for the investor.
Table 2.8 shows that a seemingly well-diversified portfolio in terms of capital is in
fact heavily equity-risk concentrated.
This fact is often encountered in practice: Equity turns out to be the main risk
factor in many portfolios. But then capital diversification is a poor concept from a risk
perspective.
Example
The asset allocation of Europeans asset managers was in 2013 (EFAMA (2015)):

90

CHAPTER 2. FUNDAMENTALS
Asset class diversification
Cash
2%
Real estate
Domestic equities 14%
Hedge funds
IEQ
8%
Private equity
EM equities
4%
Venture capital
Domestic govt bonds
9% Natural resources
Distressed debt
ICB 10%

17%
10%
5%
9%
8%
4%

Risk allocation
Cash
2%
Equity 79%
Commodity
8%
CCR 10%
Other
4%

Table 2.8: Asset class diversification and risk allocation. The first two columns contain
the diversification using the asset class view. The third column shows the result using
risk allocation. While the investment seems to be well diversified using the asset classes
the risk allocation view shows that almost 80% of the risk is due to equity. IEQ means
international equities, ICB means international corporate bonds, CCR corporate credit
risk.

43% bonds;
33% equity;
8% cash and money market instruments;
16% other assets (property, private equity, structured products, hedge funds, other
alternatives).
The allocation has been fairly stable in the past except in the GFC where equities lost
massive value. This average allocation significantly differ from an individual country
perspective. UK for example has investment in the equity class between 46% and 52%
in the past while in France the same class is around 20%. This allocation difference
is due to differences in preferences of home-domiciled clients and the large differences
in cross-border delegation of asset management. The ratio of AuM/GDP in UK is for
example 302% which shows the importance of UK as the leading asset management
center of Europe with a strong client basis outside of the UK. Comparing the asset
allocation for investment funds and discretionary mandates the following differences can
be observed. The bond allocation is 28% in investment funds and 58% in the mandates
and equities have a share of 39% in the funds and 26% in the mandates.
Summarizing, either self-deciders (advisory channel) are less risk averse than those
who delegate the investment decisions or the whole process of preference elicitation is
flawed in the financial industry.

2.6. RISK, RETURN, AND DIVERSIFICATION


2.6.0.7

91

Anomalies

Analyzing empirically the risk and return properties of assets either in the cross-section
or as a time series one encounters patterns that are not predicted by a central paradigm
or theory. Such patterns are called anomalies. Examples are:
Value effect. Low price-to-book (P/B) stocks - called value stocks - typically outperform high P/B stocks (growth stocks).
Size effect. Smaller stocks typically outperform larger stocks.
Momentum effect. Stocks with high returns over the past 12 months typically continue to outperform stocks with low past returns, see Figure 2.16 for an illustration.
Accruals and issuances effect: Stocks with high past accruals and/or recent stock
offerings typically underperform stocks with low past accruals and no stock offerings.

Jan

Feb
Screen

Mar

Apr
Wait

May

Jun

Jul

Aug

Sep

Oct

Nov

J=3 K=3

Buy / Sell
Formation
Period
Skip 1
Month
Holding
Period

Figure 2.16: We assume that stocks are screened based on their past return over the
last J = 3 months, where also J = 6, 12 month are used. This screening identifies the
past winners and losers and defines the formation period. After this identification, no
action is taken for one month. The reason is to filter out any possible erratic price
fluctuations in the past winners and losers selection portfolio. Finally, in the holding
period the selected stocks are hold for K = 3 months where again longer holding periods
are possible. Afterwards the positions are closed. This procedure is repeated monthly
which leads to an overlapping roll-over portfolio allocation.

92

CHAPTER 2. FUNDAMENTALS

These empirical observations are the starting point for so-called factor investing where
one constructs strategies based on the anomalies which should deliver a better risk/return
reward than investment models such as the CAPM which are not incorporating anomalies.
The key question for investors is: How sustainable are the investments based on the
anomalies?
2.6.0.8

Diversification of Minds - Wikifolios

Wikifolio is a type of investment referred to as social trading. Contrary to the approach


in which the CIO determines tactical asset allocation, a wikifolio investment is based on
the interaction between many investors with many portfolio managers.
Any person can act as a portfolio manager or trader on the wikifolio platform. A
portfolio manager can use a rule-based approach or decide on a discretionary basis. There
are 2016 more than 8,500 published strategies on the www.wikifolio.com platform. An investor can choose to invest in one or several of the 3,100 investable strategies out of these
8,500. This is achieved by buying structured notes at the stock exchange in Stuttgart.
The platform started in June 2012 and by July 2015 the invested capital amounted to
EUR 400 million. Wikifolio certificates have the largest market shares at the Stuttgart
exchange and two or three products are ranked among the 10 most traded products each
month.
An investor therefore can choose between a myriad of investment ideas, which is
the polar opposite of a CIO-approach. To help investors find investments, the platform
publishes different ranking tables and provides investors with a range of other information
about the risks, performance, liquidity, and style of the different strategies. Needless to
say, without recent technological developments wikifolio-style investment would not be
possible.

2.6.1

Risk Scaling

Is it possible to calculate, using the calculated risk in a given investment period, the risk
for a different period without needing further data, running simulations or developing a
new risk model?
Such a rule would be very helpful. Suppose that the risk figures are given on a oneyear time horizon and that one needs to have the risk on a five-year basis.
The existence of such a rule depends on the nature of the returns. If one assumes
that returns are independent and identically normally distributed (IID) with zero mean,
then the square-root of time rule can be used to scale volatilities and risk. Consider an
investment where risk is measured by the standard deviation with two different holding
periods t < T. The volatility for the T -period follows from the t-period volatility by the

2.6. RISK, RETURN, AND DIVERSIFICATION

93

square-root scaling law


(T ) = (t)

p
T /t .

(2.49)

Since the returns are IID, the variance of a sum of n returns is equal to n times the
variance of a single return:
2 (R1 + . . . + Rn ) = 2 (R1 ) + . . . + 2 (Rn ) = n 2 (R)
where we assumed first no autocorrelation and then that the variances are the same. This
justifies the rule. For an asset with a one-day
p volatility of 2%, the monthly volatility assuming 20 trading days - is equal to 2%x 20/1 = 8.9%. The square-root rule provides
a simple solution to a complex risk scaling problem. The method fails in any of the
following situations:
Modelling volatility at a short horizon and then scaling to longer horizons can
be inappropriate since temporal aggregation should reduce volatility fluctuations,
whereas scaling amplifies them.
Returns in short-term financial models are often not predictable but they can be
predictable in longer-term models. Applying the scaling law one connects the
volatility in two time domains that are structural different.
The scaling rule does not apply if jumps occur in the returns.
If returns are serially correlated, the square-root rule needs to be corrected (see
Rab and Warnung [2011] and Diebold et al. [1997]).
Example - Distribution of annual returns versus distribution of final wealth
We consider the return and risk data shown in Table 2.4. An increasing investment
horizon reduces the volatility of the average annualized returns due to the square-root
rule (2.49).
If annual volatility is 20%, the annualized volatility after 10 years is 6% =

20%/ 10, 3% after 50 years, and 2% after 100 years. This decreasing volatility implies
that the returns are more and more concentrated around the constant average return. If
we assume an average return of 6.93%(Kunz [2014]), after ...
... 1 year, 95% of the returns lie between -32% and 46%.
... 10 years, 95% of the returns lie between -5.5% and 19.33%.
... 50 years, 95% of the returns lie between 2.4% and 11.4%.
If we consider the cumulated total return - the final wealth
distribution - the

situation changes. The 20% after 1 year becomes 200% = 20 100% after 100 years.
Therefore, although an investment of 100 today takes the expected value of 102, 249
after 100 years, assuming continuous compounding, the distribution of the final wealth

94

CHAPTER 2. FUNDAMENTALS

return is in 95 percent of all cases scattered between approximately 2, 000 and 6.4
million. Hence, the volatility of the final wealth return increases with an increasing time
horizon.
Summarizing, if cumulated total wealth return volatility increases over time - future
wealth scatters, but average annualized wealth return volatility decreases over time returns become concentrated.

2.6.2

Long Term Investment and Retirement Risk

The wealth growth in Figure 2.14 indicates that for very long time horizons equity investments outperform bond investments: If time horizons increase, do equity investments
dominate bond investments.
One needs to be careful with such statements. Consider private investors. They face
a life cycle where after a given date they stop accumulating wealth via labor income. If
the wealth of retired individual suffers a heavy loss near to retirement date, there will be
no resources from income to restock the fortune.
This can have a disastrous effect on private clients wealth. Davis (1995) reports that
Britons who retired in 1974 and had contribution-based pension plans without a minimum
guarantee received an income for the remainder of their lives that was worth only half
that received by individuals who retired before the 1973 shock, say in 1972.
Vignola and Vanini (2008) analyze this retirement risk in an overlapping generation
context. They assume that individuals of each generation start saving 20 years before
they retire. The first generation starts to save in 1927 and the final one in 1983. They
compare a risk-free investment with an annual risk-free rate of 4 percent and a risky
investment in a basket of all stocks on the NYSE, Nasdaq, and AMEX. Calculating
the average growth rate of wealth for each generation up to the time of retirement, two
observations follow. First, due to booming stock markets in the 90s, individuals who
started investing in the stock market in the 70s outperformed by a wide margin those
individuals who invested in risk-free assets in the same period. Contrary, individuals who
retired in the 70s (oil shock) and had invested in stocks in the 50s underperformed the
risk-free investment.
This shows that for employees the retirement date is of particular importance. They
face considerable timing risk. This risk cannot be diversified away at a given point in
time since the markets do not offer assets for transferring these long-term risks (markets
are incomplete). But intermediaries who, themselves, do not face such long-term risk
could smooth this risk between different generations of employees. Pension funds with
defined benefit plans are an example of such an intermediary; see Allen and Gale (1997)

2.6. RISK, RETURN, AND DIVERSIFICATION

95

for details.

2.6.3

Costs and Performance

The risk, return, and performance analysis of the different asset classes has not yet
considered market frictions at all. There are no fees, no taxes, and no bid-ask spreads.
What is the impact of such costs on the performance outlined in Figure 2.14? We take
Swiss stocks with a gross average return of 7.73 percent and assume (Kunz [2014]):
A total of 25 percent of the return arises from dividends, which face a taxation rate
of 30 percent,
The long-term inflation rate is 2 percent,
Investments can be via an investment fund (mutual fund, SICAV) with annual
costs of 1.5 percent, or an index fund with annual costs of 0.5 percent.
The returns using these figures are given in Table 2.10 (Kunz [2014]).

Market index
Investment fund
Index fund

... Fees
7.73%
6.23%
7.23%

Return after ...


... Fees and taxes ... Fees, taxes, and inflation
7.15%
5.15%
5.65%
3.65%
6.65%
4.65%

Table 2.9: Returns after Fees (Kunz [2014]).


Given these net returns, an investment of CHF 100 takes, after 25 years, the values
shown in Table 2.10 (Kunz [2014]).

Market index
Investment fund
Index fund

... Fees
643
453
573

Value of CHF 100 after 25 years ...


... Fees and taxes ... Fees, taxes, and inflation
562
351
395
245
500
312

Table 2.10: Net growth of wealth (Kunz [2014]).


Compared to the market index, the wealth level using an investment fund is 41 percent lower after 25 years and is 12 percent lower if using an index fund.
Fact 2.6.6. Using a cost and tax efficient wrapper for an investment amounts to an
annual return gain of 1.45% compared to an investment fund.

96

CHAPTER 2. FUNDAMENTALS

Given the zero-sum game of active investment, see the next Section, and that only
0.6% of 2,076 actively managed US open-end, domestic equity mutual funds, see Section
4.3.4.3, and the possibility wrap many investment ideas in cheap index funds or ETFs,
it becomes clear why practitioners and academics suggest that the control of frictions
(tax, inflation, fees) is more important for investors than to focus on active portfolio
management.

2.6.4

A First Step toward Passive versus Active Investment

Let m , p , a be the expected returns of the fully diversified market portfolio, a passive
portfolio, and an active investment, respectively. We assume that the fraction of investors is passively invested and 1 is invested in active vehicles. Active management
is defined by the pursuit of transactions with the objective of profiting from competitive
information. Usually, active management is performed against a benchmark. By definition, a passive investor is one who is not active. Passive management means following an
index, benchmark or another portfolio using quantitative techniques. Since any investor
is either an active or passive one and since the market return follows from the aggregate
return of the active and passive investors, we have:
m = p + (1 )a .

(2.50)

But the return of the passive investment equals that of the market. Equation (2.50) then
implies that the active return must also be equal to the market return and hence to the
passive investments independent of the fraction . Therefore, without any probabilistic
or behavioral assumptions, it follows that before costs the three investments pay back
the same return:
Proposition 2.6.7 (Sharpe). Before costs, the return on the average actively managed
dollar will equal the return on the average passively managed dollar.
Because active and passive returns are equal before cost, and because active managers
bear greater costs, the after-cost - return from active management must be lower than
that from passive management.
Proposition 2.6.8 (Sharpe). After costs, the return on the average actively managed
dollar will be less than the return on the average passively managed dollar.
These statements are strong and they need to be considered with care. The derivation is trivial because the assumptions that lead to (2.50) trivialize the problem. The
first assumption is that a market - the value-weighted portfolio of all traded assets in the
economy - must be chosen and by definition a passive investor always holds the market
portfolio. Suppose that all investors were passive: how is it possible that all then hold
the market portfolio, in other words who is on the other side of the trades? Second,
the result concerns average active managers and not single investors. The dimension
that active managers can be more or less skillful is not considered at all. Nor does the

2.7. FOUNDATIONS OF INVESTMENT DECISIONS

97

analysis differentiate between skill and luck. We will address these questions in Chapter 4.
From an information processing point of view, active management is forecasting.
There are different types of forecast quality. The naive forecast is the consensus expected
return. This is the informationless forecast and if it can be implemented efficiently, the
expected returns of the market or the benchmark follow. There are so-called raw and
refined forecasts (Grinold and Kahn [2000]). Raw forecasts are based corporate earnings
estimates or buy and sell recommendations. Refined forecasts are conditional expected
return forecasts based on the raw forecast information. We prove in the exercises the
following forecast formula for the excess return vector R and the raw forecast vector g
where the two vectors have a joint normal distribution:
E(R|g) = E(R) +

cov(R, g)
(g E(g)) .
var(g)

(2.51)

This equation relates forecasts that differ from their expected levels to forecasts of
returns that differ from their expected levels. The refined forecast is then defined as the
difference between E(R|g) and the naive forecast E(R). The forecast formula has the
same structure as the CAPM or any other single factor model. This is not a surprise but
follows from a linear regression analysis.
Example - From where does superior performance come?
The decisions in active management which promise superior performance compared
to a passive strategy include different approaches. Market timing means to alter the risk
exposure of the portfolio through time by combining market fluctuations together with
a macro analysis for example. Sectoral emphasis means to weighting a portfolio towards
or away from company attributes (called tilting) such as size, liquidity, leverage, yield or
boot-to-price ratio. Stock selection bets are based on idiosyncratic information. Finally,
large investors can achieve incremental rewards by accommodating hurried buyers and
sellers.

2.7

Foundations of Investment Decisions

The risk, return and diversification properties of assets of last sections were not the result
of any decisions by investors. We consider in this section investment which are based
on first economic principles how individuals make their investment decisions. There are
many different ways to make an investment decision. Two approaches based on rational
decision-making in a probabilistic setup (statistical models) are:
Optimal investment where people consume and invest; the asset-liability management approach using investment language.
Optimal investment where people only invest; the asset only approach.

98

CHAPTER 2. FUNDAMENTALS

In rational theories, in particular in expected utility theories, the investor uses the expected utility criterion as a rule of choice: The higher the expected value is for an
investment, the more is such an investment preferred. Like any mathematical model, expected utility theory is an abstraction and simplification of reality. There exists a large
academic literature which reports about systematic violations of empirical behavior of
investors compared to the expected utility theory predictions. A prominent alternative
theory is prospect theory by Kahneman and Tversky (1979). But most investment theories used in practice are still based on expected utility theory.
The theory assumes that investors form correctly beliefs and that they choose optimal actions or decisions. The beliefs define the probabilistic setup about the dynamics
of future returns. One action are is the optimal choice of the portfolio weights over time.
Both, the beliefs and actions can flawed. The optimal decision is based on the investors
preferences which are represented by his or her utility function. The optimization requires to maximize expected utility subject to constraints such as the budget constraint.
This representation of the decision problem in term of a mathematical optimization is a
main advantage of expected utility theory - optimization theory is a well-developed field
in mathematics and the approach is very general.
Investors often face situations in which non-calculable risk - uncertainty - is the
key component in their models. In this case, optimal investment theory is replaced by
heuristic reasoning, see Section 2.7.2.

2.7.1

Statistical Models

The AM industry often uses models where investors choose portfolios such that their expected utility of money over a given time period is maximized. The model of Markowitz,
the CAPM, arbitrage pricing theory (APT), Fama-French, general factor models and
Black-Litterman are examples.
This is a reduced form of the more general economic setup in which investors derive
utility from consumption and choose both optimal consumption and investment to optimize utility. But consumption is difficult to handle in investment; therefore investment
models often neglect consumption. This issue of neglecting consumption is delicate since
it reduces the general economic decision problem to an asset only case situation.
Preferences are described by a utility function u of consumption c or wealth W . Utility
increases u0 > 0 with increasing consumption (positive marginal utility) but marginal
utility decreases, u00 < 0. We always assume that the utility functions a continuously
differentiable. If we assume u(W ), all other things equal, this mathematical conditions
imply that investors:
Prefer more money to less;
Prefer to avoid risk;

2.7. FOUNDATIONS OF INVESTMENT DECISIONS

99

Prefer money now to later if we assume that utility today is equal worth more than
the same utility level at a future date.
Maximizing expected utility under constraints has the following structure. There are
decision variables , such as consumption and portfolio weights. The goal is to find
the optimal decision variables such that the highest value of expected utility follows.
The other type of variables are the state variables , such as wealth. The future value
of the state variable wealth is partly driven by the decision variables - the more an
individual consumes the lower is future wealth. Expected utility is optimized under
different constraints. The most well-known being the budget constraint, which relates
the growth of wealth W in a period to the success of the chosen investment strategy
in the different securities, the amount of consumption c in the period, and possible labor
income in the period. Formally,
max EP [u()]

(2.52)

A()

with A() the admissible set. This set contains dynamics of assets, budget restrictions,
investment restriction, etc. Investors can deviate in many respects from the solution
of (2.52). First, they can use a different belief (probability) Q instead of the historical
one P about the future value of the variables. Second, they can fail to know precisely
their preference function u but instead work with an approximation. Third, they can not
take into consideration or not know the full admissible set. Fourth, they are not able or
do not intend to search for an optimal solution in the mathematical sense, that is, the
maximization is replaced by a heuristic argument. Fifth, the optimal actions cannot be
implemented since there are not enough liquid assets.
If one considers an explicit optimization problem and its solution, then it is immediate
that most individuals are not able to solve such a problem. But this does not necessary
mean that they do not behave in a way which is consistent to the optimal solution economists use to say that individuals behave as if they were solving the optimization
problem. In other words, in periods were it was not possible to detect how people do
their decisions, the observation outcome were compared to the theoretical predictions
without considering how the individuals made their decisions. With the possibilities of
neuro-economic science the concept of as if is replaced step-by-step by the true decision
making.
Investors often face a long-term investment horizon and they are allowed to change
their portfolio decision. This defines a dynamic expected utility problem.5 This means,
that the asset-only investor searches a portfolio t at different dates such that the expected present value of the investment is maximized. To solve such an investment problem optimally, one has to determine the last investment decision before the investment
5

RT
Formally, maxA() EP [u()] is replaced by a problem maxs A(s ) EP [ es u(, s)ds] with T any
0

future date and the time discounting factor rate.

100

CHAPTER 2. FUNDAMENTALS

horizon, then the second to last and so on. This principle of backward induction is based
on the optimality principle of Bellman (1954). To solve such a problem requires extreme
strong analytical capacities. It is of no surprise that there is a huge academic literature
which reports about failures of humans to apply backward induction correctly. Repeating
say 10 times an optimal one-period model decision (forward solution concept) is not the
same than making optimal investment decisions backwards, except for some particular
situations. If we do not consider model risk, the utility an investor derives from backward
induction dominates the utility derived from the repeated static forward approach.

Example Backward versus forward induction


Consider the case where you have to drive from New York to Boston for the first
time. Using a repeated static model (forward induction) you decide at each crossroad
given the traffic situation at the crossroad which direction to follow next. Using this
strategy you will hardly ever arrive in Boston.
Solving a dynamic model optimally, you start with the end in mind: You work
backwards starting in Boston where you consider all possible paths between New York
and Boston. At each crossroad in this backward approach, you calculate whether it is
best to say turn left or right. This singles out between the myriad of paths between New
York and Boston the truly optimal one. It is only by chance, that this path is equal to
the repeated static approach.
Investments which consider multi-period often differ from repeated one-period investments. The static models fail for example to take changing investment opportunities
into consideration. But changing investment opportunities are a key aspect for long
term investors such as pension funds. Despite the meaningfulness of multi-period models, most investment models used are of the static or one-period type. Complexity to
understand and communicate dynamic investment strategies, lack of data or ambiguity
to take changing investment opportunities into account (model risk) are reasons for the
dominance of static models.
Example Utility of wealth and asset pricing equation
Consider a one-period decision problem where the investor derives utility u(W1 ) from
final wealth W1 . The investor chooses a portfolio Rn for n assets to maximize
E(u(W1 ))
Punder two constraints. First, the price of the portfolio at time 0 equals initial
wealth: j j Sj (0) = W0 with Sj (0) the price ofP
asset j at time 0. Second, final wealth
is equal to the portfolio value at time 1: W1 = j j Sj (1). To solve the problem one
introduces the Lagrange function and under some technical conditions, the first order
conditions (FOC) are necessary and sufficient for the existence of an optimal portfolio.

2.7. FOUNDATIONS OF INVESTMENT DECISIONS

101

We assume that this is the case. The FOC means taking the derivative of the Lagrangian
and equating the derivative to zero. This implies our first asset pricing equation:
E(u0 (W1 )(Ri Rj )) = 0 .

(2.53)

The FOC condition has to hold for returns of arbitrary asset pairs i, j. This equation has
several implications. First, Ri Rj means investing one unit in asset i (long) and being
short one unit in asset j. This zero-cost portfolio is called excess return. To understand
(2.53), assume that it is not equal to zero but equal to a positive value. Then adding an
additional amount of the zero-cost portfolio would yield a portfolio even better than the
optimal portfolio which is impossible for an optimal portfolio. A similar remark applies
if the value is equal to a negative number. Second, one can choose for asset j the risk
free asset. Third, geometrically the condition states that the excess return vector and
marginal utility are orthogonal to each other, that is we introduce the inner product of
the square integrable random variable the asset pricing equation reads
hu0 (W1 ), Ri Rj i = 0 .

(2.54)

We recall that if X equals the space of square-integrable random variables over a probability space (, F, P ), then
Z
hf, gi :=
f (x)g(x)dP (x) = E[f g] , f, g X,

defines an inner product. Fourth, assume that the investor is risk averse which means that
marginal utility is not constant. Then it is never optimal for an investor to fully invest
in the risk free asset. To understand this, assume that the investor puts all his initial
wealth in the risk free asset. But then final wealth W1 will be non-random and hence
also u0 (W1 ) is a deterministic function which can be taken outside the expected value in
(2.53). But then unless all risky return are the same, the FOC cannot be satisfied.

2.7.1.1

Risk Preferences

Choosing the utility function defines the risk preferences. Consider an investor who is given the choice between two scenarios - a guaranteed payoff and a bet with the same expected value as the guaranteed payoff. A risk-neutral investor is indifferent between the bet and the guaranteed payoff. She is risk-averse if she prefers the guaranteed payoff; otherwise she is risk-seeking. For a risk-averse investor, adding USD 1 to a wealth of USD 100 increases utility more than adding the same dollar to USD 1,000.

Assume that the payoff is either 50 or 100 with the same probability and that the guaranteed payoff is 75. Figure 2.17 shows the payoffs and utilities for the risk-averse and the risk-neutral investor. For the latter, the three utilities are on a straight line.


Therefore, the probability-weighted utilities in the bet - the expected value - give the same utility value as the guaranteed payoff. For the risk-averse investor, the expected value of the bet also lies on a straight line, but its utility value (yellow dot) is strictly lower than the utility of the guaranteed payoff (red dot). Therefore, a risk-averse investor needs extra compensation for the difference between the red and the yellow dot in order to become indifferent between the bet and the guaranteed payoff.
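A minimal computation of this example (the concave utility $u(w) = \sqrt{w}$ is my own assumption; the 50/100 bet and the guaranteed 75 are from the text) expresses the gap between the red and yellow dots as a certainty equivalent and a risk premium:

```python
import numpy as np

payoffs = np.array([50.0, 100.0])          # the bet
probs = np.array([0.5, 0.5])
guaranteed = 75.0                          # same expected value as the bet

u = np.sqrt                                # assumed concave (risk-averse) utility
eu_bet = probs @ u(payoffs)                # expected utility of the bet (yellow dot)
u_sure = u(guaranteed)                     # utility of the guaranteed payoff (red dot)

ce = eu_bet ** 2                           # certainty equivalent: u^{-1}(E[u])
print(eu_bet, u_sure)                      # eu_bet < u_sure for the concave u
print(guaranteed - ce)                     # risk premium ~ 2.14: the compensation
                                           # that makes the investor indifferent
```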

Figure 2.17: Risk-neutral and risk-averse investors.

2.7.1.2 Investment Restrictions

Investment restrictions are a source of complexity. They often destroy, for example, the analytical tractability of the models. Some restrictions are:

- Preference restrictions - limiting the fraction of capital invested in equities.
- Legal restrictions - prohibiting access to some markets.
- Taxation - different taxation for the same investment idea wrapped in different securities such as mutual funds or structured products.
- Budget restrictions.
- Liquidity restrictions - large investors do not want to move asset prices when they trade.
- Transaction fee restrictions.


Practitioners like to impose constraints if the output of an investment optimization is not in line with what they consider a reasonable strategy. If, say, the output for a diversified portfolio is 'invest 80 percent in SMI', this figure can be considered too large. A constraint then bounds the possible investment in SMI between, say, 20 percent and 40 percent. But such interventions have an economic price, see the example below. Furthermore, adding many ad hoc constraints makes it difficult to explain to clients whether a portfolio is optimal due to the utility function (the risk and return preferences of the investor) or due to the constraints.
Example - Unrestricted and restricted optimization

The optimal value of an unrestricted optimization problem is never worse than the value of a restricted problem. Hence, each restriction has a price. Consider the minimization of the parabola $u(x, y) = x^2 + y^2$. The minimum is achieved at the vector $(0, 0)$ and the optimal value is $u(0, 0) = 0$. We now add the restriction $x + y = r > 0$. This means that $(x, y)$ has to be an element of a line. Optimization using the Lagrange function implies the optimal values $x = y = \frac{r}{2}$ and $u(\frac{r}{2}, \frac{r}{2}) = \frac{r^2}{2}$, which is larger than the optimal unrestricted value. The Lagrange multiplier associated with the constraint $x + y = r$ has the value $\lambda = r$ - this is the shadow price for adding the constraint.
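A numerical sketch of this example (using scipy, my own tooling choice) recovers the shadow price as the sensitivity of the restricted optimum to the constraint level $r$:

```python
import numpy as np
from scipy.optimize import minimize

def restricted_min(r):
    # Minimize x^2 + y^2 subject to x + y = r.
    cons = {"type": "eq", "fun": lambda v: v[0] + v[1] - r}
    res = minimize(lambda v: v[0] ** 2 + v[1] ** 2, x0=[1.0, 0.0], constraints=cons)
    return res.x, res.fun

r = 2.0
(x, y), val = restricted_min(r)
print(x, y, val)                    # ~ (1, 1) and optimal value r^2/2 = 2

# Shadow price: the marginal cost of the constraint, d(optimal value)/dr = r.
dr = 1e-4
_, val_up = restricted_min(r + dr)
print((val_up - val) / dr)          # ~ 2 = r, the Lagrange multiplier
```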

2.7.1.3 Mean-Variance Utility and Mean-Surplus Optimization

The investor has mean-variance preferences - that is, the investor's utility function is a linear combination of the expected return and the variance (risk component) of a portfolio. So,

$$E[u] = \text{Expected Return} - \text{Risk Aversion} \times \text{Risk} = \sum_j \phi_j \mu_j - \frac{\gamma}{2} \sum_{j,k} \phi_j \phi_k C_{jk} =: \langle \phi, \mu \rangle - \frac{\gamma}{2} \langle \phi, C \phi \rangle \qquad (2.55)$$

where $\mu_j$ is the expected return of asset $j$ and $\phi_j$ is the fraction of wealth invested in asset $j$ (the investment strategy). The sum of all $\phi_j$ adds up to 1 if there is no borrowing and if the investor is fully invested. In general, the strategy can also assume negative values (short selling). $\gamma$ is the risk aversion of the investor. The factor $\frac{1}{2}$ is only inserted to remove a factor 2 in the derivation of the optimal investment strategy. $C$ is the covariance matrix, which measures the statistical dependence between all assets. The goal is to find the $\phi$ that maximizes (2.55). The analytic solution follows at once:

$$\phi^* = \frac{1}{\gamma} C^{-1} \mu . \qquad (2.56)$$


Suppose that an investor has zero risk aversion. Then optimization is immediate: invest all the capital in the asset with the highest expected return. If risk aversion increases, the risk component becomes more and more important. Since the risk term is always positive, the higher the risk component, the lower the optimal level of expected utility. Formula (2.56) states that the optimal amount invested in each asset is given by a mix of the expected returns of all assets. The optimal expected utility is

$$E(u(\phi^*)) = \frac{1}{2\gamma} \langle \mu, C^{-1} \mu \rangle .$$

Consider two extreme mathematical cases for the matrix $C$ in the two-asset case, for illustrative purposes only. Assume first that $C$ is the unit matrix - the assets have the same volatility and are not correlated with each other. Then $C^{-1} = C$ and $\phi^* = \frac{1}{\gamma}\mu$; since there is no risk structure, investment is proportional to the expected returns. There is no mixing as in the general case. On the contrary, assume that $C$ is the matrix with zeros on its diagonal and 1 in all other cells. Then the optimal investment in asset 1 is proportional to the expected return of asset 2, and the same applies to the optimal investment in asset 2. Hence, if there is full dependence, mixing becomes maximal. The matrix $C^{-1}$ is the information matrix.
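A compact evaluation of (2.56) (all parameter values below are hypothetical) reproduces the general mix and the two extreme cases just discussed:

```python
import numpy as np

gamma = 3.0                                   # risk aversion
mu = np.array([0.06, 0.04])                   # hypothetical expected returns
vol = np.array([0.20, 0.10])                  # hypothetical volatilities
rho = 0.3                                     # hypothetical correlation
C = np.diag(vol) @ np.array([[1, rho], [rho, 1]]) @ np.diag(vol)

print(np.linalg.solve(C, mu) / gamma)         # optimal rule (2.56): a mix of both mus

# Extreme case 1: C = unit matrix -> weights proportional to expected returns.
print(np.linalg.solve(np.eye(2), mu) / gamma)

# Extreme case 2: zero diagonal, ones off-diagonal -> the weights are swapped.
C_swap = np.array([[0.0, 1.0], [1.0, 0.0]])
print(np.linalg.solve(C_swap, mu) / gamma)    # proportional to (mu_2, mu_1)
```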
So far, the model is of the asset-only type. There are no liabilities. But for many investors liabilities are important: for pension funds the liabilities of the insured employees matter, and for private clients the objectives that they want to finance. The surplus $S$ is the difference between the value of the assets, $A$, and the value of the liabilities, $L$. If we consider two dates 0 and 1, the surplus return $R_S$ relative to the liability equals

$$R_S = \frac{S_1 - S_0}{L_0} = \frac{A_0}{L_0} R_A - R_L .$$

The definition of the surplus return avoids a possible division by zero. The objective is to maximize the following mean-variance utility function over the asset weights entering $R_S$:

$$E[u] = E[R_S] - \frac{\gamma}{2} \sigma_S^2$$

with $\sigma_S^2$ being the surplus variance.
How do investors take into account that investment is an asset and liability issue? Research from State Street (2014), using data from a worldwide survey of 3,744 investors, shows that although nearly 80 percent of investors realize the importance of achieving long-term goals, their proficiency in achieving them deviates strongly. In the US, public pension funds were on average less than 70 percent funded, with more than USD 1.3 trillion of unfunded liabilities. A similar picture holds for private investors. While 73 percent cited long-term goals, only 12 percent could say with confidence that they were on target to meet those goals.


Do we know the causes of this misalignment between what investors state (long-term asset-liability management) and what they do (short-term asset-only)? Many academic papers address this question, discussing a myriad of possible reasons. One reason is emotions. While investors are exposed to emotions in the present, the far distant future, such as the retirement date, has hardly any emotional impact on how young people consume and invest today. Given that it is already difficult to achieve transparency about the impact of today's decisions on future wealth, it is plausible that investors face strong forces towards myopic behavior. Is it possible that the digital revolution in banking and asset management will help investors to consider their long-term liabilities more coherently in investment decision-making?
Another reason in asset management is the career risk of the asset managers. Consider a family office which pursues long-term goals and mandates an asset manager. It would be consistent if the asset manager also adopted a long-term investment goal. But the manager also has to take care of short-term performance, else she or he risks losing the mandate. A rule of thumb is that investor loyalty lasts at most three years: after three years of underperformance an asset manager faces money outflows from his mandates.
2.7.1.4 Benchmarking

Investment decisions are often made relative to a reference point - the investment opportunity. It is common in asset management to select benchmarks as investment opportunities. The goal of active asset management is to outperform these benchmarks such that the outperformance cannot be attributed to pure luck of the asset manager but to his skills. Hence, benchmarks are used to measure the success of active management relative to the benchmark. The insertion of a benchmark variable into the statistical model mostly causes no theoretical difficulties. If $b$ is the benchmark, utility is often of the form $u(A, b) = u(A - b)$. Active management often has both a passive component, which represents long-term goals in a benchmark portfolio, and an active portfolio in the short or medium term, which represents views or opportunities. Active management is defined by deviation from the benchmark in order to benefit from market opportunities. The passive portfolio, which is assumed to be the optimal long-term investment, then stabilizes the whole investment.
Definition 2.7.1. A passive investment strategy tracks a market-weighted index or portfolio (the benchmark). The goal of an active investment strategy is to beat the market-weighted index or portfolio by picking assets (stock selection) at the right time (market timing).
ETFs, trackers and index funds are examples of passive strategies. Mutual funds,
opportunistic use of derivatives, and hedge funds are examples of active strategies.


Example - Relative versus absolute return


We consider some differences between relative and absolute investment. There are two dates and the investment amount in an asset is 100 in some currency. The asset can take one of three values at the future date - 90, 100, or 110. A benchmark asset also has an initial price of 100 and can take the values 80, 90, 100, 110, or 120. Table 2.11 compares the absolute returns with the relative returns. The absolute returns are independent of the realization of the benchmark. The data show that relative performance can turn a bad absolute return into a positive relative one and a good absolute return into a bad relative one.

                          Asset realization
                90               100              110
Benchmark   Abs.    Rel.     Abs.    Rel.     Abs.    Rel.
80          -10%   +12.5%     0%    +25%     +10%    +38%
90          -10%     0%       0%    +11%     +10%    +22%
100         -10%    -10%      0%     0%      +10%    +10%
110         -10%    -18%      0%    -9%      +10%     0%
120         -10%    -25%      0%    -17%     +10%    -8%

Table 2.11: Relative versus absolute returns.
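The entries of Table 2.11 are consistent with measuring the relative return as the asset value against the benchmark value, $S_1/b_1 - 1$ (both start at 100); a short sketch reproduces the table:

```python
import numpy as np

asset = np.array([90, 100, 110])              # possible asset values at time 1
benchmark = np.array([80, 90, 100, 110, 120]) # possible benchmark values at time 1

for b in benchmark:
    row = []
    for s in asset:
        absolute = s / 100 - 1                # both initial prices are 100
        relative = s / b - 1                  # asset measured against the benchmark
        row.append(f"{absolute:+.0%} / {relative:+.1%}")
    print(b, " | ".join(row))
```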


We provide a deeper discussion using optimization in Section 2.8.5.

2.7.2 Heuristic Models

The heuristic approach is radically different from the statistical one. Heuristics are methods used to solve problems using rules of thumb, practical methods, or experience. Heuristics need not be optimal in a statistical modelling sense. Heuristics could be seen as a poor man's concept compared to statistical models. But there are reasons why heuristic approaches are meaningful. Most outputs of statistical models possess some weaknesses. The Markowitz model, for example, provides the investor with an investment strategy that is too sensitive - that is to say, small variations in data input parameters lead to large changes in the optimal portfolio output. Heuristic thinking is, then, often imposed on these models to obtain acceptable solutions. A heuristic in the Markowitz model is to constrain the optimal investment strategies.
A second reason for the use of heuristics arises if one distinguishes between risk and uncertainty. These are two different concepts, which lead to different behaviors. It is impossible to transform uncertainty-related issues into risk-related ones and vice versa. According to Knight (1921), risk refers to situations of perfect knowledge about the probabilities of all outcomes for all alternatives. This makes it possible to calculate optimal choices. Uncertainty, on the other hand, refers to situations in which the probability distributions are unknown or unknowable - that is to say, the risk cannot be calculated at all.

Decision-making under conditions of uncertainty is what our brain does most of the time. Situations of known risk are relatively rare. Savage (1954) argues that applying standard statistical theory to decisions in large, uncertain worlds would be utterly ridiculous because there is no way of knowing all the alternatives, consequences, and probabilities. Therefore, the brain needs strategies beyond standard statistical rules in an uncertain environment. Using best solutions from a world of risk in a world with uncertainty is suboptimal and flawed by model risk. Statistical thinking is sufficient for making good decisions if the problem is computationally tractable. To understand when people use statistical models in decision-making and when they prefer heuristics requires the study of how the human brain functions. Camerer et al. [2005] and Glimcher and Fehr [2013] are just two of the sources that can introduce the interested reader to this topic.

Example - Uncertainty examples


Ellsberg (1961) invented the following experiment to reveal the distinction between risk and uncertainty, where today one often uses the expression ambiguity instead of uncertainty. An individual considers the draw of a ball from one of two urns:

- Urn A has 50 red and 50 black balls.
- Urn B has 100 balls, with an unknown mix of red and black.

First, subjects are offered a choice between two bets:

- USD 1 if the ball drawn from urn A is red and nothing if it is black.
- USD 1 if the ball drawn from urn B is red and nothing if it is black.

In experimental implementations of this setting, the first bet is generally preferred over the second by a majority of the subjects. Therefore, if the agents have a prior on urn B, the predicted probability of red in urn B must be strictly less than 0.5. Second, the same subjects are offered a choice between the following two bets:

- USD 1 if the ball drawn from urn A is black and nothing if it is red.
- USD 1 if the ball drawn from urn B is black and nothing if it is red.

Again, the first bet is generally preferred in experiments. Therefore, the predicted probability of black balls in urn B must be less than 0.5. This probability assessment is inconsistent, since a unique prior cannot simultaneously assign to the event 'red from urn B' a probability that is both strictly less and strictly greater than 0.5. Ellsberg's interpretation was that individuals are averse to the ambiguity regarding the odds for the ambiguous urn B. They therefore prefer to bet on events with known odds. Consequently they rank bets on the unambiguous urn A higher than the risk-equivalent bets on B.


Example - Uncertainty in macroeconomics


Caballero (2010) and Caballero and Krishnamurthy (2008) consider the behavior of investors in the following flight-to-quality episodes:

- 1970 - The default on Penn Central Railroad's prime-rated commercial paper caught the market by surprise.
- 1987 - The speed of the stock market's decline led investors to question their models.
- 1998 - The co-movement of Russian, Brazilian, and US bond spreads surprised almost all market participants.
- 2008 - The default on commercial paper by Lehman Brothers created tremendous uncertainty. The Lehman bankruptcy also caused profound disruption in the markets for credit default swaps and interbank loans.

They find that investors re-evaluated their models, behaved conservatively, or even disengaged from risky activities. These reactions cannot be captured by an increased risk aversion towards macroeconomic phenomena. The reaction of investors in an uncertain environment is fundamentally different from that in a risky situation with a known environment.

Example - Greece and the EU


In spring 2015 uncertainty about the future of Greece in the EU increased. Four different scenarios were considered:

A. Status quo. Greece and the EU institutions agree on a new reform agenda such that Greece receives the remaining financial support of EUR 7.2 billion from the second bailout package.

B. Temporary introduction of a currency parallel to the euro. If the negotiations under A take longer than Greek liquidity can last, Greece introduces a parallel currency to fulfill domestic payment liabilities.

C. Default with subsequent agreement between the EU and Greece. There is no agreement under A. Greece fails to repay loans and there is a bank run in Greece. The ECB takes measures to protect the European banking sector.

D. Grexit - that is, Greece leaves the eurozone. Greece stops all payments and the ECB abandons its emergency liquidity assistance. Similar conclusions hold for the Greek banking sector as under C. Greece needs to create a new currency since the country cannot print euros.


The evaluation of the four alternatives is a matter of uncertainty, not of risk: the probability of each scenario is not known, there are no historical data with which to estimate the probabilities, and the scenarios have dependencies, but these are of a fundamental cause-effect type which cannot be captured by the statistical correlation measure. This shows that management adds value precisely in situations dominated by uncertainty.

2.7.2.1 Parameter Uncertainty in Investment

Uncertainty related to state uncertainty - we don't know the possible future states - or to the impossibility of evaluating alternatives due to a lack of statistical data is different from parameter uncertainty in risk models such as the Markowitz mean-variance approach. The mean and covariance are unknown in this model. One has to estimate these parameters from a finite data set. Different statistical approaches exist to estimate the parameters. Whichever approach we choose, there is a risk that the estimated parameters differ from the unknown, true parameter values. This defines estimation risk or parameter uncertainty. The traditional approach was to assume that the investor knows the true parameter values. But in reality one has to define a procedure, outside of the optimization program leading to the optimal investment strategies, which fixes the values of the parameters.
There are many different statistical methods used to find the optimal parameter values, see also Section 4.10 for a big data approach. The traditional approach is to estimate the mean $\hat\mu$ and the covariance $\hat C$ from the data, to plug the values into the optimal portfolio rule (2.56):

$$\hat\phi_{MV} = \frac{1}{\gamma} \hat C^{-1} \hat\mu \ ,$$

and to assume that the plugged-in parameters are the true ones. There is then no estimation risk. But acting as if there is no estimation risk is a non-optimal decision. Several authors have empirically documented or theoretically proven that rules which consider estimation risk uniformly dominate the plug-in approach.6
We first introduce some notation. We write for the expected mean-variance utility function, see (2.55),

$$U(\phi) := E[u] = \langle \phi, \mu \rangle - \frac{\gamma}{2} \langle \phi, C \phi \rangle =: \mu_p - \frac{\gamma}{2} \sigma_p^2 \ , \qquad (2.57)$$

and for the portfolio rule based on the estimation of historical data $D_T$ over $T$ periods

$$\hat\phi = f(D_T) \qquad (2.58)$$

6. Tu and Zhou (2003), Kan and Zhou (2011); Zellner and Chetty (1965) and Pastor and Stambaugh (2000) are the original works.


with $f$ a statistical function. $\hat\phi$ is a random function and therefore the out-of-sample variance and mean are also random:

$$\hat\sigma_p^2 = \langle \hat\phi, C \hat\phi \rangle \ , \qquad \hat\mu_p = \langle \hat\phi, \mu \rangle .$$

The random out-of-sample objective function, which is comparable with the mean-variance utility function, takes the form

$$\tilde U(\hat\phi) := \langle \hat\phi, \mu \rangle - \frac{\gamma}{2} \langle \hat\phi, C \hat\phi \rangle . \qquad (2.59)$$

The random difference $L = U(\phi^*) - \tilde U(\hat\phi)$ is called the loss function and its expected value the risk function. We always assume that all risky asset returns $R_t$ are IID normally distributed and that the length $T$ of the historical time series is large enough compared to the number of risky assets $N$ such that the estimated matrices can be inverted.
We consider some plug-in examples. We estimate the sample mean by $\hat\mu = \frac{1}{T} \sum_{t=1}^{T} R_t$, and similarly for the covariance estimate. Both estimates are sufficient statistics for the historical data: one only needs to consider these two estimates in the portfolio rules. The estimates $\hat\mu, \hat C$ are the maximum likelihood estimators of the model mean and covariance. Using these two estimates as plug-in values in (2.58), the rule $\hat\phi$ follows. This rule is the most efficient estimator of $\phi_{MV}$. But this estimator is not optimal if we want to optimize the expected out-of-sample performance. The specific assumptions allow us to compare explicitly the estimated strategy with the optimal but unknown $\phi_{MV}$ in (2.56):

$$E(\hat\phi) = \frac{T}{T - N - 2} \, \phi_{MV} .$$

If $T > N + 2$, the factor is larger than one: the investor using the estimated values takes riskier positions than the investor who (unrealistically) knows the true parameters ($\phi_{MV}$).
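A quick Monte Carlo sketch (two assets with hypothetical mean and covariance) illustrates this inflation factor: the average plug-in rule over many simulated histories is approximately $\frac{T}{T-N-2}$ times the true rule.

```python
import numpy as np

rng = np.random.default_rng(0)
gamma, T, N = 3.0, 60, 2                       # risk aversion, sample size, assets
mu = np.array([0.010, 0.005])                  # hypothetical true monthly means
C = np.array([[0.0040, 0.0012],
              [0.0012, 0.0025]])               # hypothetical true covariance

phi_mv = np.linalg.solve(C, mu) / gamma        # true optimal rule (2.56)

estimates = []
for _ in range(20000):                         # many simulated histories of length T
    R = rng.multivariate_normal(mu, C, size=T)
    mu_hat = R.mean(axis=0)
    C_hat = np.cov(R, rowvar=False, bias=True) # maximum likelihood covariance
    estimates.append(np.linalg.solve(C_hat, mu_hat) / gamma)

print(np.mean(estimates, axis=0))              # average plug-in weights ...
print(T / (T - N - 2) * phi_mv)                # ... match the inflated true weights
```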
Three variants of the above discussion arise when either one or both of the parameters $\mu, C$ are known or unknown and therefore estimated. Kan and Zhou (2011) provide a detailed analysis.

All plug-in methods are of a two-step nature: first, the model parameters are estimated. In a second step, the optimal portfolio weights are calculated by assuming that the estimated parameters are the true ones - there is no parameter estimation risk. Whether the estimation errors, say for the returns, are 1 percent or 20 percent does not matter for the result.
The Bayesian approach considers estimation risk. This approach assumes that the investor not only cares about the historical data in the estimation of the model parameters but also about the prior distribution of the model parameters. The drawback is that there is no optimal prescription for how the prior distribution is found. Assuming normal returns, the prior distribution - called the diffuse prior $P_0$ - for the mean and the covariance matrix reads (see Stambaugh (1997))

$$P_0(\mu, C) \propto |C|^{-\frac{N+1}{2}} . \qquad (2.60)$$

Using this prior, the posterior distribution $P(\mu, C \mid \text{Data})$ conditional on the available data set can be calculated. It is a t-distribution with $T - N$ degrees of freedom. The optimal Bayesian investment rule becomes

$$\hat\phi_{Bay} = f(T, N) \, \frac{1}{\gamma} \hat C^{-1} \hat\mu = f(T, N) \, \hat\phi_{MV} \ , \qquad f(T, N) := \frac{T - N - 2}{T + 1} . \qquad (2.61)$$

The Bayesian investor therefore holds the same proportions in the $\hat C^{-1} \hat\mu$ portfolio as the MV investor who does not consider estimation risk, but there is a constant $f(T, N)$ which scales the portfolio investment uniformly for all assets. Since $f$ is smaller than one for any reasonable problem, the investment in the risky assets is smaller in the Bayesian approach - estimation risk is identified and priced - than in the MV optimal plug-in case without estimation risk. This follows from

$$E(\hat\phi_{Bay}) = \frac{T}{T + 1} \, \phi_{MV} .$$

The difference between the Bayesian and the optimal investment $\phi_{MV}$ is small in the case of a diffuse prior. If $T$ becomes arbitrarily large, the two portfolios coincide: for arbitrarily long time series of data the model has learned the true model parameters. To obtain a more sensitive difference between the Bayesian and the optimal approach, priors other than diffuse ones - so-called informative priors - have to be considered, see Pastor and Stambaugh (2000).
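The Bayesian correction (2.61) is a pure scaling of the plug-in rule; a two-line check (illustrative values of $T$ and $N$) shows how the shrinkage factor approaches one as the sample grows:

```python
def f(T, N):
    # Shrinkage factor of the Bayesian rule (2.61) under the diffuse prior.
    return (T - N - 2) / (T + 1)

for T in (60, 180, 300, 420):
    print(T, round(f(T, N=10), 3))   # 0.787, 0.928, 0.957, 0.969 -> tends to 1
```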
Keeping the diffuse prior, Kan and Zhou (2011) show that the Bayesian approach leads to a better out-of-sample performance than the plug-in approaches considered above. They show that in the normal distribution case:

- the Bayesian portfolio rule based on a diffuse prior uniformly dominates the classic plug-in approaches;
- there exists a two-fund portfolio rule

$$\hat\phi_{out} = f^* \, \hat\phi_{MV} \qquad (2.62)$$

which uniformly dominates the Bayesian rule based on the diffuse prior (and hence the plug-in rules), where $f^*$ maximizes the expected out-of-sample performance;
- there exists a three-fund separation portfolio rule which shows in simulation experiments higher expected out-of-sample performance than the former methods.


The reason for the second statement is that the function $f(T, N)$ in (2.61) does not maximize the expected out-of-sample performance. Optimizing $f$, however, still gives mediocre results if the time series used are not unrealistically long, say 20 years of data or longer. If such long series are not available, even the optimized expected out-of-sample approach leads to negative performance.

How can we overcome negative performance? If the investor knows the true parameter values of the model, then the mutual fund theorem applies: any optimal portfolio can be written as a combination of any two other optimal portfolios, see Proposition 2.8.5. But the above discussion showed that investing optimally in two funds generates negative expected out-of-sample returns. The idea of Kan and Zhou (2007, 2011) is to add a third fund which is not optimal but which can, if properly chosen, hedge against parameter uncertainty risk and therefore lead to positive expected out-of-sample returns. The authors define a general portfolio rule which is a linear combination of two risky optimal portfolios. The weights of the two portfolios are chosen such that the expected out-of-sample performance is maximized. The price of this approach is that two additional parameters in the model need to be estimated. The shrinkage approach of Jorion (1991) is a particular three-fund rule, see Kan and Zhou (2011).
Kan and Zhou (2011) compare the expected out-of-sample performance for different time windows $T$ of the historical data for 13 portfolios. They assume a relative risk aversion in the optimal portfolio rule (2.56) of $\gamma = 3$. The asset space is given by the $N = 10$ largest stocks in the NYSE from Jan 1926 to Dec 2003. The mean and covariance matrix are estimated from the monthly returns of this time series, and the excess returns of the 10 assets are assumed to be generated from a multivariate normal distribution with the estimated mean and covariance as parameter values. They report results for the following strategies:

- I: Theoretical optimum, i.e. the investor knows the true $\mu$ and $C$.
- II: The investor knows the squared Sharpe ratio $\mu' C^{-1} \mu$ of the tangency portfolio but not the two components of the Sharpe ratio. The investor theoretically invests an optimal amount in the ex-ante optimal tangency portfolio.
- III: Theoretical three-fund portfolio.
- IV: Plug-in portfolio. The maximum likelihood estimates $\hat\mu$ and $\hat C$ are plugged in.
- V: Bayesian portfolio rule.
- VI: Rule II where the theoretical squared Sharpe ratio is replaced by its estimated value.
- VII: Jorion's shrinkage rule.
- VIII: Estimated three-fund portfolio, i.e. III where the theoretical values are replaced by their estimates.
Table 2.12 summarizes the results.

Rule     T = 60     T = 180    T = 300    T = 420
I         0.419      0.419      0.419      0.419
II        0.044      0.122      0.171      0.210
III       0.133      0.191      0.224      0.248
IV       -5.122     -0.748     -0.225     -0.025
V        -2.996     -0.584     -0.170      0.002
VI       -0.185      0.060      0.133      0.177
VII      -0.899     -0.030      0.117      0.182
VIII     -0.343      0.051      0.143      0.189

Table 2.12: Out-of-sample performance for 8 portfolio rules with 10 risky assets (Kan and Zhou [2011]).
The table shows that in order to obtain positive expected out-of-sample performance for any rule, very long time series are needed. $T = 420$ months means 35 years of data. In other words, to overcome parameter uncertainty risk the rules or models need long time series to learn the true parameter values. The first three rules all lead to positive performance, but unfortunately they are theoretical models. Replacing the unknown theoretical parameter values by their sample estimates, the positivity vanishes for short windows. The direct plug-in approach based on maximum likelihood estimates is the worst model w.r.t. out-of-sample performance. The shrinkage rule and the three-fund rule lead for large windows to the same values, but for shorter time windows the superiority of the three-fund rule over Jorion's rule is evident.
An interesting theoretical approach to cleaning the correlation matrix arises from random matrix theory, see Bouchaud and Potters (2009) for a review.
Example

A heuristic model is the equal weights (EW) strategy: weights of $1/N$ are given to all $N$ assets, independent of the return, risk, and correlation structure. As De Miguel et al. (2009) show for the Markowitz model and 12 extensions, there are realistic situations in which $1/N$ outperforms mean-variance optimization. Once again, one needs very long time series to reduce parameter estimation risk (uncertainty) to a level such that the optimal portfolio outputs based on the estimated input parameters outperform the $1/N$ portfolio. These results contradict the common view that heuristics are less successful than statistical optimization models. Researchers in this tradition have evaluated people's reliance on $1/N$ negatively and attributed it to their cognitive limitations. In fact, ignoring part of the information needed for a statistical model - the historical data for the estimation of model input parameters - is what makes heuristics robust with respect to the unknown future. Some extensions of the classic Markowitz model explicitly take these facts into account. The Black-Litterman model, for example, allows for both the insertion of investment views on the asset returns and for the market statistics to estimate input parameters. This model is a mixture between a purely statistical, historical data-based model and a forward-looking, expert model.
One criterion for the out-of-sample performance relative to $1/N$, across seven different data sets of monthly returns, is the Sharpe ratio. The authors apply a rolling-window approach7 which generates a series of monthly out-of-sample returns. They find that the $1/N$ strategy has higher Sharpe ratios than the tested models due to the estimation risk in the models. To reduce estimation risk to a level such that the models' Sharpe ratios dominate the naive rule, long time series windows are needed. For 10 risky assets, the Sharpe ratio of the sample-based mean-variance policy is higher than that of $1/N$ only if the estimation period is around 6,000 months.

The Kan and Zhou (2007) model coupled with the $1/N$ rule - a variant of the three-fund rule described above - performs as well as or better than all other sophisticated investment policies on a consistent basis, and it also substantially outperforms $1/N$.
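A stylized rolling-window comparison in the spirit of this evidence can be sketched as follows (simulated normal returns, the 120-month window, and all parameters are my own assumptions; the actual study uses seven empirical data sets):

```python
import numpy as np

rng = np.random.default_rng(1)
N, T, window, gamma = 10, 600, 120, 3.0        # assets, months, window, risk aversion
mu = rng.normal(0.008, 0.002, N)               # hypothetical true monthly means
A = rng.normal(0, 0.03, (N, N))
C = A @ A.T + 0.02 ** 2 * np.eye(N)            # hypothetical true covariance
R = rng.multivariate_normal(mu, C, size=T)     # simulated return history

mv_oos, ew_oos = [], []
for t in range(window, T):
    past = R[t - window:t]                     # rolling estimation window
    w_mv = np.linalg.solve(np.cov(past, rowvar=False), past.mean(axis=0)) / gamma
    mv_oos.append(w_mv @ R[t])                 # out-of-sample return, plug-in MV
    ew_oos.append(np.mean(R[t]))               # out-of-sample return, 1/N

sharpe = lambda x: np.mean(x) / np.std(x)
print("plug-in MV Sharpe:", sharpe(np.array(mv_oos)))
print("1/N Sharpe:       ", sharpe(np.array(ew_oos)))  # often higher for short windows
```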

2.8 Portfolio Construction

2.8.1 Steps in Portfolio Construction

So far, we did not consider the logic of portfolio construction but used different portfolios in examples on an ad hoc basis. Several steps define portfolio construction:

- Grouping of assets: How do we select the parts (securities) of a portfolio?
- Allocation of assets: How much wealth do we invest at each date in the specific securities (weights)?
- Implementation of the strategy: How do we transform the asset allocation into trades?

The grouping of the assets, or asset selection, can be done on different levels:

- Asset classes (AC)
- Single assets
- Risk factors

The allocation of the assets can follow different rules:

- Optimal investment
- Ad hoc rules
- Heuristic rules
- Risk parity / risk budgeting
- Entropy-based approaches
- Big data based methods

The implementation of the asset allocation can be done using different tradable assets:

- Cash products such as stocks and bonds
- Derivatives such as futures, forwards and swaps
- Options
- Mutual funds, certificates, ETFs, money market funds

A fourth step is the compliance of portfolios, such as their suitability and appropriateness for investors or the possibility to offer the portfolios cross-border. This section focuses on some grouping and asset allocation aspects of portfolio construction.
Without considering portfolios and without any model, every excess return

$$E(R_i) - R_f = \alpha_i \qquad (2.63)$$

is driven by its alpha - that is, by the skills and luck of the investment manager. Then many assets have to be considered on a stand-alone basis. This is a chaotic and very complex state of investment. The first major achievement in terms of reducing this chaos was the Markowitz model. Working with the expected portfolio return and the portfolio variance is a two-dimensional problem compared to the large number of individual assets.

2.8.2 Static 60/40 Portfolio

A classic portfolio construction is the so-called 60/40 portfolio. This means that after each time period, the portfolio values in the two assets are rebalanced such that the value of the equity part is 60 percent of the actual wealth level and the fixed income government bond investment has a weight of 40 percent. The two components, equity and government bonds, are themselves equally weighted (dollar-weighted) portfolios of stocks and bonds. The 60/40 portfolio in the US has generated a 4 percent average annual return going back to 1900.


Example - Rebalancing

Consider a portfolio of value $V$ consisting of two assets $S$ and $B$, where at each date the weight of the $S$-asset is 60% of the total portfolio value. If $\theta$ represents the number of shares of $S$ in the portfolio and $\beta$ those of $B$, we have at time 0:

$$V_0 = \theta_0 S_0 + \beta_0 B_0 = 0.6 V_0 + 0.4 V_0 .$$

To achieve the weights, the investor has to buy $\theta_0 = 0.6 \frac{V_0}{S_0}$ units of asset $S$ at time 0, and similarly for asset $B$. After one time step the portfolio value before rebalancing reads

$$V_1 = \theta_0 S_1 + \beta_0 B_1 ,$$

where a change in portfolio value is entirely due to changes in asset values and not to changes in the positions (self-financing investment strategy). Assume that asset $S$ increased in value and asset $B$ dropped, where for simplicity $V_0 = V_1$ holds. Then one has to change the time 0 positions to restore the 60/40 weights: $\theta_1 = 0.6 \frac{V_1}{S_1}$, and similarly for asset $B$. This leads to the portfolio value after rebalancing:

$$V_1 = \theta_1 S_1 + \beta_1 B_1 = 0.6 V_1 + 0.4 V_1 .$$

It follows that the position in the asset whose price increased is reduced, and vice versa for the other asset.

Generalizing the framework to multiple periods, the rebalanced position in asset $S$ at time $k$ reads

$$\theta_k = x \, \frac{V_0}{S_0} \, \frac{\prod_{j=1}^{k} (1 + R_j^V)}{\prod_{j=1}^{k} (1 + R_j^S)} \ , \qquad (2.64)$$

where $x$ is the fraction of wealth invested in $S$ (60 percent), $R_j^S$ is the one-period simple return of asset $S$ and $R_j^V$ the one-period portfolio return. A similar result holds for the other asset. Viewed from a time prior to $k$, the rebalancing strategy is a random variable. The formula shows that if the $S$ asset returns are lower than the $B$ asset returns, and hence lower than the portfolio returns, more and more is invested in the $S$ asset: by rebalancing we implement an implicit 'buy low, sell high' mechanism. The $S$-part of the portfolio value at time $k$ can be written as

$$V_k^S = \theta_0 S_k \, \frac{\prod_{j=1}^{k} (1 + R_j^V)}{\prod_{j=1}^{k} (1 + R_j^S)} . \qquad (2.65)$$


What can be said about the performance and the rebalancing of the strategy if equities are booming (falling) in a given period, or are moving sideways in an oscillatory manner?
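One way to build intuition for this question is a small simulation (the trending and oscillating price paths below are my own illustrative choices) comparing the rebalanced 60/40 portfolio with buy-and-hold:

```python
import numpy as np

def final_values(S, B, x=0.6):
    # Rebalanced 60/40 vs buy-and-hold along the price paths S and B.
    V = 100.0
    theta, beta = x * V / S[0], (1 - x) * V / B[0]      # rebalanced positions
    theta0, beta0 = theta, beta                         # buy-and-hold positions
    for t in range(1, len(S)):
        V = theta * S[t] + beta * B[t]                  # self-financing value change
        theta, beta = x * V / S[t], (1 - x) * V / B[t]  # restore the 60/40 weights
    return V, theta0 * S[-1] + beta0 * B[-1]

S_trend = np.array([100, 110, 121, 133.1, 146.41])      # steadily rising equity
S_osc = np.array([100, 120, 100, 120, 100])             # oscillating equity
B_flat = np.full(5, 100.0)                              # flat bond path for clarity

print(final_values(S_trend, B_flat))  # rebalanced ~126.2 < buy-and-hold ~127.9
print(final_values(S_osc, B_flat))    # rebalanced ~101.6 > buy-and-hold 100.0
```

In a sustained trend, rebalancing repeatedly sells the outperforming equity leg and lags buy-and-hold; in sideways, oscillating markets the implicit buy-low-sell-high mechanism lets the rebalanced portfolio come out ahead.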
The 60/40 portfolio turns out to be not diversified enough when markets are distressed or booming. The dot-com bubble and the financial crisis of 2008 revealed that different asset classes moved in the same direction and behaved as if they were all of the same type, although capital diversification was maintained: risk weights are not the same as dollar weights. This indicates that different asset classes can be driven by the same macroeconomic factors.

Deutsche Bank (2012) reports the following risk contributions, using volatility risk measurement, for 60/40 portfolios where the S&P 500 represents equity and US 10y government bonds the other part. The long-term risk contribution by asset class, 1956 to 2012, was 79% for equities and 21% for bonds. This is different from a 60/40 capital diversification. The risk contributions of the US 10y government bonds to the 60/40 portfolio in extreme market periods were

1981: 53%, 1996: 43%, 2006: 29%, 1963: 3%, 1973: 7%

- all different from the 40% dollar value. The left panel in Figure 2.18 illustrates the strong positive relation between equity returns and the returns of the balanced portfolio: the performance and risk of traditional balanced portfolios is mostly driven by the equities quota. The equities quota acts like a (de-)leverage factor: the higher the equities quota, the higher the portfolio exposure to equity risk (slope of the straight line). The $R^2$ is 95%, i.e. 95% of the risk is explained by equity risk.
A first reason for this is that asset classes consist of bundles of risk factors, where some factors can belong to several asset classes. If markets under stress trigger a common risk factor, asset classes will move in the same direction. A second reason is that all classes may fail with respect to some events, like the systemic liquidity events that occurred during the GFC: the monthly dollar returns between the classics and the alternatives show rather low correlation between 2000 and 2007, but the correlation increases sharply during the GFC and remains elevated as the sovereign debt crisis follows in 2011. This failure of alternatives to diversify during the GFC led to heavy criticism of the diversification concept based on asset classes per se. The middle panel in Figure 2.18 again shows a balanced portfolio versus equity, but the balanced portfolio has a commodity and a hedge fund part. The addition of commodities and global hedge funds only slightly improves the allocation of risk. Still, 90% of the risk is explained by equity risk. The right panel shows that bonds are not a relevant risk driver for balanced portfolios - the impact of bonds on the risk is of minor importance.
Portfolio risk also depends on the time-varying correlation, and Figure 2.15 shows that the correlation between stocks and bonds indeed varies over time. These variations are typically due to several economic events. Historically, periods of rising inflation and


Figure 2.18: Left panel: monthly return equities world vs monthly return balanced portfolio (equities world: 50%, bonds world: 50%), Bloomberg: 12/1998-3/2013. Middle panel: monthly return equities world vs monthly return balanced portfolio (equities world: 40%, bonds world: 40%, commodities: 10%, hedge funds global: 10%); commodities database: DJUBSTR, hedge funds database: HFRXG, Bloomberg for equities and bonds: 12/1998-3/2013. Right panel: monthly return bonds world vs monthly return balanced portfolio (equities world: 50%, bonds world: 50%), Bloomberg: 12/1998-3/2013, local data.

heightened sovereign risk have driven stock and bond correlations sharply positive. In
contrast, correlations often turned sharply negative when inflation and sovereign risk
were at low levels.
If stocks and bonds can be described by their exposure to macroeconomic factors, their correlations could be determined entirely through their relative exposures to the same set of factors. Therefore, why not measure the exposures of stocks and bonds to common factors and act according to the volatility and correlation forecasts instead of using the static 60/40 rule? This would not be effective since the true factor structure is unobservable, it is not always possible to invest in the economic factors, and investor sentiment can impact the correlation structure, which makes the prediction of changing correlations difficult.
Kaya et al. (2011) find that the economic factors growth and inflation have accounted for only 2 percent of the total volatility of the 60/40 portfolio in the US since 1957, while 98 percent of the volatility of the portfolio has been the result of missing factors, misspecified factors, or risks that are specific to each asset class.
Summarizing, for the 60/40 asset allocation based on asset classes ...

- ... correlations between asset classes are time-varying, not stable, and difficult to forecast. This destroys diversification in times of market turbulence.
- ... risk weights are not the same as dollar weights.
- ... we do not know if asset classes are the right level of risk aggregation.

2.8.3 Factor Models

2.8.3.1 Different Motivations

The failure of asset class-driven investment to diversify in turbulent times is one motivation for the search for investment methods based on alternatives to asset classes. One searches for investable objects - risk factors - which are more basic objects than asset classes for portfolio construction. Risk factors are random variables which influence the value of assets. The state spaces of risk factors are the risk sources and, by assumption, risk factors are not divisible into smaller parts. Two different risk factors do not contain the same risk sources. Asset classes are, in this view, bundles of risk factors. Different asset classes can overlap in terms of their risk sources, which can lead to a collapse of diversification.
Idiosyncratic risk is a further reason for a new concept: at the security level, there is a lot of idiosyncratic risk or alpha.

Definition 2.8.1. Alpha is the return in excess of what would be expected from a diversified portfolio with the same systematic risk. The historical alpha is the difference between the historical performance and what would have been earned with a diversified market portfolio at the same level of systematic risk over that period.

When applied to portfolios, alpha is a description of the extraordinary reward obtainable through the portfolio strategy. In the context of active management: a better active manager will have a more positive alpha at a given level of risk.
Adding more and more stocks to a portfolio reduces idiosyncratic risk or alpha. This follows from Proposition 2.6.3. Therefore, alpha is not scalable. Is there a decomposition of asset returns which is scalable? The so-called Professors' Report on the Norwegian GPFG (Ang et al. [2009]) states that a risk factor decomposition represents 99.1 percent of the fund's return variation.

Another reason for risk factors is to find a sufficient number of factors such that the high-dimensional covariance matrix can be replaced by a lower-dimensional one based on risk factors which accounts for the risk sources in a non-overlapping way.


Example
If there are $N = 100$ assets, one needs for optimization models with a first and second moment (Markowitz) $N$ expected returns, $N$ standard deviations and $N(N-1)/2$ correlations. This means 5,150 parameters need to be estimated. If the correlation between any two assets is explained by systematic components - the factors - then one can restrict attention in the estimation and modelling of returns to the much smaller number of non-diversifiable factors. Risk factors are from this perspective purely statistical concepts; there is no theory supporting the approach, but only past data.
Finally, empirical observations of some liquid trading strategies show on average persistent patterns in market data. There are factors, different from the market factor, which can explain the cross-section of expected asset returns. This empirical finance approach identifies tradeable factors empirically, such as the Fama-French factors (value, growth, size) or momentum factors. The factors capture firm characteristics such as size, technical indicators, and valuation ratios derived from the balance sheets and income statements, or market parameters such as the stock volatility. The value characteristic, for example, is defined as the excess return to stocks that have low prices relative to their fundamental value. The characteristic is operationalized by considering different ratios such as book-to-price and earnings-to-price, different values (book value), and firm-specific economic or cash flow variables such as sales, earnings, cash earnings, and other cash flows. The empirical observation is that, in the past, a long/short portfolio based on value grouping generated on average stable market-neutral factor returns if the rules periodically select the firms with high values (long) and those with low values (short). A different class of empirical risk factor models considers differences between realized or historical values and market-implied ones, such as trading strategies focussing on the realized and implied volatility of derivatives.
Summarizing, the different views on risk factors so far are:

- The risk unbundling view.
- The scaling of idiosyncratic risk view.
- The covariance matrix complexity reduction view.
- The empirical trading strategies (beyond the market risk factor) view.
2.8.3.2 Data Patterns - Quality Premium

We consider the quality of equity (EQ Quality) factor, see Figure 2.19, for all stocks in the MSCI Europe. One calculates on a monthly basis firm-specific figures such as profitability, net profit or degree of indebtedness. Given these figures, one calculates a quality figure (Q-figure) for each firm. To account for the sector structure, the Q-figure is normalized using the average sector Q-figure and the sector volatility. This defines the Q-score. Ranking these scores, one observes historically that on average the firms with a high score delivered a larger return than those with a lower score. This is the discovered empirical characteristic, or feature, called EQ Quality. If one believes that this historical return pattern will continue to hold in the future, then one can invest in a strategy based on this observation. Large investment banks and asset managers offer tradeable products which transform the above empirical observation into a financial asset. EQ Quality is implemented as a long-short or a long-only combination. The long-short implementation removes directional risks: as long as the premium exists - firms with higher scores provide a higher return (long position) than firms with lower scores (short position) - the trend of the whole market is irrelevant. There are institutional investors which do not want to invest in long-short vehicles. This choice is often related to bad experiences in the past. But investing long-only in a risk premium causes several problems. First, market neutrality is lost. Second, the correlations among risk premia and with traditional asset classes move significantly away from a weak correlation structure. But a long-short strategy is not free of risk either, see the momentum crash below. The producers offer the risk premia products in the form of transparent indices where the investor can understand in detail how exactly the different risk premia are constructed. Different wrappers are used for risk premia investments - UCITS funds, ETFs or structured notes.

[Figure 2.19 - flowchart: MSCI Europe universe -> monthly calculated company figures -> quality figure (Q-figure) -> normalization of the Q-figure -> Q-score -> ranking of stocks -> final selection (e.g. liquidity, borrowing costs) -> long position in the 20% highest Q-scores, short position in the 20% lowest Q-scores -> ARP strategy.]

Figure 2.19: Construction of the risk factor quality.
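A schematic sketch of the pipeline in Figure 2.19 (random placeholder data; the 20 percent cutoffs, the equal weighting, and the column names are illustrative assumptions, not the exact index rules):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
n = 200                                        # placeholder stock universe
df = pd.DataFrame({
    "sector": rng.integers(0, 10, n),          # placeholder sector labels
    "q_figure": rng.normal(0, 1, n),           # stand-in for the quality figure
})

# Normalize the Q-figure by sector mean and sector volatility -> Q-score.
grp = df.groupby("sector")["q_figure"]
df["q_score"] = (df["q_figure"] - grp.transform("mean")) / grp.transform("std")

# Rank and form the long/short books from the top and bottom 20%.
df = df.sort_values("q_score", ascending=False)
cut = int(0.2 * n)
long_book, short_book = df.head(cut), df.tail(cut)

# Equal-weighted long/short weights: market-neutral by construction.
weights = pd.concat([pd.Series(1.0 / cut, index=long_book.index),
                     pd.Series(-1.0 / cut, index=short_book.index)])
print(weights.sum())                           # 0: zero net (directional) exposure
```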

2.8.3.3 Data Patterns - Momentum Premium

The motivation for this factor is the idea of extrapolating past performance into the future. This is often called buying the winners (long) and selling the losers (short), see Figure 2.16.
Daniel and Moskowitz (2012) consider a time series from 1932 to 2011 using international equities from the US, the UK, Europe, and Japan; there are 27 commodities, 9 currencies, and 10 government bonds in their data set. They find that over the post-WWII period, through 2008, the long/short equity momentum strategy had an average return of 16.5 percent per year, a negative correlation (beta) with the market of -0.125, and an annualized Sharpe ratio of 0.82. They document that momentum is pervasive for equities, currencies, commodities, and futures. They further report that the maximum monthly momentum return was 26.1% and that the worst five monthly returns were -79%, -60%, -46%, -44%, and -42%.
Intuitively, the premium is positive if the winners' return is larger than the losers' return. Constructing a market-neutral risk premium means being long the winners and short the losers. This can be a very risky strategy if momentum crashes at the turning points, that is, where the past winners become the future losers and vice versa. Then, if the signals for sorting the new portfolios are not quick enough, the investor will be long and short exactly the wrong portfolios, which induces heavy losses. We consider two turning points and the following momentum crashes:

- In June 1932 the market bottomed. In the period July-August 1932, the market rose by 82 percent. Over these two months, losers outperformed winners by 206 percent.
- In March 2009 the US equity market bottomed. In the following two months, the market was up by 29 percent, while losers outperformed winners by 149 percent. Firms in the loser portfolio had fallen by 90 percent or more (such as Citigroup, Bank of America, Ford, GM). In contrast, the winner portfolio was composed of defensive or countercyclical firms like AutoZone.

This indicates that in normal environments, the market appears to underreact to public information, resulting in consistent price momentum. However, in extreme market environments, the market prices of severe past losers embody a very high premium. When market conditions ameliorate, these losers experience strong gains, resulting in a momentum crash - that is, sequences of large, negative returns are realized (Daniel and Moskowitz (2012)).
How is a momentum factor or strategy constructed? We follow Moskowitz et al. (2012) and Dudler et al. (2014), who extend the work of Moskowitz et al. (2012) to a risk-adjusted framework. Moskowitz et al. (2012) were the first to construct momentum for the cross-section and the time series. Traditionally, the momentum literature focused on the relative cross-sectional performance of securities. Time series momentum focuses purely on a security's own past return.

The momentum strategy is characterized by two time periods:


Figure 2.20: Long-only momentum strategies. Left panel - momentum strategies 1947-2007. Right panel - momentum strategies during the GFC (Daniel and Moskowitz [2012]).
- The look-back period determines the horizon of past returns that is used to form trading signals.
- The holding period determines the time interval over which the realized past returns are used to determine future positions.

At the end of each month, the portfolio returns are ranked. The winner and loser portfolios are identified. The strategy then is to invest USD 1 in the winner portfolio and short USD 1 in the loser portfolio. This long/short combination defines the momentum strategy. The ranking of portfolios is repeated at the end of each subsequent month.
Figure 2.20 illustrates the above discussion. In the left panel, the momentum strategy
since 1947 is shown versus the risk-free investment and the market performance. The figure shows that the winners outperform the losers and the market performance. Roughly,
investing USD 1 in 1947 long in the winner portfolio and shorting USD 1 in the loser
portfolio delivers the return of the winner portfolio. The short portfolio of the losers has
a positive return in periods of market turmoil. This is shown in the right panel of the
figure. There, the loser portfolio dominates all other portfolios.
This basic strategy is altered in Dudler et al. (2014) by using risk-adjusted daily returns instead of pure returns. This means that the calculated returns are normalized by an exponentially weighted moving average (EWMA) five-day volatility of the log returns. The rationale is based on the well-known volatility clustering of returns - that is, there is auto-correlation in the volatility of returns. It is then natural to assume that past volatility measurements can be used to forecast future volatility. EWMA states that the present volatility estimate is a weighted sum of the past volatility estimate and the present return volatility. Dividing the returns by the EWMA volatility gives a risk-adjusted return. The sign of this adjusted return is the trading signal that defines the direction of the trade. The risk-adjusted momentum returns for any instrument are then defined as the sum of the signs of the above trading signals over the holding periods, again weighted with an EWMA volatility measure. That is, the momentum position is proportional to inverse risk. This is not an optimal strategy but a strategy from the so-called risk-parity approaches, see below.
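A minimal sketch of the EWMA risk-adjustment step (the decay parameter 0.94 and the simulated returns are my own assumptions; Dudler et al. use a five-day EWMA of log returns):

```python
import numpy as np

rng = np.random.default_rng(3)
r = rng.normal(0, 0.01, 500)                   # placeholder daily log returns

lam = 0.94                                     # assumed EWMA decay parameter
var = np.empty_like(r)
var[0] = r[0] ** 2
for t in range(1, len(r)):
    # EWMA: present variance = weighted sum of past variance and squared return.
    var[t] = lam * var[t - 1] + (1 - lam) * r[t] ** 2

vol = np.sqrt(var)
signal = np.sign(r / vol)                      # direction from the risk-adjusted return
position = signal / vol                        # sizing proportional to inverse risk
print(position[-5:])
```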
2.8.3.4 Questions

Figure 2.21 shows the return of investing $1 from 1956 until 2015 in a market factor and in the factors size, value, and momentum, which triggers some questions.

Figure 2.21: Investment return of $1 in 1956-2014 in the market, market plus value, market plus size and market plus momentum factor (Ken French's website).

- What is the theoretical foundation of the risk factors used in practice? Is theory able to derive risk factors which persist in equilibrium and which are different from the market risk factor?
- How are risk factors identified and turned into tradeable strategies?
- How can one test for the persistent performance of risk factors?
- The factor strategies quality and momentum or size and value (see below) are very simple investment ideas. The strategies have been known for more than 25 years, and at least the value and momentum factors seem to be part of a risk premium which is persistent. How can it be that in the long run such simple ideas produce much higher returns than the market? Why aren't they arbitraged away? Who is on the other side of the trades?
- How can one discriminate between true or persistent risk factors and fantasies?
- Factors can be abstract random variables, themselves portfolio returns or excess returns, dollar-neutral returns, and time-varying. How are the models defined?

We will provide answers to some of these questions below and in the synthesis Section 3.
2.8.3.5 Factor Investing - Industry Evolution

A key step for the industry regarding factor investing were the requirements published in the Professors' report (2009). The authors state that factors, i.e. random variables which are different from the market risk premium but which can explain the cross-section of asset excess returns, should:

- have an intellectual foundation (rational or behavioral);
- exhibit significant premiums which are expected to persist in the future;
- not be correlated among themselves and with asset classes in good times, and be negatively correlated in bad times;
- be implementable in liquid, tradeable instruments.
The notion of good and bad times is made precise in economic theory by the stochastic discount factor (SDF), see Section 3.

The financial industry defines factor investing similarly to the Professors' report. We cite from Deutsche Bank [2015]:

- Explainable - risk factors should have a strong basis for existence.
- Persistent - there must be a rationale for the persistence of the risk factor.
- Attractive risk/return - it is important for risk factors to have attractive return characteristics in isolation.

- Unique - in the portfolio framework it is important to find uncorrelated sources of return; risk factors should exhibit low correlations to traditional market betas and to other risk factors being considered for investment.
- Accessible - risk factors must be accessible at a level of cost that is sufficiently low to avoid the dilution of the return.
- Fully transparent - strategies are fully systematic and work within well-defined rules.
- Liquid - strategies are designed to allow cost-efficient entry and exit to investors, with no lock-ups.
- Low cost - a well-defined systematic approach makes efficient transaction costs possible.
- Flexible access - strategies can be accessed in a variety of formats, either funded or unfunded as a portfolio overlay, and in a variety of wrappers (OTC, structured notes, UCITS funds, etc.).

Summarizing, factor investing means alternative strategies defined on liquid assets, and not the creation of new, illiquid asset classes. The documentation, transparency and efficiency requirements are missing in the Professors' report.

Transparency, for example, changed radically in the industry in the last decade. Some years ago, an investment bank offering a momentum strategy was basically a black box for the investor - he did not know how the strategy was defined in detail. Today, each factor is constructed as an index with a comprehensive documentation of the index mechanics, the risks and governance issues.

2.8.3.6 Theoretical Motivation for Factor Investing

The expression factor or risk factor arises in different theoretical situations:

- Stochastic discount factor (SDF). An SDF is the basic object in absolute and relative asset pricing. The theory used is general equilibrium asset pricing (see Chapter 3). The CAPM is such a model. If a positive SDF exists, then all assets can be uniquely priced in an economy and the expected return of any asset is proportional to the beta of the asset with the SDF. The SDF is not specified in terms of any investment strategies such as the Fama-French factors but by the preferences of the investors, see Section 3.
- Beta pricing factor models. These models explicitly assume that there is a finite number of factors (random variables), and the starting point is to assume that the expected return can be represented in terms of covariances or betas of asset returns with the factors.


- Arbitrage pricing theory (APT). APT is based on the assumption that a few major macroeconomic factors influence security returns. The influence of these factors cannot be diversified away, and therefore investors price these factors. Unlike the SDF in absolute asset pricing, where an equilibrium concept is used to price the assets, APT relies on the no-arbitrage principle, which requires much less structure. APT is a covariance matrix approach - that is, one constructs statistical factors by applying factor analysis to the covariance matrix.
The three concepts are not independent. In fact, under some conditions, they can be shown to be equivalent.
2.8.3.7 Formal Definition of Factors; Beta Pricing Model

A formal definition of risk factors is given by considering beta pricing models, where the expected returns are expressed in terms of betas of asset returns with some factors; see Back (2010) for a detailed discussion.

Definition 2.8.2 (Beta Pricing Model). Let F = (F_1, ..., F_m) be a vector of random variables, R_0 a constant, and λ an m-dimensional constant vector. There exists a multi-factor beta pricing model with factors F if for each return R:

E(R) = R_0 + ⟨λ, β⟩   (2.66)

with

β := C_F⁻¹ cov(F, R)   (2.67)

the vector of multiple regression betas of the return R on the factors F and C_F the covariance matrix of the factors. λ is called the factor risk premium. If λ > 0, an investor is compensated for holding extra risk by a higher expected return when risk is measured with the beta w.r.t. F.
Geometrically, β is the coefficient of an orthogonal projection of the return R on the space generated by the factors F plus a constant. Hence, in factor models expected returns are proportional to a best approximation (orthogonal projection) of the return on the set of factors.
If the factors F are themselves returns, then the factor risk premium becomes an ordinary risk premium λ = E(F) − R_0. To prove this, consider a one-factor model with F = R in (2.66).
If there is a risk-free asset, then R_0 = R_f. Furthermore, one can always write a beta pricing model in terms of covariances instead of betas and, without loss of generality, one can always take the factors to have zero means, unit variances, and to be mutually uncorrelated by using

F̃ := C_D⁻¹ (F − E(F))   (2.68)

with C_D the Cholesky decomposition of C_F.


This formal definition makes precise what a factor is, but it leaves aside the practically relevant issues in the financial industry such as liquidity, factor construction, transparency, and many others. The advantage of such a formal definition is that we know what we are talking about. Such a standard language is missing in the industry, where the terms risk premia, smart beta, alternative risk premia, risk factor, and factor investing are all in use and where it is often difficult to single out what is really meant.
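To make the definition concrete, the following Python sketch (not part of the text; the data are simulated and all dimensions are illustrative assumptions) estimates the betas of (2.67) and applies the standardization (2.68):

import numpy as np

# Sketch with simulated data: betas of Definition 2.8.2, beta = C_F^{-1} cov(F, R),
# and the factor standardization (2.68). T, m, n are assumed dimensions.
rng = np.random.default_rng(0)
T, m, n = 600, 3, 10                         # observations, factors, assets
F = rng.normal(size=(T, m))                  # factor realizations
R = F @ rng.normal(size=(m, n)) + 0.5 * rng.normal(size=(T, n))  # asset returns

C_F = np.cov(F, rowvar=False)                # covariance matrix of the factors
cov_FR = np.cov(F.T, R.T)[:m, m:]            # the cov(F, R) block
beta = np.linalg.solve(C_F, cov_FR)          # multiple regression betas (2.67)

L = np.linalg.cholesky(C_F)                  # Cholesky decomposition C_D
F_tilde = (F - F.mean(axis=0)) @ np.linalg.inv(L).T              # (2.68)
print(np.allclose(np.cov(F_tilde, rowvar=False), np.eye(m)))     # True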
2.8.3.8 The CAPM as a Beta Pricing Model

The capital asset pricing model (CAPM) provides us with a one-factor beta pricing model. We start with a linear regression for the asset returns. Consider, for a stock i with return R_{t,i}, the risk-free rate R_{t,f}, and the return R_{t,M} of a broad market index, the linear regression

R_{t,i} − R_{t,f} = α_i + β_{i,M}(R_{t,M} − R_{t,f}) + ε_t   (2.69)

where α_i is the intercept, β_{i,M} the slope or regression coefficient, and ε_t the normally distributed error term. The stock's excess return is the dependent variable and the market excess return the independent variable. The slope indicates the unit change in the stock's excess return for every unit change in the market excess return. The intercept indicates the performance of the stock that is not related to the market and that a portfolio manager attributes to her skills.
How accurate is the linear regression model as an estimator for the dependent variable? To answer this question, one measures the fraction of total variation in the dependent variable that can be explained by the variation in the independent variable. That is, total variation equals explained variation plus unexplained variation. For a linear regression with one independent variable, the fraction of explained variation as a percentage of total variation follows by squaring the correlation coefficient between the dependent and independent variables; this is known as R². An R² of 0.60 states that 60 percent of the variation in the return of a specific stock results from the market return, while 40 percent of the variation is unexplained by the market.
Regression coefficients. For both regression coefficients α and β a confidence interval can be determined using the estimated parameter value, the standard error of the estimate (SEE), the significance level for the t-distribution, and the degrees of freedom. The confidence interval reads, for β say, β̂ ± t_c × SEE, where β̂ is the estimated value and t_c the critical t-value at the chosen significance level.

Example

Consider as an example the linear regression between a European equity fund's returns (dependent variable) and the EUROSTOXX 50 index returns (independent variable). Statistical analysis implies for 20 observation dates the estimates β̂ = 1.18, SEE = 0.147, and 18 = 20 − 2 degrees of freedom. The critical value of the Student's t-distribution at the 0.05 significance level with 18 degrees of freedom is 2.101. This implies the confidence interval 1.18 ± (0.147)(2.101) = [0.87, 1.49]. Hence, there is only a 5 percent chance that β is either less than 0.87 or greater than 1.49.
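The arithmetic can be checked in code (a sketch, not part of the original text; scipy's t-quantile replaces the tabulated critical value):

from scipy.stats import t

beta_hat, see, n_obs = 1.18, 0.147, 20      # estimates from the example
dof = n_obs - 2
t_crit = t.ppf(1 - 0.05 / 2, dof)           # two-sided 5 percent level: 2.101
low, high = beta_hat - t_crit * see, beta_hat + t_crit * see
print(round(t_crit, 3), round(low, 2), round(high, 2))   # 2.101 0.87 1.49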
We relate this empirical approach to the unconditional equilibrium asset pricing model, the CAPM; see Section 2.9.1 for details.
The CAPM states that within the model the following cross-sectional relation has to hold (deleting time indices):

E(R_i) − R_f = β_{i,M}(E(R_M) − R_f) =: β_{i,M} λ_F.   (2.70)

Hence, the CAPM is a beta pricing model. The risk premium of asset i is E(R_i) − R_f and the market portfolio risk factor is λ_F = E(R_M) − R_f. Equation (2.70) is a cross-sectional equation where the beta is defined by the time series regression (2.69). The CAPM states that some assets have higher average returns than others, but it is not about predicting returns. An asset has a higher expected return because of a large beta and not the other way around.
Furthermore, theory implies that the beta of asset i is given as the ratio of the covariance between asset i and the market portfolio divided by the market portfolio variance:

β_{i,M} = cov(R_i, R_M) / σ²(R_M).   (2.71)
Beta represents the asset's systematic risk as discussed in Proposition 2. This risk cannot be diversified away by increasing the number of assets. Taking this risk, the investor is compensated by the market risk premium in (2.70). A large beta of a stock means that the stock is risky and has to pay a high expected return to attract an investor. The risk premium depends on the covariance between the asset's and the market's returns, but not on the volatility of the asset.
Summarizing, the time series regression (2.69) defines the β which enters the CAPM equation (2.70), and the CAPM predicts that the alpha should be zero. The time series regression is useful to understand variation over time in a given return, while the CAPM is used to understand average returns in relation to betas. Formally, for an arbitrary factor F:

Time series: R^e_{t,i} = α_i + β_i F_t + ε_{t,i};   Model: E(R^e_i) = α_i + β_i λ, with α_i = 0.   (2.72)

The CAPM triggered an enormous econometric literature that addresses the verification of (2.70). Although Black verified already in 1972 that the risk premia are not proportional to their betas, it took many more years and much more academic writing for a majority of researchers to accept the lack of empirical support for (2.70). Why did it turn out to be so difficult to test (2.70)? The answer is the intricate empirical test of the CAPM - a joint hypothesis test is needed, see the EMH discussion.
The following example summarizes some facts and pitfalls about the CAPM.
Example Facts and pitfalls
The expected excess returns vary over time within an asset class and vary also across assets at any point in time. Stocks, for example, have paid on average 6% more return than bonds for 150 years, with large temporal variations.
The time series regression (2.69) is not a forecasting regression. It can help investors to understand the variation over time in a specific return, which in turn can motivate the search for hedges to reduce the variance over time. Cross-sectional regressions are also not about forecasting returns. They tell whether the long-run average return corresponds to more risk.
Suppose that the R² is large in the cross-sectional CAPM equation (2.70). The CAPM then explains the cross-section of average returns successfully, and the alpha in the cross-section is small. But this can be the case even if the R² of the time series regression (2.69) is low (little success in explaining the time series of returns). The main goal of the CAPM is to see whether the α's are low in the time series regression such that high average returns in the cross-section are associated with high betas on the factors - are average returns high if their betas are high? The goal is not to test whether the time series does well - the R² of the time series regression is not the object of interest.

2.8.3.9 Risk Factors and Empirical Finance Model Evolution

Cochrane (2011) explains the evolution of the empirical finance approach to factor investing as research uncovering the unexplained alpha over time. The CAPM replaced the chaos of

E(R_i) − R_f = α_i   (2.73)

by the corresponding equation in which no unexplained alpha was left: expected returns of assets or portfolios should line up as a function of their market betas only. But chaos, or alpha, was back; so

E(R_i) − R_f = α_i + β_{i,M}(E(R_M) − R_f)

with a non-zero alpha followed. Consider portfolios which are sorted by book-to-market (B/M) ratio or by size and B/M ratio, see Figure 2.22.

Figure 2.22: Violation of the CAPM by B/M and size sorted portfolios (axes: expected returns versus mean returns, with the CAPM line shown). The dots represent the 25 size and B/M sorted portfolios. FF sort stocks into five market cap and five book-to-market equity (B/M) groups at a specific date. The sorting algorithm ranks the assets into five groups using the percentiles of the B/M values. A similar five-group sort applies to size. This leads to 25 value-weighted size-B/M portfolios.

The figure shows that the sorted portfolios scatter around the CAPM line and that interpolating between the sorted portfolios yields a line which is too flat compared to the CAPM line. This motivated Fama and French (FF) in 1992 to add two risk factors to the market risk factor of the CAPM: the value factor (HML) and the size factor (SMB). They measured the historic excess returns of small caps and value stocks over the market as a whole. FF argued that the CAPM worked well for size-, industry-, and beta-sorted portfolios and others, but failed to do so for value-sorted portfolios. Stocks with high B/M (value stocks) provide high average returns and should therefore, under the CAPM, have high betas. This joint behavior is key in beta pricing models. A low or high expected return per se is never a puzzle, but it becomes one if the betas do not match the return observations. But FF observed that the betas are not large for the high expected return securities. They even have the wrong sign - betas are lower for higher-return securities. This observation led FF to introduce the two new factors both in the time series regression and to consider the cross-sectional


implications. FF proposed

E(R_i) − R_f = β_{i,M}(E(R_M) − R_f) + β_{i,SMB} E(R_{SMB}) + β_{i,HML} E(R_{HML})   (2.74)

together with the corresponding time series regression for the excess return as in the CAPM.
Continuing with the evolution of models, some years later alpha was back again; that is, an alpha is added to the Fama-French equation (2.74). Then, in 1997, Carhart added a further factor, momentum (WML) - that is to say, β_{i,WML} E(R_{WML}) is added to the Fama-French equation (2.74).
Then the story continued with alpha back again, etc., and it has not ended yet. How is this evolution of risk factor representations related to the investors' preferences? The fundamental equation of asset pricing in Chapter 3 will provide an answer.
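To illustrate how the betas in (2.74) are obtained in practice, the following sketch runs the corresponding time series regression by OLS; all inputs (sample size, factor values, true betas) are simulated assumptions, not estimates from the text:

import numpy as np

# Illustrative OLS estimation of alpha and three factor betas from the
# time series regression corresponding to (2.74); data are simulated.
rng = np.random.default_rng(1)
T = 240                                     # 20 years of monthly data (assumed)
factors = rng.normal(0.0, 0.04, size=(T, 3))   # columns: market, SMB, HML
true_beta = np.array([1.1, 0.4, 0.6])
excess_ret = 0.001 + factors @ true_beta + rng.normal(0.0, 0.02, T)

X = np.column_stack([np.ones(T), factors])  # first column carries the alpha
coef, *_ = np.linalg.lstsq(X, excess_ret, rcond=None)
print(round(coef[0], 4), np.round(coef[1:], 2))   # alpha, betas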
2.8.3.10 Risk Factors: Facts and Fantasies

Despite the attractiveness of factor investing, one has to carefully distinguish between facts, fantasies, and marketing. If one introduces a new risk factor for each asset characteristic, the whole concept of factor analysis becomes an ad hoc procedure.
Harvey et al. (2015), who test the suitability of risk factors (see Chapter 3), catalogue 316 risk factors. This is, as Cochrane (2011) puts it, indeed a zoo of new factors. The investor may be lost when faced with this proliferation of factors. Can we identify true risk factors and distinguish them from mere distortions such as data snooping, non-persistent anomalies, and stylized facts (simplified presentations of empirical findings)? How many of the factors are indeed persistent? What is an appropriate statistical approach to test for the significance of factors?
To answer the last question, one needs to apply different methods than the standard "t-ratio greater than 2" tests, which are insufficiently robust to distinguish between significant and insignificant factors. Harvey et al. (2015) state:
"Hundreds of papers and hundreds of factors attempt to explain the cross-section of expected returns. Given this extensive data mining, it does not make any economic or statistical sense to use the usual significance criteria for a newly discovered factor, e.g., a t-ratio greater than 2. [...] Echoing a recent disturbing conclusion in the medical literature, we argue that most claimed research findings in financial economics are likely false."
We consider this problem of false discoveries in Section 3.5.2.
Investors interested in putting their money in risk factors face a potpourri of facts and fantasies. Cazalet and Roncalli (2014) state the following facts:
- Common risk factors explain more variance than idiosyncratic risks in diversified portfolios.


- Some risk factors are more relevant than others.
- Risk premia are time-varying and low-frequency mean-reverting. The length of a cycle is between 3 and 10 years.
- The explanatory power of risk factors other than the market risk factor has declined over the last few years because beta has been back since 2003.
- Long-only and long/short risk factors do not exhibit the same behavior.
- Risk factors are local, not global: the value factors in the US and Japan cannot be compared.
- Factor investing has been widely used by asset managers and hedge fund managers for a long time.
They also state some fantasies:
- There are many rewarded risk factors.
- Risk factors are not dependent on size. (In fact, some risk factors, like the HML risk factor, present a size bias.)
- Value is much more rewarded and riskier than momentum.
- Long-only risk factors are riskier than long/short risk factors.
- Strategic asset allocation with risk factors is easier than strategic asset allocation with asset classes.
2.8.3.11 Risk Factor Allocation

Several aspects determine how risk factors should be allocated. First, by construction risk factors should show weak correlations in normal and stressed markets. This strongly suggests that any short-term discretionary interventions should be excluded. Furthermore, any rebalancing of the portfolio weights should take place over a time period where short-term fluctuations are no longer influential. Typically, rebalancings take place quarterly or even semi-annually. Second, some factors are pro-cyclical with respect to the business cycle while others are historically defensive or not related to the business cycle. Pro-cyclical are value, growth, momentum, size, and liquidity. Defensive, or of low volatility, are factors exploiting volatility, yield, and quality. This suggests that there should be discretionary control over which factors are included in the investment portfolio. Given the periodicity of the cyclical behavior, such control should take place on an annual or even bi-annual basis.


2.8.3.12 Factor-Based Asset Allocation vs. Asset-Class-Based Asset Allocation

Sources for this section are Idzorek and Kowara (2013) and Martellini and Milhau (2015). It is a widely documented fact that the pairwise correlations among risk factors are often lower than those among asset classes. Does this imply that risk factors are superior to asset classes?
Idzorek and Kowara (2013) first provide an answer in an idealized world where the number of risk factors equals the number of asset classes and unconstrained mean-variance optimization is considered. The same dimensionality of asset classes and risk factors implies a one-to-one relationship, and then, unsurprisingly, the returns are the same.
The authors then consider a real-world example. They focus on liquid US asset classes and risk factors. The number of risk factors (eight) is not equal to the number of asset classes (seven). The data set consists of monthly data from January 1979 to December 2011. They first confirm that the average pairwise correlation is 0.06 for the risk factors and 0.38 for the asset classes. A main reason is that the market portfolio is part of the asset classes but not of the risk factors. The authors then consider two different time horizons to derive the optimal allocations: once they use the full time series, and in the second case they start in January 2002 and end in December 2011.
Figure 2.23 illustrates the findings for nonnegative weights which add up to 1. The risk factor weights define a lower-dimensional space than the asset class weights since there are more constraints for the risk factors - by construction, risk factors are long/short combinations. This lower dimensionality seems, ex ante, to favor the asset classes. But it is in fact not possible to state which opportunity set is larger, since the exposures to risk factors can be negative, which is excluded for the asset classes, which can be held long only. Summarizing, the opportunity sets are complex high-dimensional subspaces of the total asset space where it is in general not possible to find out which set is larger.
The results indicate that by cherry-picking a particular historical time period, almost any desired result can be found. This illustrates that there is nothing obvious about the superiority of asset allocation based on risk factors. This result does not depend on the fact that historical data are used (Idzorek and Kowara (2013)).

2.8.4 Optimal Portfolio Construction: Markowitz

Markowitz stated, in 1952, the principle: "The investor should consider expected return a desirable thing and variance of return an undesirable thing."
To operationalize this principle, the objective of the investor is to select portfolios according to the mean-variance criterion:
1. Either the investor chooses a portfolio to maximize the expected return such that the volatility cannot exceed a predefined level, or


Figure 2.23: Optimal asset classes versus optimal risk factors. Left panel: Long time
series. Right panel: Short time series. The US asset classes are large value stocks, large
growth stocks, small value stocks, small growth stocks, Treasuries, mortgage backed
assets, credit and cash. The risk factors are market, size, value, mortgage spread, term
spread, credit spread and cash (Idzorek and Kowara [2013]).

2. Volatility is minimized such that the expected return cannot be lower than a predefined level r.

The solutions of these two problems are equivalent. They are parametrized by the predefined levels for the volatility and the return. The main conclusion is that a diversified portfolio allows investors to increase expected returns while reducing risk compared to less diversified investments.
A third method is to use the quadratic utility function (2.55): the solution of the mean-variance utility optimization (2.55) is equivalent to the two criteria above and is parametrized by the risk aversion. All three problem formulations are smooth convex quadratic optimization problems which possess a unique solution that can be explicitly calculated using calculus.
We recall that portfolio formation is an essential means of complexity reduction in decision making. Focusing on the portfolio return and portfolio variance, two figures alone capture the risk and return information. But to calibrate the model, a large number of returns and correlations still has to be considered.


2.8.4.1 Motivation: The Two-Asset Case

Consider two assets X, Y with expected returns μ_X > μ_Y and volatilities σ_X > σ_Y, respectively. The expected portfolio return, where the two assets enter with weights φ_X, φ_Y,

μ_p := E[R] = Σ_{j=X,Y} φ_j μ_j,

is additive. But portfolio risk, the variance, is not additive:

σ_p² = Σ_{j=X,Y} φ_j² σ_j² + 2 φ_X φ_Y ρ_{X,Y} σ_X σ_Y.   (2.75)

If we plot the expected portfolio return and the portfolio standard deviation in the (σ, μ) portfolio space, different portfolios correspond to different points. We start with the two portfolios B = (100%, 0%) and A = (0%, 100%) shown in Figure 2.24.

Figure 2.24: Portfolio frontiers in the two-asset case. The portfolio opportunity set is a
hyperbola in the portfolio coordinates expected return and standard deviation.
What can be said about general portfolios - that is, where a fraction θ of wealth is invested in A and 1 − θ in B? Solving the mean-variance optimization problem shows:
- The portfolio opportunity set is a hyperbola in the (σ, μ) portfolio coordinates (line 3). It is maximally bowed for perfect negative correlation. In general, the lower the correlation, the higher the gains from diversification.


- For perfect positive or negative correlation the hyperbola degenerates to straight lines - equation (2.75) for the portfolio standard deviation becomes a linear function of the strategy. The straight line 1 between A and B represents all possible portfolio choices if there is perfect positive correlation, ρ = +1. Similarly, for perfect negative correlation all portfolios' expected returns and standard deviations lie either on the straight line 2a or 2b. In the presence of perfect negative correlation we can fully eliminate the portfolio risk while having long positions in both assets (point C). In such a setting, asset A is a perfect hedging instrument for asset B (and vice versa).
The following definitions are used.
Definition 2.8.3.
1. If a portfolio offers a larger expected return than another portfolio for the same risk, then the latter portfolio is strictly dominated by the first one.
2. Portfolios that are not strictly dominated by another one are called mean-variance efficient or minimum variance portfolios. The set of these portfolios forms the efficient frontier.
3. The portfolio at point D is the global minimum variance (GMV) portfolio - that is, the portfolio attaining the minimal variance in the set of all efficient portfolios.
The straight lines 1 and 2 are efficient frontiers. For non-perfect correlation, the hyperbola is only efficient between the points D and B.
2.8.4.2 Many Risky Assets

The two-asset case generalizes to many assets. The assumptions of the Markowitz model are:
1. There are N risky assets and no risk-free asset. Prices of all assets are exogenously given.
2. There is a single time period. Hence, any intertemporal behavior of the investors cannot be modelled.
3. There are no transaction costs. This assumption can easily be relaxed nowadays since a Markowitz model with transaction costs can be solved numerically.
4. Markets are liquid for all assets. This assumption, which also essentially simplifies the analysis, is much more demanding to remove than the absence of transaction cost restrictions.
5. Assets are infinitely divisible. Without this assumption, we would have to rely on integer programming in the sequel.


6. If borrowing and lending are excluded, then full investment holds, i.e. ⟨e, φ⟩ = 1 with e = (1, ..., 1)' ∈ R^N.
7. Portfolios are selected according to the mean-variance criterion.
8. The vectors e, μ are linearly independent. If they are dependent, then the optimization problem does not have a unique solution.
9. All first and second moments of the random variables exist. If this does not hold, then the mean and covariance are not defined and the whole optimization program is not defined.
Proposition 2.8.4. We define a = ⟨μ, C⁻¹μ⟩, b = ⟨e, C⁻¹e⟩, c = ⟨e, C⁻¹μ⟩, Δ = ab − c², and

A = ( a  c )
    ( c  b ).

Consider N risky assets and the above assumptions. Then the Markowitz problem defined by

(M)   min_{φ ∈ R^N} (1/2)⟨φ, Cφ⟩   s.t.   ⟨e, φ⟩ = 1,  ⟨μ, φ⟩ = r   (2.76)

has a unique solution

φ_MV = r φ₁ + φ₂   (2.77)

with

(φ₁, φ₂)' = A⁻¹ (C⁻¹μ, C⁻¹e)',   i.e.   φ₁ = (b C⁻¹μ − c C⁻¹e)/Δ,  φ₂ = (a C⁻¹e − c C⁻¹μ)/Δ.   (2.78)

The proof using calculus is given in the exercises. Hence, the portfolio weights are linear in the expected portfolio return r. Inserting φ_MV into the variance yields the optimal minimum portfolio variance σ_p²-hyperbola:

σ_p²(r) = ⟨φ*, Cφ*⟩ = (1/Δ)(r² b − 2rc + a).   (2.79)

The strategy φ_MV provides us with all dominant portfolios and hence the efficient frontier. As in the two-asset case, the mean-variance frontier is a hyperbola in the (σ(r), r) portfolio coordinates. Diversification in the mean-variance model means that adding more and more assets causes the efficient frontier to widen: for the same risk, a higher expected return follows (see Figure 2.25).


Figure 2.25: Different efficient frontiers for different numbers of assets. It follows that
adding new assets allows for higher expected return for a given risk level (measured by the
portfolio standard deviation - Stdev). The portfolio with the lowest standard deviation
is the global minimum variance (GMV) portfolio (Ang [2012]).
2.8.4.3 Geometry of the Markowitz Problem

We discuss the geometric interpretation of the Markowitz problem, see Luenberger (2014) and Rambaud et al. (2009). We consider the Markowitz model in the form where the goal is to minimize the portfolio variance for N risky assets under the full investment and desired expected return constraints, i.e. problem (2.76). The two constraints define two subspaces S_i, i = 1, 2, in R^N, each of dimension N − 1. S₁ is the plane of all vectors φ ∈ R^N which satisfy the return constraint ⟨μ, φ⟩ = r (see Figure 2.26). The intersection S = S₁ ∩ S₂ defines the feasible set of dimension N − 2. We define U₁ as the space of all vectors orthogonal to the expected return vector:

U₁ = {y ∈ S₁ | ⟨y, μ⟩ = 0},

and similarly U₂ ⊂ S₂, the space of all vectors orthogonal to the unit vector e, is defined.
Therefore, the Markowitz problem is to find a portfolio φ_MV ∈ S with minimum value ⟨φ, Cφ⟩. This is equivalent to finding the point x ∈ S which has minimum distance to the origin. It therefore suffices to find a point x ∈ S such that the vector from the origin to x equals φ_MV and the vectors y − φ_MV are orthogonal to φ_MV for all points y ∈ S, see Figure 2.26.


Figure 2.26: Geometry of the Markowitz problem for N = 3 assets: the feasible set S = S₁ ∩ S₂ and the solution φ_MV (Rambaud et al. [2009]).
Summarizing, the Markowitz solution is the orthogonal projection of any portfolio on the feasible set S. This is the same as the intersection of S with the plane spanned by two vectors φ_{1,⊥}, φ_{2,⊥} orthogonal to U₁ and U₂, respectively. To find these two vectors we introduce the inner product induced by the variance-covariance matrix of the risky assets:

⟨φ, ψ⟩_C := ⟨φ, Cψ⟩,

which is a reasonable definition since the goal is to minimize the portfolio variance ⟨φ, Cφ⟩. Then, the vector φ_{1,⊥} = C⁻¹μ is orthogonal to U₁ since for all ψ ∈ U₁:

⟨ψ, C⁻¹μ⟩_C = ⟨ψ, CC⁻¹μ⟩ = ⟨ψ, μ⟩ = 0

by definition of U₁. In the same way one shows that φ_{2,⊥} = C⁻¹e is orthogonal to U₂. Since the solution of the problem is an orthogonal projection, the vector φ_MV is a linear combination of the two orthogonal vectors found, i.e.

φ_MV = α₁ φ_{1,⊥} + α₂ φ_{2,⊥}.   (2.80)

The parameters α_i are found by inserting the above combination into the two constraints defining S. Solving this system provides the solution of the Markowitz problem, i.e. (2.80) becomes equal to (2.77). This concludes the derivation of the Markowitz solution using elementary geometry.


Example
Consider three assets with expected returns (20%, 30%, 40%) and covariance matrix

C = ( 0.10  0.08  0.09 )
    ( 0.08  0.15  0.07 )
    ( 0.09  0.07  0.25 ).

We assume that the investor requires a minimum return of r = 30%. He could simply invest fully in asset 2 to achieve this return goal. But the optimization shows that the optimal strategy attains the return target with lower risk. The optimal strategy is

φ_MV = (0.28, 0.43, 0.28)'.

The investor is fully invested and long in all assets. The risk (variance) of the optimal portfolio is 10.7 percent, which is less than the 15 percent of investing only in the second asset. We compare the Markowitz portfolio with the equally weighted (EW) portfolio and the risk-parity portfolio of inverse variance (IV) - that is to say, the investment in each asset is inversely proportional to its variance. We get

φ_MV = (0.28, 0.43, 0.28)',  φ_EW = (0.33, 0.33, 0.33)',  φ_IV = (0.48, 0.32, 0.19)'.

The MV strategy considers variances and covariances, the equally weighted strategy does not consider them at all, and the risk-parity strategy only considers variances - that is to say, only part of the investment risk. The statistics for the three strategies are:

Strategy   Expected return   Portfolio risk
MV         35.7%             10.7%
EW         29.7%             10.6%
IV         26.8%             9.6%
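The closed-form solution (2.77)-(2.78) can be verified numerically; the following sketch (an illustration, not part of the text) reproduces the weights and the variance of the example:

import numpy as np

# Closed-form Markowitz solution (2.77)-(2.78) for the three-asset example.
mu = np.array([0.20, 0.30, 0.40])
C = np.array([[0.10, 0.08, 0.09],
              [0.08, 0.15, 0.07],
              [0.09, 0.07, 0.25]])
e, r = np.ones(3), 0.30                     # full investment, return target

Ci_mu, Ci_e = np.linalg.solve(C, mu), np.linalg.solve(C, e)
a, b, c = mu @ Ci_mu, e @ Ci_e, e @ Ci_mu
delta = a * b - c ** 2
lam, gam = (b * r - c) / delta, (a - c * r) / delta   # from A^{-1}(r, 1)'
phi = lam * Ci_mu + gam * Ci_e
print(np.round(phi, 2))                     # [0.28 0.43 0.28]
print(round(phi @ C @ phi, 4))              # ~0.1075, the 10.7 percent above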

Example
An investor has mean-variance preferences if he optimizes a quadratic utility function where the returns of the assets have an arbitrary return distribution. But there are other types of investors who act as if they also had mean-variance preferences. If the payoffs of the risky assets are multivariate normally distributed, then an investor who does not necessarily have a quadratic utility function will also rank the portfolios based on the mean and the variance. The reason is that normal distributions are fully characterized by their mean and variance.

Example
Consider the case of two assets with expected returns μ₁ = 1 and μ₂ = 0.9, respectively. The covariance structure is given by

C = (  0.10  −0.10 )
    ( −0.10   0.15 ).

Asset 1 seems more attractive than asset 2: it has a higher expected return and lower risk. Naively, one would invest fully in the first asset. But it turns out that the negative correlation makes an investment in asset 2 necessary to obtain an optimal allocation. The expected return constraint is set equal to r = 0.96. We consider the following strategies:
- φ₁ = (1, 0), full investment in asset 1.
- φ₂ = (1/2, 1/2), an equal distribution.
- φ₃ = (5/9, 4/9), the optimal Markowitz strategy without the expected return constraint.
- φ_MV = (0.6, 0.4), the optimal Markowitz solution with the expected return constraint.
The following expected portfolio returns and risks hold for the different strategies:

Strategy   μ_P     σ²_P
φ₁         1       0.1
φ₂         0.95    0.0125
φ₃         0.955   0.011
φ_MV       0.96    0.012

Although φ₁ satisfies the expected return condition, its risk is much larger than that of strategy φ₂, which in turn does not satisfy the expected return condition. The risk of φ₃ is minimal, but its return is smaller than required. Therefore, 40 percent has to be invested in the seemingly unattractive asset to obtain the optimal solution. This is the Markowitz phenomenon: to reduce the variance as much as possible, a combination of negatively correlated assets should be chosen.
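The numbers in the table can be verified directly (an illustrative check with the inputs of this example; values match up to rounding):

import numpy as np

# Expected return and variance of the four strategies of the two-asset example.
mu = np.array([1.0, 0.9])
C = np.array([[0.10, -0.10],
              [-0.10, 0.15]])
for w in [(1.0, 0.0), (0.5, 0.5), (5 / 9, 4 / 9), (0.6, 0.4)]:
    w = np.array(w)
    print(np.round(w @ mu, 3), np.round(w @ C @ w, 4))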


Example Projections
We extend the geometric view of the Markowitz problem. Consider a sequence of portfolios φ₁, ..., φ_k which are linearly independent. The following Gram-Schmidt algorithm allows us to construct a new sequence of k portfolios ψ_m, m = 1, ..., k, which are uncorrelated. The first step in the algorithm is to define ψ₁ = φ₁. The second portfolio ψ₂ has to be orthogonal to ψ₁. This is achieved if we project φ₂ on the orthogonal complement of the space spanned by ψ₁. Formally, if P_y(x) denotes the orthogonal projection of a vector x on a vector y, then

P_y(x) = (⟨y, x⟩ / ⟨y, y⟩) y.   (2.81)

Note that a linear mapping P is an orthogonal projection of a real vector space if P² = P and P' = P. The first condition means that projecting an already projected vector on the same space has no impact, and the second condition assures the orthogonality of the projection. If we project x on a subspace spanned by some vectors y₁, ..., y_n, then the right-hand side of (2.81) is replaced by a summation over the individual projections. If we denote by y^⊥ the orthogonal complement of y, we always have

P_y(x) + P_{y^⊥}(x) = x,   P_y + P_{y^⊥} = 1.   (2.82)

Therefore,

ψ₂ = P_{ψ₁^⊥}(φ₂) = φ₂ − P_{ψ₁}(φ₂) = φ₂ − (⟨ψ₁, φ₂⟩ / ⟨ψ₁, ψ₁⟩) ψ₁.

This is continued for the next vector by projecting the third vector on the orthogonal complement of the first two orthogonal vectors, etc.
The coefficients ⟨y, x⟩/⟨y, y⟩ in the orthogonal projection have a well-known financial interpretation. We recall that for two square-integrable random variables x, y with zero mean the inner product is defined by the expected value

⟨x, y⟩ = E(xy).   (2.83)

But then

⟨x, x⟩ = E(x²) = σ²(x),   (2.84)

⟨x, y⟩ = E(xy) = cov(x, y).   (2.85)

Hence, the projection coefficient is a beta:

P_y(x) = β_{x,y} y.   (2.86)
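A small sketch (simulated return series; parameters are illustrative assumptions) of the Gram-Schmidt construction: each subtraction uses the regression beta of (2.86), and the resulting series are uncorrelated.

import numpy as np

# Gram-Schmidt on demeaned return series; the projection coefficient
# <y, x>/<y, y> is the beta of (2.86).
rng = np.random.default_rng(2)
T = 500
X = rng.normal(0.0, 0.02, size=(T, 3))
X[:, 1] += 0.5 * X[:, 0]                    # introduce correlation
X[:, 2] += 0.3 * X[:, 0] + 0.4 * X[:, 1]
X -= X.mean(axis=0)                         # demean, so <x, y> = cov(x, y)

psi = []
for j in range(X.shape[1]):
    v = X[:, j].copy()
    for u in psi:
        v -= (u @ v) / (u @ u) * u          # subtract beta * u
    psi.append(v)
print(np.round(np.corrcoef(np.array(psi)), 2))   # ~ identity: uncorrelated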


Example Conditional expectation as orthogonal projection

Conditional expectation is an important example of an orthogonal projection. Let X, Y be square-integrable random variables and let F_t be the information set at time t. We consider E(X|F_t), which is a random variable. The law of iterated expectations states:

E(E(X|F_t)) = E(X).

Let Y be F_t-measurable - Y is known at time t. But then

Y E(X|F_t) = E(Y X|F_t),

and therefore,

E(Y E(X|F_t)) = E(Y X).

This last equation reads equivalently:

E((X − E(X|F_t)) Y) = 0.   (2.87)

Therefore, the random variable X − E(X|F_t) is orthogonal to all random variables Y which are measurable with respect to F_t, and E(X|F_t) is the orthogonal projection of X on this set. Since the orthogonal projection has the minimum distance to this set, it is the best approximation of X given the information set at time t.
Consider the martingale property in (2.18), E^Q[R_{t+s}|F_t] = R_t. This shows that martingales are best predictions of future returns given a present information set and that the best prediction is equal to the presently known return. Geometrically, a martingale is an orthogonal projection of any future return on the present information set, with the result that the present return follows.
How can we characterize the efficient frontier in terms of the expected returns, variances, and covariances of the returns? The impact of one more unit of return in asset k on the optimal variance equals the covariance of asset k with the minimum variance portfolio. If asset k is positively correlated with the portfolio, one more unit of return of this asset increases the variance. The opposite holds if the correlation is negative.
2.8.4.4 Mutual Fund Theorem

An important result is the so-called mutual fund theorem.


Proposition 2.8.5. Any minimum variance portfolio can be written as a convex combination of two distinct minimum variance portfolios.
Formally, if φ_MV(r) is any optimal minimum variance portfolio, then for any two other distinct minimum variance portfolios φ₁, φ₂ there exists a λ(r) such that

φ_MV(r) = λ(r) φ₁ + (1 − λ(r)) φ₂.   (2.88)

In other words, the entire mean-variance frontier can be generated from just two distinct portfolios. This result follows from the geometric fact that the efficient frontier is a one-dimensional affine subspace of R^N. The mutual fund theorem allows investors to generate an optimal portfolio by searching for cheaper or more liquid portfolios and investing in these portfolios in the prescribed way.
2.8.4.5 Markowitz Model with a Risk-Free Asset

So far all assets in the Markowitz problem have been assumed to be risky. If we assume that one asset is risk-free and the other ones are risky, the whole optimization program of Markowitz can be repeated. Many properties of the risky-asset case carry over to the case with a risk-free asset.
But the efficient frontier becomes a straight line. This follows at once if one considers the two-asset case with a single risky asset and the risk-free asset. The straight line has to have at least one point in common with the efficient frontier where all assets are risky. This is the case if the optimal strategy is to invest zero wealth in the risk-free asset. The portfolio where the two frontiers intersect is the tangency portfolio T (see Figure 2.27, left panel).
Therefore, natural candidates for the mutual fund theorem are the tangency portfolio and the risk-free asset investment. In the right panel of Figure 2.27, the investment situation is shown if there are bonds, stocks, and cash with the corresponding risk and return properties. A mean-variance investor chooses a portfolio on the straight-line efficient frontier. The investors on this line can add cash to become more conservative or borrow cash for an aggressive investment. But none of them will alter the relative proportions of the risky assets in the tangency portfolio. The following proposition summarizes.
Proposition 2.8.6. All assumptions of Proposition 2.8.4 hold. There is one risk-free asset with return R_f besides the N risky assets; let φ₀ denote the weight of the risk-free asset. The optimization problem then reads

min (1/2)⟨φ, Cφ⟩   s.t.   ⟨e, φ⟩ = 1 − φ₀,  ⟨μ, φ⟩ = r − R_f φ₀.   (2.89)

The model has exactly one solution

φ* = λ C⁻¹(μ − R_f e),   λ = (r − R_f)/(a − 2 R_f c + R_f² b) =: (r − R_f)/σ_R².

The minimum variance portfolio (tangency portfolio) with zero investment in the risk-free asset (φ₀ = 0) is given by

φ_T = (1/(c − R_f b)) C⁻¹(μ − R_f e) = C⁻¹μ_e / ⟨e, C⁻¹μ_e⟩   (2.90)


Figure 2.27: Mean-variance model with a risk-free asset. Left panel - the straight-line efficient frontier, which is tangential to the efficient frontier when there are risky assets only. The tangency point T is the tangency portfolio, where the investment in the risk-free asset is zero. Right panel - investors' preferences on the efficient frontier. Moving from the tangency portfolio to the right, the investor starts borrowing money to invest in the risky assets. The investor is short cash in this region to finance the borrowing amount.
with μ_e := μ − R_f e the excess return vector. The locus of minimum variance portfolios is given by

σ(r) = (r − R_f)/σ_R.   (2.91)

2.8.4.6 Tangency Portfolio, Capital Market Line

The efficient frontier is also called the capital market line (CML). The meaning of the initials CML will become clear when we discuss the CAPM. Geometry implies that the mean μ_p and standard deviation σ_p of any efficient portfolio can be written in the form

μ_p = R_f + ((μ_T − R_f)/σ_T) σ_p

with μ_T, σ_T the expected return and the standard deviation of the tangency portfolio, respectively. The slope of the CML, the Sharpe ratio (μ_T − R_f)/σ_T, is the price of risk of an efficient portfolio.


Example
Suppose that the expected return of the tangency portfolio is 12%, the risk-free rate is 2%, and the standard deviation of the tangency portfolio is 2%. The slope of the CML is then (12% − 2%)/2% = 5, and the expected return for a portfolio on the CML with a standard deviation of 1% is μ_p = 2% + 5 × 1% = 7%.
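In code form (a trivial check with the example's numbers; the variable names are ours):

# CML: mu_p = R_f + (mu_T - R_f) / sigma_T * sigma_p
Rf, mu_T, sigma_T, sigma_p = 0.02, 0.12, 0.02, 0.01
print(Rf + (mu_T - Rf) / sigma_T * sigma_p)   # 0.07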
The choice of the point on the CML in the right panel of Figure 2.27 depends on the investor's preferences in (2.55). The higher the risk aversion, the closer the chosen point on the CML lies to the risk-free investment. Ang (2012) estimates the risk aversion parameter as follows. He first calculates the optimal minimum variance portfolio using USA, JPN, GBR, DEU, and FRA risky assets only. Then he adds a risk-free asset and searches for the point on the CML that delivers the highest utility. This point implies a risk aversion of 3. The optimal portfolio with a risk-free asset lies in the region of Figure 2.27 where the aggressive investor is shown. This means that the investor is long all risky assets and short the risk-free asset.
But in reality, only half of all investors invest their money in the stock market, and the rest keep their money risk-free. In some European countries stock market participation is lower than 10 percent. This is the non-participation puzzle of mean-variance investing.
2.8.4.7 Mean-Value-at-Risk Portfolios, Uncertainty

One critique of the mean-variance criterion for optimal portfolio selection concerns the variance as a symmetric risk measure: why penalize the upside in portfolio selection? Also, the variance is not seen as a true measure of risk since it fails to detect the states that reflect stress situations. Academic research has defined many other risk measures, including value-at-risk (VaR), expected shortfall, and stressed VaR.

Example Stress periods for the Swiss stock market

Table 2.13 reports data about periods when the Swiss stock market faced stress. Besides the maximum drawdown, the periods of falling and of rebounding prices are shown. The last two periods represent the dot-com bubble and the global financial crisis, respectively. A pattern, which is also observed in other markets, is that on average it takes longer for the markets to recover than to drop. A second observation is the severity of the maximum drawdowns. Therefore, for a mean-variance investor timing is an issue even if the assets are diversified. This also illustrates the evaporation of diversification in times of market stress - correlations become close to 1 - even for an optimal portfolio choice.


Period      Low    MDD %   yfp    yrp
1928-1941   1935   41.3    7      6
1961-1968   1966   37.5    5      2
1972-1979   1974   47.2    2      5
1989-1992   1990   20.2    1      2
2000-2005   2002   42.3    2      3
2007-2013   2008   34.1    2      5
Average            36      2.86   3.57

Table 2.13: Periods involving large drawdowns in Swiss equity markets. The drawdown measures the decline from a historical peak. The maximum drawdown (MDD) up to time T is the maximum of the drawdown over the whole time period considered; yfp means years with falling prices and yrp years with rising prices (Kunz [2014]).
We now consider mean-VaR portfolio optimization instead of mean-variance optimization. VaR(a) is the minimum dollar amount an investor can lose with a confidence of 1 − a for a given holding period over which the portfolio is not changed (see also Section 2.5.5.1). If the portfolio returns are normal, VaR(a) in return terms reads

VaR(a) = k(a) σ + μ,

where μ is the portfolio return, σ the volatility of the portfolio return, and k(a) a tabulated function of the confidence level 1 − a; the dollar amount follows by multiplying with the position value.

Example VaR
Consider a position with a value of USD 1 million. Assuming normality of returns, the goal is to calculate the one-day VaR at the 95 percent level. The estimated daily mean is 0.3 percent and the volatility is 3 percent. The confidence level function k(a) has the value 1.6449 and

VaR(a) = (1.6449 × 0.03 + 0.003) × USD 1mn = USD 52,347.

Therefore, on average on 1 out of 20 days the loss is larger than the calculated VaR of USD 52,347.
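The same calculation in code (a sketch following the book's convention; scipy's normal quantile replaces the tabulated k(a)):

from scipy.stats import norm

# Parametric (normal) one-day VaR of the example position.
position, mu, sigma, a = 1_000_000, 0.003, 0.03, 0.05
k = norm.ppf(1 - a)                      # ~1.6449 at the 95 percent level
var = (k * sigma + mu) * position
print(round(k, 4), round(var))           # 1.6449, ~52,346 (text rounds k, giving 52,347)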
The above formula shows that under normality VaR is proportional to the volatility. This translates into the optimization problem: one can simply use the mean-variance approach instead of the mean-VaR approach by rescaling the volatility. Therefore, if returns are normal, not much is gained by replacing the variance with the value-at-risk in portfolio management.


The above comment that not much is gained by using mean-VaR instead of mean-variance also extends to the issue of adding uncertainty to the mean-variance model. Such model extensions are mostly of the same type: the probability distribution of the returns is not known, and one often assumes that nature selects the worst probability distribution for the investor. Therefore, these models essentially shift the optimal investment toward being more risk-free than without uncertainty. This is then equivalent to allowing for a larger risk aversion in the Markowitz model. Rebonato and Denev (2013) discuss this issue in more detail.

Example Normality vs. non-normality

We have always assumed above that returns are normally distributed. How accurate is this assumption? Mandelbrot wrote in 1963 about the Brownian motion model (that is, the normal distribution assumption) that it does not account for the abundant data accumulated since 1900 ... simply because the empirical distribution does not fit the assumption of normality.
In 1964, Paul Cootner (MIT-Sloan) added: Mandelbrot, like Prime Minister Winston
Churchill before him, promises us not utopia but blood, sweat, toil and tears. If he is
right, almost all of our statistical tools are obsolete ... Surely, before assigning centuries
of work to the ash pile, we should like to have some assurance that all our work is truly
useless.
The use of the normal distribution boosted, with the work of Black and Scholes, the whole derivatives industry after 1973, although it was clear from the beginning that the model violated some key observations in the derivative markets. Most prominent is the model's assumption of constant volatility, which is not observed in reality.
A second type of model whose elegance is based on the normal distribution assumption, and which became famous, is based on Li's 2000 paper "On Default Correlation: A Copula Function Approach" for the pricing of collateralized debt obligations (CDOs). The simplicity and elegance of Li's formula allowed it to be used for a wide range of CDO pricing applications. Li himself wrote in 2005: "However, there is little theoretical justification of the current framework from financial economics. ... We essentially have a credit portfolio model without solid credit portfolio theory." The disaster of the GFC then showed that the model is fundamentally flawed, leading in 2009 to articles with titles such as "The Formula that Killed Wall Street".
The CDO example highlights that the normal distribution is not adequate for modelling situations where extreme events are important. More precisely, measuring the co-association between securities using correlation is not meaningful since correlation is not predictable; the correlations between financial quantities are notoriously unstable.


This shows that in many applications risks and returns are not normally distributed. Nevertheless, it is still common to work in a normal setup for educational reasons: closed-form analytical solutions are often lost when using non-normal distributions, and therefore basic economic insights are less transparent. Other reasons why normal distributions are still used in asset management are (i) the focus on long time horizons and (ii) the fact that many other types of model risk or uncertainty also impact, for example, portfolio construction.

2.8.4.8 Comparing Mean-Variance Portfolios with Other Approaches

We follow Ang (2012). Consider four asset classes - Barcap US Treasury (US government bonds), Barcap US Credit (US corporate bonds), S&P 500 (US stocks), and MSCI EAFE (international stocks) - for the period 1978 to 2011. The different portfolios are chosen monthly, and the past five years of data are used for the estimated parameters. The following strategies are compared:
- Mean-variance (MV).
- Market weights, which are given by the market capitalizations of each index.
- Diversity weights, which are transformations of market weights using entropy as a measure of diversity.
- Equal capital weights (EW).
- Risk parity (RP). The portfolio weights are chosen proportional to the inverse variance or volatility: the higher the risk, the lower the weight of the asset class. This approach mimics the empirical leverage effect in the markets - if asset prices fall, volatility rises, and vice versa. The strategy is sensitive to the assets' volatilities but ignores the correlation structure.
- Equal risk contribution (ERC). The weights of the asset positions are chosen such that each contributes equally to the total portfolio variance.
- Minimum variance portfolio (MVP), i.e. the MV optimization only considers the variance and not the returns.
- The Kelly rule. This rule maximizes the expected log return, which leads to a maximization of the growth rate of wealth in the long run; see Section 3.4.5.2 for a short discussion of growth-optimal portfolio strategies.
The mean-variance portfolio is the strategy with the worst performance: choosing market weights, diversity weights, or equal weights leads to higher returns and lower risk. These results are a disaster for the mean-variance approach. A reason for the outperformance of the global minimum variance portfolio over standard mean-variance weights and the market portfolio is the tendency of low-volatility assets to have higher returns than high-volatility assets.

Strategy            Return   Volatility   Sharpe ratio   USD 100 after 33 years
Mean-variance       6.06     11.59        0.07           697
Market weights      10.25    12.08        0.41           2,503
Diversity weights   10.14    10.48        0.46           2,422
EW                  10       8.66         0.54           2,323
RP                  8.76     5.86         0.59           1,598
MVP                 7.96     5.12         0.52           1,252
ERC                 7.68     7.45         0.32           1,149
Kelly rule          7.97     4.98         0.54           1,256

Table 2.14: Risk and return figures for the different investment strategies (Ang [2012] and own calculations).
2.8.4.9 Estimation of the Covariance Matrix; Introduction

To estimate the covariance matrix, the first approach is to assume that the asset returns are normally distributed with an unknown constant mean and covariance matrix C or correlation matrix ρ. Using the maximum likelihood function, the estimator ρ̂ of ρ is given by the empirical correlation - that is, the correlation matrix that follows from the asset return data:

ρ̂ = (1/T) Σ_{t=1}^{T} R̃_t R̃_t'   (2.92)

with R̃_t the demeaned, standardized realized return vector at time t. This estimation method is widely used in practice.
When is the estimate ρ̂ close to the true matrix ρ? Intuitively, if the number of assets N is small and the number of observations T is large, ρ̂ should be close to the true value ρ. If N increases but the number of observations T does not, then the estimation error - the deviation of the estimate from the true value - and the out-of-sample performance both worsen, see below. Ledoit and Wolf (2003) estimate that for N assets around T ≥ 10N observations are needed to control the estimation error.
Formula (2.92) assumes IID Gaussian returns, which is not verified in financial time series. The procedure is therefore generalized to account for the time variability of the asset return variances. The research approach that considers this generalization of method (2.92) is that of the GARCH, or generalized auto-regressive conditional heteroscedasticity, models. These models assume that the return of the asset at time t is equal to a deterministic drift plus a stochastic noise part ε_t, where ε_t = σ_t z_t with z_t a standard normal random variable and σ_t the time-varying volatility. It is then assumed that this volatility depends on past variances. Therefore, the conditional variance of the noise term at time t depends on past values of the noise term. GARCH models allow for persistent volatilities: a strong move of the return at time t triggers an increase in the noise at time t + 1, which in turn leads to a higher probability that the return at time t + 1 will also face a strong impact. The estimation of the covariance is more complicated for GARCH models than for the standard normal approach. In extensions of this model type, using for example multivariate Student distributions, the estimated correlation matrix satisfies a more complicated functional form (if there is any analytic closed-form solution at all) than (2.92).
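A minimal GARCH(1,1) sketch may clarify the recursion; the parameter values are illustrative assumptions, not estimates:

import numpy as np

# GARCH(1,1): sigma_{t+1}^2 = omega + alpha * eps_t^2 + beta * sigma_t^2,
# with eps_t = sigma_t * z_t and z_t standard normal.
rng = np.random.default_rng(3)
T = 1000
omega, alpha, beta = 1e-6, 0.08, 0.90      # persistence: alpha + beta < 1
sigma2, eps = np.empty(T), np.empty(T)
sigma2[0] = omega / (1 - alpha - beta)     # unconditional variance
for t in range(T):
    eps[t] = np.sqrt(sigma2[t]) * rng.standard_normal()
    if t + 1 < T:                          # variance reacts to past shocks
        sigma2[t + 1] = omega + alpha * eps[t] ** 2 + beta * sigma2[t]
print(np.sqrt(sigma2[:5]))                 # the first few daily volatilities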
Example Mean-variance model and data requirements
Consider the Markowitz model with N risky assets. To implement the model, we need N estimated expected returns and N(N − 1)/2 estimated covariances. The total number of input parameters is therefore 2N + N(N − 1)/2, with 2N representing the returns and the variances. For 100 assets, 5,150 parameters are needed.
Besides the complexity due to this large number of required parameters, the accuracy of the estimates is a second issue: without extremely long data series, the standard deviation of the estimated returns turns out to be larger than the estimated return itself - that is to say, a useless estimate. To understand this, consider the estimate of the return using monthly data. Writing R(j) for the rate of return in past month j, the average of n such observations is an estimate of the return. Assuming IID returns, this estimate has itself a mean R̄ - the true value - and a standard deviation σ/√n - again the true value.
If the stock has an annual return of 12 percent, the true monthly value is R̄ = 12%/12 = 1%. For a monthly standard deviation of σ = 0.05, the standard deviation of the estimate with one year of data is 0.05/√12 = 1.44%. But this is larger than the mean itself. Using n = 60 (i.e., five years of data), the standard deviation of the estimate is 0.00645, which is only a little smaller than the mean. If we would like a standard deviation of, say, 1/10 of the mean, the equation 0.05/√n = 0.001 implies n = 2,500, which corresponds to a time series of more than 208 years (≈ 2,500/12). It is therefore important to derive simpler models that are not so data intensive.
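The σ/√n arithmetic of this example in code (values copied from the text):

import numpy as np

# Standard error of the estimated mean return for n monthly observations.
sigma, mean_monthly = 0.05, 0.01
for n in (12, 60, 2500):
    print(n, round(sigma / np.sqrt(n), 5))   # 0.01443, 0.00645, 0.001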
The idea of using a factor model is to reduce the correlation complexity but not to change the volatilities. Reducing complexity means reducing the number of free parameters in the covariance matrix, which in turn reduces the statistical error, see Figure 2.28. The risk of this approach is that the low-dimensional model does not capture all asset co-variations. This leads to a potentially systematically biased estimate of the return covariance matrix.
Since the hope is to remove redundancy or duplication from a set of correlated variables, factors should be relatively independent of one another. If we have N assets, the N(N − 1)/2 covariance parameters are reduced to m + N(m + 2) parameters if there are m factors.


Figure 2.28: Illustration of complexity reduction, from data to covariance to factor analysis.

Example
Consider the following correlation matrix (lower triangle shown):

ρ = ( 1                      )
    ( 0.09  1                )
    ( 0.02  0.12  1          )
    ( 0.01  0.18  0.94  1    ).

The matrix indicates that the first and second assets, as well as the third and fourth assets, are driven by the same risk factor. The other correlations are also of the same order of magnitude. Instead of considering (4 × 3)/2 = 6 correlations, one would start with a two-factor model, which is less complex, see Figure 2.28.
If there are N assets and m risk factors F, the factor model is fixed by the (N × m) loadings matrix A. A general linear return model is of the form

R_{i,t} = A_i' F_t + ε_{i,t}   (2.93)

where one requires that the noise term is not auto-correlated, has zero mean, and is also uncorrelated with the factors: E[ε_t ε_s'] = 0 for t ≠ s, E[ε] = 0, E[F ε'] = 0.
The hope is that:
- the common factors - that is to say, the first term in (2.93) - explain most of the randomness in the market;
- the residual part, the second term in (2.93), has only a marginal effect;
- the dimension of the factor vector F is much smaller than the dimension of the return vector R.
The dynamics (2.93) leads, together with the above assumptions, to the equations

E(R) = A' E(F),   (2.94)

C = A I A' + D,   (2.95)

where D is the diagonal idiosyncratic covariance matrix with the variances of the idiosyncratic risks ε as entries and I the identity matrix. How does one find factors F satisfying the above assumptions? One method is Principal Component Analysis (PCA) - or some more refined approaches - which also indicates how many factors are needed, see below.
How is the factor loading matrix A found? Geometrically, the matrix A follows from an orthogonal projection of the returns on the set generated by the factors. This projection is the beta in the CAPM, the betas in general beta pricing models, or the factor risk premia in the APT model. Analytically, A is given by the eigenvectors of the PCA, see below.
Example Roncalli (2104)
Consider the S&P 500, SMI, Eurostoxx 50, and Nikkei 225 indices from April 1995 to
April 2015. Calculating the correlation matrix on a weekly basis using the closing prices
we get

1
0.8

=
0.82 0.88

1
0.67 0.56 0.58 1
The data indicate that the correlation between the European and American markets is
stronger than that between the Japanese market and the European or American one. We
therefore set up a two-linear-factor model where we allow for this observed difference in
the estimated correlation.
The matrix A follows from the likelihood estimation


0.015 0.21 0.29 0.35
A=
.91
0.93 0.96 0.76
Therefore, the portfolio is long only in one factor, which is the market factor by definition,
and long/short in the second factor. The portfolio is from a indices vista short in the
S&P 500 and long in the other three indices.

2.8. PORTFOLIO CONSTRUCTION


2.8.4.10

155

Estimation of the Expected Return

Estimating the expected return is more difficult than estimating the covariance matrix.
The fundamental pricing equation - see Chapter 3 - states that changes in asset prices
are driven by changing expectations of the cash flows, changing correlations between the
assets, changes in the discount factors, or a combination of all these factors. Given these
possible changes of the different factors that affect the value and hence the return of an
asset, the first question is whether one can forecast asset returns at all. This question,
which is known as the market efficiency question, is difficult to answer and the answer
has changed in the literature during recent decades.
Assuming that one can forecast asset returns in a statistical sense, one has to decide
the time horizon of the forecast period. Intuitively, the shorter the time horizon the more
uncertain is a forecast and the more the assets return can deviate from any long-term
equilibrium value. It is in this period investors search most often for anomalies in the
markets, which are assumed to persist only for a short time.
We refer the reader to Ilmanen (2012) for a discussion of estimating the expected
return.
2.8.4.11

Stability of the Optimal Portfolio

Given the estimated returns and covariances, one faces the non-stability problem in optimal portfolio construction: Small changes in the estimated input parameter can lead
to large changes in the optimal portfolios that are difficult to explain and accept. To
motivate the instability, consider the optimal strategy = 1 C 1 in the Markowitz
model. The covariance matrix enters the optimal portfolio by its inverse. If correlations
are small numbers, which means that the risk sources are only weakly dependent and
hence are desired from a diversification point of view, small variations in these numbers
lead to large changes in the optimal portfolio.

Example
Ang (2014) estimates the original mean-variance frontier using data from January 1970
to December 2011. The mean of US equity returns in this sample is 10.3 percent. Ang
then changes the mean to 13.0 percent. Such a change is within two standard error
bounds. The minimum variance portfolios for a desired portfolio return of 12 percent are
then:

Changing the US mean to 13.0 percent has caused the US position to change from -9
percent to 41 percent, and the UK position to move from 48 percent to approximately 5
percent.

156

CHAPTER 2. FUNDAMENTALS

USA
JPN
GBR
DEU
FRA

US mean = 10.3%
-0.0946
0.2122
0.4768
0.1800
0.2257

US mean = 13.0%
0.4101
0.3941
0.0505
0.1956
-0.0502

Table 2.15: MV portfolios for two different expected equity returns (Ang [2014]).

Several approaches have been developed in recent decades to stabilize these results.
Jorion - for example - applied, in 1992, the resampling technique. This technique simulates the optimal portfolio using the estimated mean and covariance. This generates in
the portfolio risk and return space a cloud of optimal portfolios, which scatter around the
true efficient frontier. The simulated portfolios are averaged, which defines the optimal
portfolio after resampling. This method has no theoretical foundation but is considered
an empirical method for correcting portfolio instability.
Another approach is to de-noise the covariance matrix. Consider a covariance matrix C of any dimension N N . The matrix does not tell us how much the unobservable
risk drivers of the N assets add to the total portfolio variance. Is there a method that
allows us to derive from any covariance matrix how important the N risk factors are in
explaining portfolio risk? Yes, the principal component analysis (PCA).
To understand the intuition, consider Figure 2.29. In the left panel the closing Dow
and S&P 500 index values are shown. The plot shows that the two series are dependent.
Pick any data point and move to the data point of the next day - the move will be diagonal and not vertical or horizontal. That is, each move in the closing prices has a Dow
and an S&P component. Therefore, the volatility of the joint times series is generated
by the volatilities of both single time series.
PCA, then, means rotating the coordinate system as from the x to the y system in
the right panel. In this new coordinate system, the data points have almost no variance
in the y2 direction but a strong one in the y1 direction. Therefore, the y1 -direction factor
explains a most of the portfolio variance. Working only with the y1 factor, then, means
capturing most of the portfolio risk while setting aside the noisy factor. The matrix that
represents this unbundling is, contrary to the covariance matrix, a diagonal matrix where
the entries are called eigenvalues.
The reader who is not interested in a mathematical description can skip the next
parts and continue reading following the next example below. This transformation from
C to a diagonal matrix is always possible and constructively described by the spectral
theorem of linear algebra. It states that there exists a matrix W such that C can be

2.8. PORTFOLIO CONSTRUCTION

157

Figure 2.29: Closing values for the S&P 500 and Dow Jones Index in 2006. The red
coordinate systems denote the rotation applied in PCA.

diagonalized as follow:
W 0 CW = ,

(2.96)

This is referred to as principal component analysis (PCA). The diagonal elements of


are the real-valued positive eigenvalues 1 , ..., N . This follows from the property that
covariance matrices are positive definite. The eigenvalues are calculated by solving the
polynomial equation det(C I) = 0 with I the identity matrix. Given any eigenvalue
k of the covariance matrix, the solution of the linear equation Cvk = k vk implies the
eigenvector vk which form an orthonormal basis. The matrix W , which diagonalizes the
covariance matrix C, is the juxtaposition of the eigenvectors - that is, W = (v1 , ..., vN ).
We state some further properties:

The eigenvectors can be interpreted as the factor load matrix A in (2.93).

The eigenvalues explain the variance of the factors. Using the diagonalization of

158

CHAPTER 2. FUNDAMENTALS
the covariance matrix we can write:
p2

h, Ci

h, W 0 W i

hW , W i

=: h, W i
X
=
i i2 .

(2.97)

Factors with very low eigenvalues add only little to the portfolio risk and are therefore avoided - this is the de-noising of the covariance matrix.
But those eigenvalues that are important from a risk perspective are the least
important ones from a portfolio optimization perspective. Consider the optimal
Markowitz solution (2.56) = 1 C 1 . Here, not the covariance matrix but its
inverse, the information matrix, matters. But the eigenvalues of the information
matrix are the reciprocal values 1/k of the eigenvalues k .
Therefore, the most important eigenvalues or factors in portfolio optimization are those
that, from a risk perspective, are considered noise. This is one reason why portfolio
managers often do not use portfolio optimization methods.

Example - Principal component analysis (PCA)


The two indices in Figure 25 show a strong positive dependence. In other words, there
must be a common factor. PCA shows that the y1 asset has a strong volatility while
the y2 asset is almost free of any volatility. This is reflected in the eigenvalues too. One
eigenvalue in the rotated system (2.96) will be large since only one factor is responsible for
the variance, and the other one will be small. We show how to calculate the eigenvalues
and eigenvectors. Consider the matrix


2.25 0.4330
M=
.
0.4330 2.75
This matrix is symmetric. Solving the eigenvalue equation
det(M I) = (2.25 )(2.75 ) 0.43302 = 0
we get the two eigenvalues, = 3 and = 2. Therefore, matrix M is also positive definite
and satisfies all the mathematical properties of a covariance matrix. The information
matrix M 1 has the inverse eigenvalues 1/3 and 12 on its diagonal, which shows that the
ranking order of the eigenvalues of M is reversed if one considers the information matrix.
Solving the two linear systems for the eigenvectors implies
v1 = (1.73205, 1)0 , v2 = (1, 1.73205)0 .
Forming the scalar product, it follows that the two vectors are orthogonal.

2.8. PORTFOLIO CONSTRUCTION

159

Example - Eigenportfolios
Consider a portfolio where the weights of the different assets are given in terms
of the components of the eigenvectors vk of C. This defines the eigen-portfolio Ve .
The realized portfolio risk of this portfolio is then given by the eigenvalues, see also
(2.97). It follows from the orthonormality of the eigenvectors that the returns in the
eigen-portfolio are uncorrelated. Therefore, the eigen-portfolio weights list describes
uncorrelated investments with decreasing portfolio risk.

Example - PCA and optimal portfolios


Measuring the risk of a portfolio by using the empirical correlation matrix, where the
weights are independently chosen of the past returns
X
2
p,
i ij j
=
i,j

defines an unbiased estimate of the portfolio risk with small mean-square proportional
to 1/T . This example is based on Bouchaud and Potters (2009). The situation is
different for optimized portfolio where we consider the Markowitz model without the
full investment constraint; the optimal policy reads if we assume that is known:
M V = r

1
.
h, 1 i

(2.98)

The true minimal risk is then


2
2
M
V = hM V , M V i = r

1
.
h, 1 i

(2.99)

If one uses the in-sample estimated correlation matrix or the out-of-sample matrix the matrix constructed by the observations observed at the end of the investment period
- are used, the portfolio risk reads:

2
2
M
V,in = r

1
1
1 i
2
2 h,
,

=
r
.
h, 1 i M V,out
(h, 1 i)2

(2.100)

If the posterior estimate is equal to the true one, then the risk of the out-of-sample
estimate is equal to the optimal one. Assuming that the in-sample estimate is not biased
(the average value is equal to the true value), then convexity properties for positive
definite matrices imply the first inequality
2
2
2
M
V,in M V M V,out .

(2.101)

160

CHAPTER 2. FUNDAMENTALS

The out-of-sample risk is larger than the optimal one and the in-sample risk underestimates true risk. How far away are the in- and out-sample risk from true risk? Pafka and
Kondor (2004) show that for IID returns and large portfolios
2
2
M
V,in = M V

p
N
2
1 q = M
V,out (1 q) , q =
T

(2.102)

holds. All risk measures coincide if q = 0 - the number of observations T is much larger
than the number of assets N . For q = 1, the in-sample risk becomes zero which is the
case of the severest risk underestimation.

Example (Roncalli [2014])


Consider three assets with the following return and covariance properties (the benchmark
case):
Returns are 8%, 8%, and 5%.
Volatilities are 20%, 21%, and 10%.
Correlation is uniform at 80%.
Table 2.18 shows that the optimal portfolio allocation of the benchmark is not stable
if one increases the correlation to 90, or
reduces the volatility of asset 2 to 18 percent, or
increases the expected return of asset 1 from 8 percent to 9 percent.

All figures in %

Asset 1
Asset 2
Asset 3

Portfolios
Benchmark
38
20
42

New up to 90%
45
9
46

2 up to 18%
14
66
30

1 up to 9%
60
-5
45

Table 2.16: Stability issues.

Table 2.17 and Table 2.18 show the result of PCA for the covariance and information
matrix. The first factor in the covariance matrix is a market factor since all components
in the eigenvector are positive. This factor has the largest eigenvalue and contributed

2.8. PORTFOLIO CONSTRUCTION

161

88 percent to the portfolios volatility. The second factor adds another 9 percent and
factor 3 only contributes 3 percent. In the information matrix the role is reversed, which
illustrates the trade-off between risk management and optimization.

All figures in %
Asset 1
Asset 2
Asset 3
Eigenvalue
Cumulated p -contribution

Factor 1
65
70
30
8
88

PCA of C
Factor 2 Factor 3
-72
-22
69
-20
-2
95
0.8
0.3
97
100

Table 2.17: PCA analysis of covariance matrix.

All figures in %
Asset 1
Asset 2
Asset 3
Eigenvalue

PCA of C 1
Factor 3 Factor 2 Factor 1
-22
-72
65
-20
69
70
95
-2
30
380
119
12

Table 2.18: PCA analysis of information matrix.


Other methods used to stabilize the optimization problem include so-called shrinkage
and the penalized regression technique. We refer to Ledoit and Wolf (2003) for an explanation of the former one. We consider penalization techniques in the big data Section
4.10.
We note at this point an important result of Jagannathan and Ma (2003): The solution of a linear weight-constrained optimization problem is the same as the solution of
an unconstrained problem if the covariance matrix is shrunk or if one introduces relative
views such as in the Black-Litterman model.
The above de-noising techniques are not sufficient for obtaining the stability of the
solution due to the mentioned trade-off between risk management and the portfolio optimization view. Some practitioners therefore prefer to introduce explicit restrictions into
the optimization problem such as bounds on short selling, bounds on the asset allocation
components, bounds on the tracking error, etc. This approach has drawbacks:
One loses the analytical tractability of the optimization problem - that is, one has
to solve the problem numerically.

162

CHAPTER 2. FUNDAMENTALS
Each restriction has an economic price. If the restriction binds, the optimal value
of the unconstrained problem is reduced.
Compare two constraint models. Is one allocation better than the other because
of a better model or because of the chosen constraints? Constraints are ad hoc,
discretionary decisions that impact a models performance in a complicated way.

Is there a less ad hoc method for stabilizing the portfolios? The risk budgeting
approach presented in the next section is such a method.

2.8.5

Optimization, SAA, TAA and Benchmarking

This section follows Leippold (2011), Lee (2000) and Roncalli (2014).
2.8.5.1

SAA and TAA

Consider the optimization problem (2.55):





max h, i h, Ci

2
where we assume the full investment condition. Then, the solution can be written as a
sum of the GMV portfolio and a second portfolio X which is proportional to :
= GM V + X .
To introduce the SAA, we use the unconditional long-term (equilibrium) mean of the
returns. Adding and subtracting the long-term mean
in the second component, the
solution can be written after some algebra in the form:
= GM V + S + T .

(2.103)

The second and the third component are the SAA and the TAA component, respectively.
The sum of the three components is an efficient portfolio.
Each SAA component j,S is proportional to
j
k for k 6= j. If the long-term forecasts of all assets are the same, the SAA component is zero. If the long-term forecasts
differ, the holdings are shifted to the asset with the higher equilibrium return. The size
of pairwise bets depend on the relative risk aversion and the covariance C which enter
S . The sum of the GMV and the strategic portfolio is called the benchmark portfolio in the asset management industry and the strategic mix portfolio in investment theory.
For each TAA component, j,T is proportional to
j
j (k
k )
for k 6= j. Hence, there are again bets between the assets case where there are no
bets against the same asset and the bets are of an excess return type with the SAA as
benchmark. For N assets, there are N (N 1)/2 bets. As in the SAA case, the bets are
weighted by the covariance matrix and the relative risk aversion.

2.8. PORTFOLIO CONSTRUCTION


2.8.5.2

163

Active Investment and Benchmarking

The investor so far considered an absolute return approach where he or she cares about
the absolute wealth level. Consider now an investor who cares about investment relative
to a benchmark b.
Then, the tracking error difference e between an active managed portfolio and the
benchmark b is the return difference
e = R() R(b) = ( b)0 R ,

(2.104)

:= b

(2.105)

where the difference


is the vector of active bets of the investor. Taking expected value the expected tracking
error difference reads
(, b) = ( b)0
(2.106)
follows. The tracking error is by definition the volatility of the tracking error difference:
p
TE = (, b) = (e) = ( b)0 C( b) .
(2.107)
The investor chooses the bets such that the quadratic utility is maximized:



max h, i h, Ci

(2.108)

Assuming full investment, the solution of this active risk and return program can be
written as a sum of two parts. One part is given by the benchmark and the second one
by the bets. But in general, the bet vector is different from the tactical asset allocation
vector in last section. The next proposition summarizes.
Proposition 2.8.7. Consider the active risk and return optimization in (2.108) with the
full investment constraint. The efficient frontier are straight lines in the ((, b), (, b))space. Inserting further linear constraints, the efficient frontier are non-degenerate hyperbolas.
If we invest the fraction of wealth in an active strategy a and 1 in the
benchmark, that is we consider the strategy
= a + (1 )b ,
then
(, b) = IR(a , b)(, b) ,

(2.109)

where the Information Ratio (IR) is defined as the ratio between (, b) and (, b):
IR =

Excess Return
Excess Return Active Strategy over Benchmark
(, b)
=
=
.
Risk
Tracking Error
(, b)
(2.110)

164

CHAPTER 2. FUNDAMENTALS

This implies that the efficient frontier is a straight line and then the Sharpe ratio is the
same for all portfolios on the efficient frontier. Therefore, the Sharpe ratio is not useful
to compare the performance of different efficient portfolios and one therefore prefers to
work with the information ratio.

2.8.6

Risk-Based Portfolio Construction

Risk-based portfolio construction is a method that is less discretionary than the method
that imposes constraints in an optimal portfolio model. Risk parity has two basic properties:
1. It is not based on the optimization of an investors utility function, unlike the
Markowitz model.
2. It uses only explicitly the risk dimension of investment.
The first property derives from some of the problems optimal portfolios can have,
problems that we discussed in the last section. The second reflects the difficulty of forecasting expected returns. One may wonder whether risk-based portfolios will then not
always provide the investor with very conservative portfolios, which are acceptable in
their risk but fail to generate any returns? This is not the case since a priori defining a
risk-based program does not mean fixing conservative returns for the portfolios.
Besides risk budgeting, weight budgeting and performance budgeting are well-known
methods in portfolio construction. Weight budgeting - as in the 60/40 portfolios - defines
the weights of the portfolio. Performance budgeting calibrates the weights of a portfolio
to achieve a given performance contribution. The three methods are not independent
of each other. Under some conditions, risk and performance budgeting are equivalent.
Constructing risk-based portfolios is a three step procedure:
Define how risk is measured.
Consider how the risk of a portfolio is split into its components (risk allocation).
Define the risk-budgeting problem.
2.8.6.1

Risk Measurements

The foundations for the discussion of risk measurements - that is, which properties should
a measurement of risk possess - in recent years has been based on the work of Artzner et
al. (1999). They define a set of properties that each risk measure should satisfy, prove
the existence of such measures and show that some widely used measures violate some
of these properties. While this detailed and ongoing debate is beyond the scope of this
chapter, we will nevertheless summarize some of the main properties and findings.

2.8. PORTFOLIO CONSTRUCTION

165

The properties or axioms that a coherent risk measurement should satisfy (Artzner
et al. [1999]) are:
1. The risk of two portfolios is smaller than the sum of the risks.
2. The risk of a leveraged portfolio is equal to the leveraged risk of the original portfolio.
3. Adding a cash amount to a portfolio reduces the risk of the portfolio by the cash
amount.
One often adds the following fourth requirement:
4. If a portfolios return dominates another portfolios return in all scenarios, the risk
of the former portfolio dominates the risk of the latter.
Other authors replace some of these axioms by the convexity or diversification property: diversification should not increase risk.
Example - Risk measurements
Value at risk (VaR) is only a coherent risk measure if the portfolio returns distribution
is normally distributed (more generally, elliptically distributed). In general, VaR fails
to satisfy axiom 1. But it is difficult to find real situations where the use of VaR leads
to misleading decisions regarding risk due to its failure to be, generally, a coherent risk
measurement.
Expected shortfall, i.e. what is the expected loss given the loss exceeds a VaR-value,
is a coherent and convex risk measurement. Volatility risk measurements do not satisfy
property 4 (above). But this property is often seen as less meaningful for portfolio
management than for risk management. Therefore, volatility is often used as if it were a
coherent risk measurement. VaR and expected shortfall, contrary to volatility, are both
shortfall measurements - that is, they measure the loss region of a distribution.
To gain intuition for Value at Risk (VaR), which is a dollar amount, we consider:
A stock with an initial price S0 of USD 100.
The price S1 in one year (a random variable).
Investor faces a loss if S1 < 100er with r the risk-free rate.
What is the probability that the loss exceeds USD 10 - that is to say, P (100er S1 <
10) =?
Therefore, the loss amount is given; the probability of the loss is unknown. VaR answers
a related question: the investors search for a USD amount - the VaR - such that the
probability of a loss is not larger than the predefined quantile level. That is to say,
P (100er S1 < ?) 1%,

166

CHAPTER 2. FUNDAMENTALS

where ? =VaR amount. Hence, the probability of the loss is given; the loss amount is
unknown. The given probability reflects the credit worthiness in risk management of a
bank and the risk tolerance in investment management of an investor.
If we assume that the risk distribution is normal, then essentially all risk measurements such as VaR and expected shortfall are equivalent to volatility risk measurements.
The VaR of a portfolio at a confidence level for a given time horizon reads
VaR(, ) = + k()

(2.111)

with k() the confidence level function for a normal distribution. A similar formula to
(2.111) holds for the expected shortfall.

2.8.6.2

Risk Allocation

The main tool for risk allocation is the Euler allocation principle, see equations (2.47)
and (2.48):
X R() X
R() =
j
=
RCj ()
(2.112)

This risk decomposition holds for the volatility, VaR, and expected shortfall risk measurements.
Example - Euler allocation principle
Consider four assets in a portfolio with equal weights of 25 percent. The volatilities
are 30%, 20%, 40%, and 25%. The correlation structure

1
0.8 1

.
=
0.7 0.9 1

0.6 0.5 0.6 1


The covariance matrix C is then calculated as (using the formula Ckm = km k m )

9%
4%

4%
.
C=
8.4% 7.2% 16%

4.5% 2.5% 6% 6.25%


The portfolio variance
p2 =

4
X
i,j=1

i j Cij = 6.37%.

2.8. PORTFOLIO CONSTRUCTION

167

follows. Taking the square root, the portfolio volatility of 25.25% follows. Using (2.48),
the marginal risk contribution vector

26.4%
18.3%
C

0
=
C 37.2%
19%
follows. Multiplying each component of this vector with the portfolio weight gives the
risk contribution vector RC = (6.6%, 4.5%, 9.3%, 4.7%). Adding the components of this
vector gives 25.25% which is equal to the portfolio volatility. This verifies the Euler
formula.

2.8.6.3

Risk Budgeting

We restrict ourselves to the case of two risk budgets; the generalization is obvious. The
main idea is that the portfolio is chosen such that the individual risk contributions, using
a specific risk metrics, equal a predefined risk budget.
Let B1 and B2 be two risk budgets in USD. For a strategy = (1 , 2 ), the risk budgeting problem is defined by the two constraints, which equate the two risk contributions
RC1 and RC2 to the risk budgets - that is to say, the strategy is chosen such that the
following equations hold:
RC1 () = B1 , RC2 () = B2 .

(2.113)

Summing the left-hand sides of (2.113) is, by the Euler principle, equal to total
portfolio risk. The sum on the right-hand side is the total risk budget. Problem (2.113)
is often recast in a relative form. If bk = cBk is the percentage of the sum of total risk
budgets, (2.113) reads
RC1 () = b1 R(), RC2 () = b2 R() .

(2.114)

The goal is to find the strategies which solve (2.113) or (2.114) . This is in general a
complex numerical mathematical problem. But introducing the beta k of asset k,
k =

(C)k
cov(Rk , R())
= 2
2
()
()

implies that the weights are given by


bk 1
k = P k 1 .
j bj j

(2.115)

The weight allocated to component k is thus inversely proportional to the beta. This
equation is only implicit since the beta depends on the portfolio !

168

CHAPTER 2. FUNDAMENTALS

A special case, which often appears in practice and which also has some interesting
theoretical properties, is the equal risk contributions (ERC) model, in which the weights
for the risk budget bk are set equal to 1/N . Maillard et al. (2008) show that the
volatility of the ERC model is located between the volatility of the minimum variance
(MVP) portfolio and the volatility of an equally capital weighted (EW) portfolio - that
is,
M V P ERC EW .
(2.116)
The three portfolios are defined for all k, j by:
k = j (EW ) ,

()
()
()
()
=
= k
(M V P ), j
(ERC) .
j
k
j
k

(2.117)

Definition 2.8.8. The equal risk contribution approach (ERC) is also called the risk
parity (RP) approach.
Popular risk-weighting solutions include (we follow Teiletche [2015]):
1. The minimum variance portfolio (MVP).8 The risk budgeting policy for this strategy is equal marginal risk contributions.
2. The maximum diversification portfolio (MD). The objective is to maximize the
ratio between undiversified volatility and diversified volatility. The risk budgeting
policy for this strategy is equal volatility-scaled marginal risk contributions. 9
3. The equal risk contribution (ERC). The risk budgeting policy for this strategy is
equal total risk contributions.
4. The equal weight contribution (EW). The risk budgeting policy for this strategy is
equal capital weights.
Solutions 1-4 can be obtained under specific assumptions as mean-variance optimal
portfolio solutions. These assumptions are:
1. Identical excess returns.
2. Identical Sharpe ratios.
3. Identical Sharpe ratios and constant correlation.
4. Identical excess returns and volatilities and constant correlation.
8
This means to minimize p2 = 0 C under the full investment constraint. This implies M V P =
2
1
M
e.
VC
0
9
This means to maximizes Dp = 0 where is a vector of asset volatilities. This equation has
C

the form of a Sharpe ratio, where the asset volatility vector replaces the expected excess returns vector.
The optimal maximum diversification weight vector is then the same as maximum Sharpe ratio portfolio
2
with the volatility vector replacing the expected excess return vector: M D = M D C 1 with the
weighted average asset risk.

2.8. PORTFOLIO CONSTRUCTION

169

We mentioned that it is difficult to find a closed-form analytical solution for risk


budgeting problems. But there is a simplified allocation mechanism - inspired by the
allocation (2.117) - which reveals the above four solutions. The heuristic approach is to
choose the allocation
Riskm
k
k = L P
(2.118)
m
Risk
k
k
with Risk any risk measure, L the portfolio leverage which is needed if one defines exante a risk level for the portfolio (risk-targeting approach) and m a positive number.
If m = 0, the portfolio is equally weighted. For increasing m, the portfolio allocation
becomes more and more concentrated on the assets with the lowest individual risk. For
example, the minimum variance portfolio follows if all correlations are set equal to zero
and m = 2, ERC follows by assuming that all correlations are constant and m = 1.
Teiletche (2014) illustrates some properties for above four portfolios using Kenneth
Frenchs US industry indices, 1973-2014; see Figure 2.30.

Figure 2.30: Risk-weighting solutions for EW, MV, MD, and RP (ERC) portfolios using
sector indices from Kenneth French. The variance-covariance matrix is based on five
years of rolling data (Teiletche [2014]).
Figure 2.30 indicates that MV has a preference for lower volatility sectors (e.g., utilities or consumer non-durables), MD prefers low correlation (e.g., utilities or energy),
EW is not sensitive at all to risk measures, and RP (ERC) is mixed. The RP and EW
show similar regular asset allocation patterns and MV and MD asset allocation patterns
are much less regular. The latter react much more to changing economic circumstances

170

CHAPTER 2. FUNDAMENTALS

and are therefore more defensive.

Example - Different risk-based portfolios


The example is from Deutsche Bank (2012). We are going to explore the efficacy of
five different risk-based portfolio construction techniques. These are: inverse volatility
(IV), equal risk contribution (ERC), alpha-risk parity (ARP), maximum diversification
(MD), and diversified risk parity (DRP).
Inverse volatility (IV) allocates the same volatility budget to each constituent
element of the portfolio. Each style/asset is weighted in inverse proportion to its
volatility.
ERC equalizes the marginal contribution to risk for all assets.
ARP not only considers risk but also return. ARP allocates a risk budget to each
portfolio component in proportion to its alpha forecast.
MD tries to maximize the diversification potential in a portfolio. MD allocates
weights to assets that have low or negative correlation between them. MD should
perform particularly well in the case of a portfolio with uncorrelated underlying assets.
DRP tries to find the uncorrelated bets in a portfolio by applying principal component analysis (PCA). Diagonalize the covariance matrix leads to the strategy vector
W where W 0 CW = . Define a new portfolio i proportional to (W )2i i , properly
normalized. The number of uncorrelated bets is then a function of the Shannon entropy
defined on the strategy .
The back-test of the five methods starts in 1998 and ends in 2012 (see Figure 2.31).
There are two portfolio constructions. The first portfolio consists of four asset classes
(equities, fixed income, commodities, and FX). The second portfolio is constructed
by using the risk factors market beta and value, carry, and momentum cross asset classes.
ARP consistently outperforms ERC on a risk-adjusted basis for estimation windows
longer than one month. The best-performing strategy in risk-adjusted (information
ratio) terms is MD over an expanding window length. This highlights the importance of
taking correlations into account, especially for such a portfolio where almost half of the
pair-wise correlations between the underlying assets are negative (at least over the long
term). The correlation structure of the risk factors naturally suits the MD weighting
scheme. With an annual return of 4.2 percent and a volatility of 1.9 percent per annum,
MD has an information ratio of 2.25. It also follows that adding risk factors to the asset
allocation mix improves the risk-adjusted performance of the portfolio irrespective of the

2.8. PORTFOLIO CONSTRUCTION

171

allocation mechanism chosen, whilst at the same time registering strong improvements
on drawdown, VAR, and expected shortfall risk measures.
Applying the same risk-based allocation methodologies to asset classes, DRP - which
maximizes the number of independent bets - achieves the highest return, lowest volatility, and lowest maximum drawdown. Indeed DRP is targeted at extracting uncorrelated risk sources - the principal components - from multiple asset classes, rendering the
methodology suitable for a portfolio the underlying components of which share a big
portion of similar risk. In the case of style portfolios that are relatively uncorrelated,
DRP becomes inferior to the MD methodology.

Figure 2.31: Profile of risk-based portfolio allocations - factors vs asset classes. The
portfolios are inverse volatility (IV), equal risk contribution (ERC), alpha-risk parity
(ARP), maximum diversification (MD), and diversified risk parity (DRP). CAGR is the
compound annual growth rate, AnnVol the annualized volatility, IR the internal rate of
return, MaxDD the maximum drawdown, VaR95 the value-at-risk on the 95% confidence
level, and ES95 the expected shortfall on the 95% confidence VaR level (Deutsche Bank
[2012]).

Example - ERC vs. 1/N vs. MVP

172

CHAPTER 2. FUNDAMENTALS

Maillard et al. (2009) compare the ERC portfolio with 1/N and MVP portfolio for
a representative set of the major asset classes with data from Jan 1995 to Dec 2008.
The asset class representatives are: S&P 500, Russell 2000, DJ Euro Stoxx 50, FTSE
100, Topix, MSCI Latin America, MSCI Emerging Markets Europe, MSCI AC Asia ex
Japan, JP Morgan Global Govt Bond Euro, JP Morgan Govt Bond US, ML US High
Yield Master II, JP Morgan EMBI Diversified), S&P GSCI.
The ERC portfolio has the best Sharpe ratio and average returns. The Sharpe ratio of
the 1/N portfolio (0.27) is largely dominated by MVP (0.49) and ERC (0.67). MVP and
ERC differ in their balance between risk and concentration. The ERC portfolios are much
less concentrated than their MVP counterparts and also their turnover is much lower.
Lack of diversification in the MVP portfolios can be seen by comparing the maximum
drawdown values: The value for MVP is 45% compared to 22% of the ERC portfolio.
When we restrict the risk measurement to volatilities, the heuristic approach (2.118)
takes the following generic form (Jurczenko and Teiletche [2015]):
= k 1 ,

(2.119)

where k is a positive constant, is a vector of volatilities of N assets, and is the vector


of risk-based portfolio weights. The equation is meant to hold for each component of the
vectors. Therefore, higher-volatility assets are then given lower weights, and vice versa.
Equation (2.119) corresponds to the risk parity and maximum diversification portfolio
solutions when the correlation among assets is constant, the minimum variance portfolio when correlation is zero, and the 1/N portfolio when all volatilities are equal. Many
practitioners use (2.119) to scale their individual exposures and the MSCI Risk Weighted
Indices attribute the weights proportionally to the inverse of the stock variances.
The constant k can be calibrated in different ways. If we use a capital-budgeting
constraint - that is to say, the sum of the components i is equal to 1, implies
1
1 .
k k

k=P

So, (2.119) becomes the heuristic model (2.118) with m = 1 and zero leverage. If we use
a volatility-target constraint T for the risk-based portfolio, we get
k=

T
T
=
N Concentration
N C()

(2.120)

with the average pair-wise correlation coefficient of the assets and C() the concentration measure10
p
C() = N 1 (1 + (N 1)) .
(2.121)
10

To prove this formula, we write for the diagonal matrix with the vector of volatilities on its
diagonal, the correlation matrix of returns and I the identity matrix. The covariance matrix can be

2.9. FACTOR INVESTING

173

The concentration measure varies from 0, when the average pair-wise correlation reaches
its lowest value, to +1, when the average correlation is +1. Hence, k increases when the
diversification benefits are important - that is, when the correlation measure decreases.
In this case, each constituents weight needs to be increased to reach the desired volatility
target: the risk-based portfolio even becomes leveraged.

2.9

Factor Investing

We consider in more detail the CAPM, the Fama-French (FF) three-factor and five-factor
models and best-in-class factor investment offering.

2.9.1

The CAPM

The linear relation in the CAPM between the excess return of an asset and the market
excess return follows from the following assumptions:
Investors act competitively, optimal, have a one-period investment horizon and
there are many investors with small individual endowments. Hence, they cannot
influence prices and are so-called price takers.
All investors have mean-variance preferences.
All investors have the same beliefs about the future security values.
Investors can borrow and lend at the risk-free rate, short any asset, and hold any
fraction of an asset.
There is a risk-free asset in zero net supply. Since markets clear in equilibrium,
total supply has to equal total demand. Given the net supply of the risk-free asset,
we combine the investors portfolios to get a market portfolio. This will imply that
the optimal risky portfolio for each investor is the same.
All information is accessible to all investors at the same time to all investors - there
is no insider information.
written in the form C = which implies
h 1 , 1 i = he, ei .
The volatility of the risk-based portfolio is then given by (using (2.119)):
s
XX
p
p
RB = C = k he, ei = k 1 +
ij .
i

Introducing the average pairwise correlation coefficient


=

XX
1
ij
N (N 1) i
j6=i

implies (2.120).

j6=i

174

CHAPTER 2. FUNDAMENTALS
Markets are perfect: There are no frictions such as transaction costs or lending or
borrowing costs, no taxes, etc.

Proposition 2.9.1. Under the above assumption:


Each investor is investing in the risk-less asset and the tangency portfolio.
The tangency portfolio is the market portfolio.
All investors hold the same portfolio of risky securities.
For each title i, we have a linear relationship between risk and return (the security
market line [SML]),
E(Ri ) Rf = i,M (E(RM ) Rf )
with the beta i,M =
portfolio M .

cov(Ri ,RM )
2
M

(2.122)

measuring the risk between asset i and the market

The SML implies:


Risk/reward relation is linear.
Beta is the correct measure of risk, that is, beta measures how risk is rewarded
in the CAPM. Beta is a measure of non-diversifiable or systematic risk. There is
no measure of individual security risk entering the SML. Investors only care about
the beta with respect to the market portfolio. If asset i is uncorrelated with the
market, its beta is zero although the volatility of the asset may be arbitrarily large.
Therefore, any asset that appreciates when the market goes up and loses value
when the market goes down, is risky and has to earn more than the risk-free rate.
There is no reward, via a high expected rate of return, for taking on risk that can
be diversified away. A higher beta value does not imply a higher variance, but
rather a higher expected return.
= 1 implies E(Ri ) = E(RM ), = 0 implies E(Ri ) = Rf and < 0 implies
E(Ri ) < Rf .
The SML is an expression for the rate of return, opportunity cost of capital and
the risk-adjusted discount rate (see Examples below).
Given all the assumptions, all investors desire to hold the same risky assets. Suppose
that they all want to invest 1% in ABB stock. Then, ABB will also comprise 1% of the
market portfolio. Hence, all investors hold the market portfolio. The model contains the
price equilibrating process. Suppose that a stock X is not part of individually preferred
portfolio. Then its demand is zero and the price of X will fall until to a level where
X becomes more attractive and included in the investors portfolios. But this then also
adjusts the weights of all other assets. Hence, all assets have to be included in the market

2.9. FACTOR INVESTING

175

portfolio.
It follows from (2.122) that the portfolio beta is the weighted sum of asset betas multiplied by the portfolio weights due to the linearity of the SML. The beta of a (40/60)
portfolio with betas of 0.8 and 1 is then 0.92. Compared to the Markowitz model with N
assets, where one needs to estimate 2N + N (N 1)/2 parameters, the number is 3N + 2
for the CAPM.
Investors aversion to risk is different in recession and booming periods. They prefer
to limit their risk exposure in recessions and to increase it during booms. They require
a higher return for bearing risk in recession periods. But the CAPM is a one-period
model, in which such preferences cannot exist. A time-conditional CAPM allows for
such extended preferences, see below.
2.9.1.1

CML and SML

Inserting cov(Ri , RM ) = (i, M )k M in (2.122) implies


SRk :=

k Rf
M Rf
= (k, M )
k
M

(2.123)

The Sharpe ratio of asset k is equal to the slope of the CML times the correlation
coefficient. Comparing the SML with the CML, see Figure 2.32, it follows that in a
CAPM all portfolios lie on the SML but only efficient portfolios lie on the CML. A
portfolio lies on the SML and CML if the correlation between the portfolio return and
the market portfolio is 1. If the correlation is smaller than 1, the portfolio lies on the
SML but not on the CML. Finally, SML plots rewards vs systematic risk while CML
plots rewards vs total risk (systematic + unsystematic).
Example
Consider three risky assets A, B, and C and 3 investors with capital of 250, 300, and 500,
respectively, who have the following portfolios:

Investor
1
2
3
Market Cap. 1,050

Risk-less asset
50
-150
100
0

Table 2.19: CAPM.

A
50
150
75
275

B
50
200
75
325

C
100
100
250
450

176

CHAPTER 2. FUNDAMENTALS

Figure 2.32: Left panel - capital market line in the Markowitz model. Right panel security market line in the CAPM model.

Market capitalization is then 1, 050, the tangency portfolio follows from the
Markowitz model T = (0.2619, 0.3095, 0.4286) and the market portfolio is M =
(275/1050, 325/1050, 450/1050). It follows that the two portfolio are equal.

Example [Kwok (2010)]


Consider three risky assets, the market portfolio, and a risk-free asset given by the
following data:

Portfolio
1
2
3
Market portfolio
Risk-free asset

10%
20%
20%
20%
0%

with market portfolio


1
0.9
0.5
1
0

Table 2.20: Asset pricing in the CAPM.

0.5
0.9
0.5
1
0

13%
15.4%
13%
16%
10%

2.9. FACTOR INVESTING

177

The CML implies, at the standard deviation levels 10 percent and 20 percent, respectively, expected returns of 13 percent and 16 percent. Therefore portfolio 1 is efficient,
but the other two portfolios are not. Portfolio 1 is perfectly correlated with the market
portfolio but the other two portfolio have non-zero idiosyncratic risk. Since portfolio 2
has a correlation closer to one it lies closer to the CML. The expected rates of return
of the portfolios for the given values of beta, calculated with the SML, agree with the
expected returns in the table. To see this,
= 0 + (M 0 ) = 13%.
Therefore, there is no mis-pricing.

2.9.1.2

Systematic and Idiosyncratic Risk and Tracking Error

The following assumption or relations hold for the regression of asset k in the empirical
CAPM equation: E(k ) = cov(k , RM ) = 0 and
2
2
M
+ var(k )
k2 = k,M

which is a decomposition in systematic and idiosyncratic risk. The non-systematic risk


is not correlated with the market and can be reduced by diversification.

Examples - Systematic and idiosyncratic risk; betas and tracking error


Consider two stocks:
Stock 1: Chemical sector, market beta 1.5 and residual variance of 0.1.
Stock 2: Software sector, market beta 0.5 and residual variance of 0.18.
The total risk of the two assets is, for a market standard deviation of 20 percent,
2
2
12 = 1,M
M
+ var(1 ) = (1.5)2 (0.2)2 + 0.1 = 0.19
2
2
22 = 2,M
M
+ var(2 ) = (0.5)2 (0.2)2 + 0.18 = 0.19 .

The two stocks have the same total risk but very different systematic risk: the percentage
of systematic risk for the first stock is (1.5)2(0.2)2/0.19 = 47% but for the second stock
the figure is 5%.
Consider the return equation for a portfolio return R, the alpha, and the random
variables k :
R = + RM + k .

178

CHAPTER 2. FUNDAMENTALS

For the tracking error (TE), the volatility of the return difference R RM , we get
q
2 + 2.
TE = ( 1)M

The TE is minimal for = 1. Then the only risk difference between the market portfolio
risk and the investor portfolio is residual risk.

2.9.1.3

Performance Measurement

The Sharpe ratio is still the standard measure for risk-adjusted returns. A motivation for
the Sharpe ratio dates back to the safety-first principle of Roy (1957). Roy argued that
an investor first wants to make sure that a certain amount of the investment is preserved
before he or she thinks about the optimization of risk and return. That is, capital protection is the first motivation for an investor. The investor therefore searches for a strategy
such that the probability of the invested return being smaller than a level r0 is minimized.
But this probability cannot be larger than 2 /( r0 )2 independent of the chosen
probability (this follows from Tchebychevs inequality of probability theory). Therefore,
if we do not know the probability function of returns, the best thing to do is to minimize
/( r0 ), which is the same as maximizing the Sharpe ratio.
If portfolios are diversified, the Sharpe ratio is an appropriate risk-adjusted measure.
Which measure should one choose if the portfolio is less diversified? Jensens alpha, the
appraisal ratio, and the Treynor measurement are such measurements. These measurements ask how well would the asset have done relative to a portfolio of the market and
risk-free asset with the same systematic risk. They are based on the SML while the
Sharpe ratio is based on the CML.

Example - Performance Measurement


Jensens alpha
k := k Rf k (M Rf )

(2.124)

is a performance measurement between the realized and theoretical returns of the


CAPM. Since alpha is a return it should be used for the compensation of portfolio
managers. While the Sharpe ratio can be illustrated in the return-volatility space,
Jensens alpha is shown in the return-beta space. Jensens alpha measures how far
above the SML the assets performance is. The Jensen measurement does not consider
the systematic risk that an investment took on earning alpha.

2.9. FACTOR INVESTING

179

The Treynor measurement (TR) adjusts for this systematic risk taken :
TRk := (k Rf )/k .
The TR equals the slope of the SML for the actively managed portfolio. If the markets
are in equilibrium, the CAPM holds, then the Treynor ratio is the same for all securities.
The Jensen and Treynor measurements do not adjust for idiosyncratic risk in the
portfolio.
The appraisal ratio (AR) or information ratio (IR) divides the excess return over the
benchmark by the tracking error (TE).
Values of the IR around 0.5 are considered to be good values while a value greater
than 1 is extraordinary. The IR generalizes the Sharpe ratio since the it substitutes the
passive benchmarks for the risk-free rate. We calculate the different ratios for the data
in 2.23.

Portfolios
A
B
C
Market
Risk-free rate

Return
12%
16%
18%
15%
4%

Volatility
15%
24%
17%
20%
-

Correlation with market


0.9
0.94
0.98
-

Table 2.21: Data set for the performance ratios.

The beta of A is equal to its market portfolio correlation times its volatility divided
by the market volatility - that is, 0.9 15%/20% = 0.675. The Sharpe ratio for A is
(12% 4%)/15% = 0.53. Jensens alpha for portfolio A reads 12% 4% 0.675(15%
4%) = 0.575% and the Treynor ratio for A is given by (12% 4%)/0.675 = 0.119. The
IR and the TE follow in the same way. We finally get:

Portfolio
A
B
C
Market

Beta
0.675
1.128
0.833
1

TE
9.22%
8.58%
4.75%
0%

Sharpe
0.53
0.5
0.84
0.55

Jensen
0.58%
-0.41%
4.84%
0%

Table 2.22: CAPM.

Treynor
0.119
0.106
0.168
0.11

IR
0.062
-0.048
1.017
-

180

CHAPTER 2. FUNDAMENTALS

It follows that portfolio C is the best portfolio. We summarize the relevance of the
different performance measurements:
Beta is relevant if the individual risk contribution of a security to the portfolio risk
is considered.
TE is relevant for risk budgeting issues and risk control of the portfolio manager
relative to a benchmark.
The Sharpe ratio is relevant if return compensation relative to total portfolio risk
is considered.
Jensens alpha is the maximum amount one should pay an active manager.
Treynor measurement should be used when one adds an actively managed portfolio,
besides the many yet existing actively managed one, to a passive portfolio.
The information ratio measures the risk-adjusted return in active management.
It is frequently used by investors to set portfolio constraints or objectives for their
managers, such as tracking risk limits or attaining a minimum information ratio.
Grinold and Kahn (2000).
Warnings: If return distributions are not normal - they show fatter tails, higher peaks,
or skewness - use of these ratios can be problematic, since higher moments than the
second (variance) contribute to the risk. Furthermore, the IR depends on the chosen
time period and benchmark index. Finally, the chosen benchmark index affects all ratios
which use benchmarks: Managers benchmarked against the S&P 500 Index had lower IR
than managers benchmarked against the Russell 1000 Index [Goodwin (2009)].

2.9.1.4

Empirical Failure of the CAPM

There are many assumptions in the CAPM. Some of them are very strong. They are the
cause of the empirical failure of the CAPM.
The CAPM can, for example, not explain the size or value effect. The CAPM on
average explains only 80 percent of portfolio returns. One needs more factors than just
the covariance between the asset return and the return on the market portfolio. This led
to the factor models initiated by Fama and French - see below - which explain 90 percent
of portfolio returns.
The CAPM also attracts a lot of criticism from a behavioral finance point of view.
The assumption that the beliefs (probability distribution) of all investors match the
true distribution of returns is very strong. Behaviorists consider instead models where
investors expectations deviate from the true return distribution. This causes market

2.9. FACTOR INVESTING

181

prices to be informational inefficient.


Finally, the market portfolio is unobservable since it include all types of assets that
are held by anyone as an investment. Beside standard financial assets, illiquid ones such
as real estate or art matter. Using broad indices as proxis for the unobservable market
portfolio can lead to false inferences as to the validity of the CAPM [Roll (1977)].
The time series regression equation
Rt,k Rt,f = k + k,M (RM,t Rt,f ) + t .
is used to estimate the betas. The estimates of beta are often volatile both for stocks
and for sectors; see Figure 2.33. Then the individual returns are regressed on these betas

Figure 2.33: Beta estimates for AT&T (left panel) and the oil industry (right panel)
(Papanikolaou [2005]).

Rt,k = k + k,M
and one tests whether the regression residuals k are zero.
Key findings are that excess returns on high-beta stocks are low, that excess returns
are high for small stocks and that value stocks have high returns despite low betas while
momentum stocks have high returns and low betas.

182

CHAPTER 2. FUNDAMENTALS

Fama and French (1992) provide evidence that the CAPM does not account for
returns of size and book-to-market (B/M) sorted portfolios. The CAPM does not explain
why in the past firms with high B/M ratios outperformed firms with low B/M ratios
(value premium), or why stocks with high returns during the previous year continue to
outperform those with low past returns (momentum premium).

2.9.1.5

Conditional CAPM

Some researchers assumed that the poor empirical performance of the CAPM could be
due to its assumption of constant conditional moments. Hansen and Richard summarized
that the CAPM could hold conditionally at each point in time, but fail unconditionally.
Some authors therefore model explicitly the time varying conditional distribution of returns as a function of lagged state variables.
Lewellen and Nagel (2006) did not questioned the fact that betas vary considerably
over time. But they provide evidence that betas do not vary enough over time to explain
large unconditional pricing errors. As a result, the performance of the conditional CAPM
is similarly poor as the unconditional model: It is unlikely that the conditional CAPM
can explain asset-pricing characteristics like book-to-market and momentum. These statistical criticisms are not unique to the CAPM. Most asset pricing models are rejected
in tests with power.
Despite the aforementioned problems, the CAPM is used for figuring out the appropriate compensation for risk, is used as a benchmark model for other models, and is
elegantly simple and intuitive.
The conditional CAPM works as follows. Consider two stocks. Suppose that the times
of recessions and expansions are not of equal length in an economy, that the market
risk premia are different and that the two stocks have different betas in the different
periods. The CAPM then observes only the average beta for each stock for both periods.
Assume that this beta is 1 for both stocks. Therefore, the CAPM will predict the same
excess return for the two stocks. But in reality the two stocks will show due to their
heterogeneity different returns for the two different economic periods. One stock can for
example earn higher return than explained by the CAPM since its risk exposure increases
in recessions, when bearing risk is painful, and decreases in expansions. Therefore such a
stock is riskier than the CAPM suggests and the CAPM would detect an abnormal high
return suggesting this is a good investment. The conditional CAPM corrects this since
return comes from bearing the extra risk of undesirable beta changes. See the exercises
for numerical examples.

2.9. FACTOR INVESTING

2.9.2
2.9.2.1

183

Fama-French 3- and 5-Factor Models


3-Factor Model

The three-factor model (2.74) is an empirical asset pricing model with the factors market
portfolio, SMB, and HML. The model is successful on the characteristics associated with
size and various price ratios, but it fails to absorb other characteristics such as short-term
momentum returns. The three factor model is routinely included among the alternatives
in empirical research.

Example - Fama-French (FF) Factor Construction


The construction of the FF factors reads, in more detail (taken from Kenneth Frenchs
web site), as follows. The factors are constructed using the six value-weight portfolios
formed on size and book-to-market.
SMB (small minus big) is the average return on the three small portfolios minus
the average return on the three big portfolios
SMB =

1
(Small Value + Small Neutral + Small Growth)
3
1
(Big Value + Big Neutral + Big Growth) .
3

(2.125)

HML (high minus low) is the average return on the two value portfolios minus the
average return on the two growth portfolios
HML =

1
1
(Small Value + Big Value) (Small Growth + Big Growth) .
2
2

Whether a stock belongs to, say, Small Value depends on its ranking. Small Value
contains all stocks where the market value of the stock is smaller than the median
market value, say, of the NYSE and where the book-to-market ratio is smaller than
the 30 percent percentile book-to-market ratio of NYSE stocks.
SMB for July of year t to June of t + 1 includes all NYSE, AMEX, and NASDAQ
stocks for which there exist market equity data for December of t 1 and June of
t, and (positive) book equity data for t 1.

Example - Why including factors which cannot explain average returns?


We follow Cochrane (2010). Individual stocks have higher volatility than portfolios.
This makes is difficult to accurately measure the expected return of the stocks and also

184

CHAPTER 2. FUNDAMENTALS

the measurement of the betas is difficult and time varying. One therefore considers
portfolios with certain characteristics that academics can test. This is in line with many
investors which also group their portfolio with certain characteristics which they think
will outperform. The CAPM for example worked until stocks were grouped by their
book-to-market ratio (value) but it still works when stocks are grouped according to
their size. But why FF include size given that size portfolios are perfectly explained
by the market beta? If FF were only to consider factors which explain the average
returns then they could left them out. But size is important as a model of return
variance reduction.
To see this work, assume that the CAPM is perfect. Then the price of stock k reads
E(Rk ) = k E(RM )
where we assume for simplicity that the risk free rate is zero. To run the CAPM we
include an additional industry portfolio in the regression, i.e.
Rt,k = k + k,M Rt,M + k,I Rt,I + t,k .
The regression will generically lead to a coefficient k,I 6= 0. Taking expectations, we get
E(Rt,k ) = k + k,M E(Rt,M ) + k,I E(Rt,I ) .
Hence the industry portfolio has a positive mean which puts us into troubles since we
assumed that the CAPM is a perfect model. To resolve the puzzle, one uses geometry.
This means, one includes a orthogonalized portfolio or beta-hedged industry portfolio.
We first run a regression of the industry portfolio on the market portfolio:
Rt,I = I + I,M Rt,M + t,I .
If the CAPM is right, then the industry alpha is zero and we get
E(Rt,I ) = I,M E(Rt,M ) .
Orthogonalizing means to substract from a return its mean:

Rt,I
:= Rt,I E(Rt,I ) = Rt,I I,M Rt,M .

This is equivalent to beta-hedge the portfolio. The expected value of the new return is
then zero if the CAPM is right. Then, one runs again a regression of the orthogonalized
industry portfolio on the market return. This improves the R2 and the t-statistics and
the volatility of the residuals decreases while the mean of the CAPM is unchanged.
Considering different portfolios, the R2 statistics increased for different portfolios
from 78 percent using the CAPM to 93 percent in the FF portfolios. Roncalli (2013)

2.9. FACTOR INVESTING

185

states that the improvement in the R2 is not uniform:


The difference in R2 between the FF and the CAPM is between 18 percent and 23
percent in the period 1995-1999.
This difference is around 30 percent during 2000 and 2004.
The difference then decreases and is around 11 percent during the GFC.
In the period starting after the GFC and running until 2013 the difference is 7
percent.
Are the FF factors global or country specific? Griffin (2002) concludes that the FF model
exhibits its best performance on a country-specific basis. This view is largely accepted
in the industry and by academics. FF performed originally regressions on portfolios of
stocks. Huij and Verbeek (2009) and Cazalet and Roncalli (2014) provide evidence that
mutual fund returns are more reliable for example than stock returns since academic
factors on stock portfolios do not consider frictions (transaction costs, trade impact, and
trading restrictions).
The interpretation of the FF model conflicts the view of analysts. If returns increase
with B/P, then stocks with a high B/P ratio must be more risky than average. This is the
opposite story a business analyst would tell. The difference is due to the efficient market
hypothesis (EMH) (see Chapter 3). The analyst does not believe in it. Therefore, for him
a high B/P ratio indicates a buying opportunity since the stock is cheap. If an investor
believes in the EMH then cheap stocks are cheap for the single reason that they are risky.
There is sometimes confusion between the cross-section and the regression of time
series. Consider the FF cross section with a high R2 , i.e. a low alpha: FF explains well
the cross-section of average returns. But the R2 of the time series can be low: FF then
fails to explain the cross-section of ex-post returns. The opposite case with a high R2 for
the time-series and high alpha is also possible. The main objective of the FF regression
is to see whether alpha is low and not to explain stock returns well. Put it different, the
goal is to see if average returns are high where betas are high, not whether the time-series
regressions do well.
Figure 2.34 illustrates the different FF factors performance since 1991. The size
factor only generates low returns compared to the other factors. This is the reason why
most risk premia providers do not offer the size risk premia. The momentum factor on
a stand-alone basis outperformed the market. But the chart also shows that the momentum factor can lead to heavy losses - the momentum crash. The right panel shows
the distribution of the monthly returns of the momentum risk factor since 1927. Heavy
monthly losses occurred during the Great Depression. There, the risk factor faced losses
of up to 50 percent in one month. The risk factor performed much better in the post
WWII period until the burst of the dot-com bubble. In this period, investing USD 100,

186

CHAPTER 2. FUNDAMENTALS

say, in 1945 led to a payback of USD 3,500 around 50 years later. The average monthly
return over the whole period is 0.67 percent.
The cyclicality of this risk factor is common to all risk factors. Factor indexes show
persistent excess risk-adjusted returns over long time periods but over shorter horizons
they show cyclical behavior where they can underperform in some periods. Authors like
Ang (2013) argue that the premia exist to reward long-horizon investors for bearing that
risk.
250

18.00%

Monthly returns of momentum risk factor,


1927 2014

200
16.00%

14.00%

150
12.00%

100

10.00%

8.00%

50
6.00%

4.00%

2.00%

1991
1992
1993
1994
1995
1996
1997
1998
1999
2000
2001
2002
2003
2004
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014

-50

Mkt-RF

SMB

HML

WML

0.00%

RF

Figure 2.34: Left panel - FF annual factor performance in the period 19912014
starting each year in January and ending in December. Mkt is the market return, RF
the risk-free return, and WML the momentum factor. Right panel - monthly returns of
the momentum risk factor (Kenneth Frenchs web site).
Given the cyclicality of the risk factors, investors ask How long will the factors excess
return persist? First, the driving force must present itself on the risk level as a systematic
risk source otherwise there would be no persistent risk premium. This description of the
term systematic risk as non-diversifiable risk is at the center of the rational economic
explanation of the existence of risk factors. A second, different, approach comes from
the systematic error view of behavioral economics. Investors exhibit behavioral biases
due to cognitive or emotional limitations, leading to strategies in which winners are for
example chased. As a result, for each factor typically there is more than one systematic
risk and error-based theory that explains why the factor should be persistent.

2.9. FACTOR INVESTING

187

Example Facts and pitfalls in FF model


The definition of growth stocks in FF is different from the usual definition of growth
stocks in the financial industry. In the latter one, that such stocks have fast-growing
earnings for example. In the FF model, a growth stock has a high market/book ratio,
this means they are overpriced.
Given the factors size and B/M ratio in the FF model one could assume that one
can explain the average return of a firm by considering the value of the two factors. But
the FF model as the CAPM state that high average returns follow by covariation with
say the B/M portfolio and not due to high B/M values of the firm.
Momentum is a big problem for the FF model. First, the returns of value and
momentum are highly correlated. But the correlation goes the wrong way. FF thought
at some time that it is data snooping and that it will go away. But so far, it did not.

2.9.2.2

5-Factor Model

Fama and French (2015) proposed a five-factor model extension of their three-factor
model. The motivation of the model follows from the following firm valuation equation:

Mt
=
Bt

j=1

Et

1
(Yt+j
(1+R)j

Bt+j )


(2.126)

Bt

with M the current market cap, Y total equity earning , B the change in total book
value in the period, and R the internal rate of return of the expected dividends. Equation
(2.126) follows from the fundamental pricing equation; see Equation (3.15) in Chapter 3.
Equation (2.126) implies that B/M value is an imperfect proxy for expected returns: The
market cap M also responds to forecasts of earnings and investment (expected growth
in book value) which define the two new factors. The regression (2.74) reads (neglecting
time indices)
Ri Rf = i,M (RM Rf ) +

i,k Rk + i + i ,

(2.127)

k{SMB, HML, RMW, CMA}

with RRM W the earnings risk factor (difference between robust and weak profitability)
and RCM A the innovation risk factor (difference between low- and high-investment firms).
The explicit construction of the risk factors is a long/short combination similar to (37);
see Fama and French (2015).
Fama and French (2015) first analyze the factor pattern in average returns following
the construction of the three-factor model:

188

CHAPTER 2. FUNDAMENTALS
One-month excess return on the one-month US treasury bill rate follow.
The returns follow for 25 value-weighted portfolios of US stocks from independent
sorts of stock (five size and five B/M groups ranking into quintiles). The authors
label the quintiles from Small to Big (Size) and Low to High (B/M).
Data are from 1963 to 2013.

Figure 2.35: Return estimates for the 5x5 size and B/M sorts. Size is shown on the
vertical and B/M on the horizontal. OP are the earnings factor portfolios and Inv the
investment factor portfolios. Returns are calculated on a monthly basis in excess to the
one-month US treasury bill rate returns. Data start in July 1963 and end in December
2013, thus covering 606 months (Fama and French, 2015).
Panel A in Figure 2.35 shows that average returns typically fall from small to big
stocks - the size effect. There is only one outlier - the low portfolio. In every row, the
average return increases with B/M - the value effect. It also follows that the value effect is stronger among small stocks. In Panel B, the sort B/M is replaced by operating
profitability due to the definition found in Fama and Frenchs 2015 paper. Patterns are
similar to the size-B/M sort in panel A. For every size quintile, extremely high rather
than extremely low operating profitability (OP) is associated with a higher average return. In panel C the average return on the portfolio in the lowest investment quintile is
dominates the return in the highest quintile. Furthermore, the size effect exists in the
lowest four quintiles of the investment factor.

2.9. FACTOR INVESTING

189

The authors perform an analysis to isolate the effect of the factors on average return.
The main results are:
Persistent average return patterns exist for the factors HML, CMA, RMW, SMB.
As expected, statistical tests reject a five-factor model constructued to capture
these patterns.
The model explains between 71 percent and 94 percent of the cross-section variance
of expected returns for HML, CMA, RMW, SMB.
HML (value) becomes a redundant factor. Its high average return can be completely
generated by the other four factors, in particular to RMW and CMA.
Small stock portfolios with negative exposure to RMW and CMA are problematic:
Negative CMA exposures are in line with evidence that small firms invest a lot.
Negative exposures to RMW, contrary, is not in line with a low profitability.
Why Fama and French did not introduce the factor momentum, despite the overwhelming evidence that it contributes to explaining returns and is itself not captured by
the other five factors? Asness et al. (2015) state that momentum and value are best
viewed together, as a system, and not stand-alone. Therefore, it is not a surprise to
the authors that value becomes redundant in the five-factor model where momentum
is not considered. They then redo the estimation of Fama and French where they also
find that - without momentum - HML can be reconstructed and is better explained by
a combination of RMW and CMA. But the reverse is not true, that is, CMA cannot
be explained for example by HML and RMW. The authors then add momentum which
is negatively correlated to value. Transforming finally how value is constructed in the
Fama and French paper, value becomes statistical significant in explaining returns.

2.9.3

Factor Investment - Industry Approach

We consider the practice of factor offering by large asset managers. The data in this
section are all from Deutsche Bank (DB) or JP Morgan (JPM). The process of building
a risk factor portfolio is as follows (Deutsche Bank [2015]):
Identify the key objectives of the portfolio and the preferences of the investor.
Start with a long list of potential available risk factors.
Select a core portfolio made up of the most attractive carry, value, momentum,
and volatility strategies, diversifying selections across asset classes and taking into
account the key objectives and criteria of the investor.
Add any idiosyncratic factors unique to a single asset class if they are attractive
on a stand-alone basis and offer a benefit to the portfolio.

190

CHAPTER 2. FUNDAMENTALS
Finalize the short list of selected risk factors and construct a portfolio using a
simple risk-parity methodology.
The portfolio is then reviewed and tested against general measures of diversification.

Figure 2.36, upper panel, shows the cross asset risk factor list of DB.
EQ
Equities

Category
Carry

EQ Dividends
EQ Merger Arb

Value

EQ Value

Volatility

EQ Glob Vol Carry

IR
Interest Rates

CR
Credit

FX
Currencies

IR Carry Diversified
IR Muni/Libor

CR Carry HY vs. IG

IR Vol

EQ Mean Reversion
Momentum

EQ Moment.

Idiosyncratic

EQ Low Beta
EQ Quality

IR Moment.

CR Moment.

CO
Commodities

FX Global Carry

CO Carry (Curve)

FX Value

CO Value

FX Vol Basket

CO Vol Divers.

FX Vol Single

CO Vol Single

FX Moment.

CO Trend
CO Momentum

16.0%

1.8

14.0%

1.6

1.2

10.0%

8.0%

0.8

6.0%

0.6

4.0%

Sharpe Ratio

1.4

12.0%

0.4

COM Vol

COM Carry (Curve)

EQ Carry (Div)

COM Mome (Trend)

IR Mome

COM Carry (Box)

CR Carry

EQ Idios. (Low Beta)

IR Vol

EQ Value

CR Mome

EQ Idios.(Quality)

COM Value (Backw.)

FX Carry (Balanced)

EQ Vol

EQ Carry (Merg Arb)

IR Carry (Divers.)

IR Carry (Muni/Libor)

FX Value

COM Mome

FX Carry (Global)

FX Mome

0.2

0.0%

FX Carry (G10)

2.0%

EQ Mome

Return & Volatilitt

Annual Returns (dark), Volatilities (light) and Sharpe Ratios (diamonds)


For DB-Factors since Start Date

Figure 2.36: Upper Panel: Risk factor list of DB London. Risk factors are grouped
according to their asset class base and the five styles used by practitioners. Lower Panel:
Average annualized volatilities, returns and Sharpe ratios for the risk factors (DB [2015]).
The lower panel shows that the risk and return properties of the different risk factor
differ. Therefore, if one invests into a portfolio with a target volatility to control downside risk leverage is needed simply because else combining a low vol 2% interest rate risk
premia with a 12% vol equity premia makes no sense.
Given this list of factors, Figure 2.37 shows monthly correlations. The lower triangular
matrix shows correlations calculated for turbulent markets; those for normal markets are
shown in the upper triangular matrix. The following periods define turbulent markets:
May 97 to Feb 98 Asian financial crisis
Jul. 98 to Sept. 98 Russian default and collapse of LTCM

2.9. FACTOR INVESTING

191

Mar. 00 to Mar. 01 Dot-com bubble bursts


Sept. 01 to Feb. 03 9/11 and market downturn of 2002
Sept. 08 to Mar. 09 US subprime crisis and collapse of Lehman Bros.
May 10 to Sept. 10 European sovereign debt crisis

Correlations
ARP

ARP

Equities

Bonds

Rohstoffe

Hedge
Funds

Real Estate

Private
Equity

5%(*) /
4%(*)

10%

4%

7%

16%

8%

9%

6%

39%

47%

64%

77%

18%

5%

7%

-20%

30%

24%

28%

29%

40%

Equities

4%

Bonds

6%

4%

Rohstoffe

6%

43%

19%

Hedge Funds

13%

41%

11%

32%

Real Estate

4%

66%

9%

34%

28%

Private Equity

4%

78%

-21%

36%

35%

52%
52%

Figure 2.37: The correlations in the top-left cell is the average equally-weighted correlation of a portfolio of all DB risk premia. In the lower triangular matrix the correlations
are calculated for turbulent markets; those for normal markets appear in the upper triangular matrix (DB [2015]).
The correlation for the equally weighted portfolio of risk factors implies an annualized
correlation of 4% in normal markets and 5% in stressed ones. The correlation is also low
to the traditional asset classes while the annualized correlations between the different
asset classes is much larger.
Of particular importance are so-called low beta portfolios - that is to say, a portfolio
of risk factors should have low correlation to equities and bonds in all normal periods and
negative correlation to equity in turbulent markets. Suitable risk factors chosen from the
long list are the value factors for all asset classes, momentum risk factors for all asset

192

CHAPTER 2. FUNDAMENTALS

classes, low beta risk factor, quality, and US muni curves vs Libor. The correlation of
this portfolio to equity is 1.6 percent and to bonds 7.6 percent. In turbulent markets,
correlation to equity is -37.5 percent and to bonds 8.8 percent. The Sharpe ratio is very
high and the maximum drawdown is low, at 5.6 percent. Table 2.37 shows the summary
statistics.
Statistics
% positive 12m returns
IRR
Volatility
Sharpe ratio=IRR/volatility
Maximum drawdown
IRR/MDD
Days to recover from MDD
Correlation to equity
Correlation to bonds
Stress correlation to equitie
Stress correlation to bonds

Low beta portfolio


99.5%
10.7%
5.0%
2.16
-5.6%
1.93
120
-1.6%
7.6%
-37.5%
8.8%

Table 2.23: Summary statistics for the low beta portfolio (DB [2015]).
A deeper analysis of the correlation structure of the different risk factors reveals
that they can be clustered into three broad groups (see Figure 2.37). Data suggest that
negatively correlated risk factors to equity risk become even more negatively correlated
in turbulent markets and the same holds for positive correlated factors. We therefore
group the factors into three clusters and allow for timing.
DB (2015) states:
High beta, higher information ratio factors. These factors exhibit high information
ratios but also contain some equity market risk. Typically factors explained by riskbased effects, they are usually high-conviction strategies with strong evidence for
persistence. Examples include FX carry and rates implied versus realized volatility.
Low beta, stable correlation factors. Factors with moderate correlation levels which
are typically stable. Information ratios may be high (e.g., equity value) or low (e.g.,
FX value). Includes typically carry strategies and value strategies.
Negative beta, lower information ratio factors. Factors that exhibit negative correlations to equity markets that may be stable or may become more negative in periods
of stress. Typically idiosyncratic factors such as equity quality or rates Eonia vs
6m basis.
The portfolio construction is as follows. The timed portfolio is always invested long 50
percent in the neutral beta portfolio. This stabilizes the risk and return characteristics

2.9. FACTOR INVESTING

193

Figure 2.38: Factor clusters. The left red box is the negative beta cluster, the neutral
cluster is in the middle box, and the positive beta cluster is on the right. Negative beta
risk factors are equity quality, rate momentum, FX momentum, credit momentum, and
commodity momentum. Neutral beta risk factors are equity value, rates muni/Libor, FX
value, and credit carry. Associated to the positive cluster are equity low beta, FX carry
rates, equity and commodity volatility, commodity carry, and equity dividends (DB and
ZKB [2015]).

of the timed portfolio. The other 50 percent are shifted according to the signal in the
negative or positive beta portfolio. The signal is rule based. The rule compares the 20d
average return with 60d average returns of MSCI World. If the 20d average exceeds the
60d one, then thwe positive beta cluster is activated; otherwise the negative beta cluster
is activated. The rebalancing of the individual factors in the clusters is carried out every
three months. The portfolios face a target volatility of 8 percent and leverage is 5 for each
factor and individual factors are capped at 20 percent. Finally, we use inverse volatility
weights for all risk factors allocaiton. Figure 2.39 summarizes the results.
In the upper-left panel the statistics for the three clusters are shown. Average correlation in the case of negative beta is negative to MSCI World and strongly positive in
the positive beta cluster. This is a promise that in falling stock markets there will be
diversification, while in rising markets the positive beta cluster participates. The timed
portfolio has an even more negative correlation with MSCI World. The lower panel shows
that the timed portfolio is able to resist heavy market turbulence and that it is also able
to participate in bullish markets.

194

CHAPTER 2. FUNDAMENTALS

Figure 2.39: Upper-left panel - statistics for the three factor cluster portfolios. Negative
beta risk factors are equity quality, rate momentum, FX momentum, credit momentum,
and commodity momentum. Neutral beta risk factors are equity value, rates Muni/Libor,
FX value, and credit carry. Associated to the positive cluster are equity low beta, FX
carry rates, equity and commodity volatility, commodity carry, and equity dividends.
The statistics shows that the negative cluster is negatively correlated to MSCI World and
that the positive beta cluster is strongly correlated to MSCI World. This is in line with
the observation that negative and positive correlations become more negative or positive
when markets are in turbulent times or booming. Upper-right panel: Statistics for the
timed cluster portfolio. The Sharpe ratio is remarkable and the average stress correlation
with MSCI is strong. Lower panel - The shaded regions show when rule-based shifts from
positive to negative cluster correlation are detected. In the dark region, negative cluster
correlation is activated. The results show that the detection works well for the GFC in
2008 and the European debt crisis in 2011. As a result the timed cluster portfolio graph
does not suffer from any negative performance in this period. The simple mechanism
seems to switch fast enough between negative and positive cluster signals. That is, the
timed portfolio also switches fast to the positive cluster after the GFC and the European
debt crisis when stock markets recover. Therefore, the timed portfolio not only provides
protection when markets are turbulent but also allows for participation in booming stock
markets. (DB and ZKB [2015]).

We conclude this section by comparing a low volatility portfolio of risk premia of JP


Morgan - the 7.5% target volatilty index with BB ticker XRJPBEBE - with the MSCI
world, see Figure 2.40.

2.9. FACTOR INVESTING

JPM
MSCI

Great Financial Crisis


15.86%
-42.86%

XRJPBE5E - 7.5% Volatility Target


Jan
Feb
2006
2007
2.04%
-1.62%
2008
-1.46%
8.20%
2008 MSCI
2009
0.15%
2.61%
2009 MSCI
-8.63%
-10.02%
2010
1.54%
2.28%
2011
-0.14%
0.74%
2011 MSCI
2012
2.93%
1.65%
2013
2.44%
4.78%
2014
-1.85%
1.55%
2015
0.61%
0.72%
2016
3.17%
1.36%
2016 MSCI
-6.09%
-0.90%

195

European Debt Crisis


JPM
13.70%
MSCI
-11.90%

Mar

2.13%
1.35%
-0.67%
1.41%
-0.90%
0.09%
2.86%
0.87%
1.24%
-1.37%
7.16%

Stress Q1 2016
JPM
3.16%
MSCI
0.02%

Apr
-0.48%
3.14%
-2.49%
5.31%
-0.34%

May
-2.41%
2.63%
1.79%
1.16%
2.27%

Jun
-0.27%
-0.34%
2.19%
-8.34%
1.18%

Jul
1.30%
-0.27%
1.22%
-2.72%
1.59%

Aug
1.63%
1.15%
0.21%
-2.35%
0.00%

Sep
0.57%
1.59%
-1.10%
-12.68%
2.65%

Oct
2.67%
2.09%
3.85%
-19.91%
3.10%

Nov
-0.06%
0.87%
3.93%
-6.80%
1.61%

Dec
2.79%
3.40%
6.26%
3.47%
1.77%

Year
5.78%
18.01%
25.15%

1.27%
4.60%
3.86%
1.79%
1.44%
0.37%
-0.53%

-2.93%
1.92%
-2.52%
3.49%
-2.33%
0.99%
2.04%

2.45%
-0.53%
-1.75%
2.38%
-2.78%
1.82%
-2.01%

0.99%
1.83%
-1.73%
3.98%
1.17%
0.05%
3.04%

2.10%
1.05%
-7.53%
-0.72%
-2.00%
1.36%
-4.43%

1.47%
3.91%
-9.65%
2.23%
1.38%
0.90%
5.19%

3.16%
-0.90%
10.62%
-0.07%
1.43%
-1.79%
-0.69%

-1.19%
1.82%
-3.21%
1.99%
2.99%
3.32%
1.82%

1.87%
2.51%

15.23%
16.93%

-0.84%
0.03%
-0.55%
-1.57%

20.45%
11.70%
7.12%
5.20%
3.14%

16.20%

Figure 2.40: Top panel. This shows the return of the two indices starting in Jun 2016
for the next ten years. The middle statistics shows the cumulative returns of the two
indices for three stress events. The bottom panel shows the monthly returns of the JPM
index and for the three stress events - GFC, EU debt crisis, Q1 2016 - the returns of the
MSCI are also shown. (JPM [2016]).

The top panel shows that investing world wide diversified did not provided any positive return in the ten year investment period if the concept of asset diversification is only
used. The JPM index, contrary showed an impressive performance - which one should
consider in more details too. First, we can see it by eye that the risk premia performance
slope is not the same in the ten years period: After the GFC 2008 until the end of 2012,
the returns were largest with very low risk. Then for about one and a half years there was
a stand still period which was followed by a positive return period with larger risks - the
return chart is more zigzagged than in previous years. If we compare the performance of
the JP Morgan Index with MSCI in three stress periods - GFC, EU debt crisis and Q1
2016 - we observe that the risk premia index did well compared to the MSCI in the GFC
and the EU debt crisis: The construction mechanics to be uncorrelated to traditional
asset classes in general and negatively correlated in market stress situations works in
these periods. In the Q1 2016 event, things are more complicated. While the same can
be said for Jan and Feb 2016, the March data show that the risk premia index largely
underperformed the MSCI. The interesting point is not that this happened in a month it would be strange if such a pattern never occurs - but to understand the reasons. From
an asset class perspective, there was a sharp and fast rebound of stock markets after the
ECBs president Draghis speech. This rebound was to fast for the risk premia index in

196

CHAPTER 2. FUNDAMENTALS

the sense that there was no rebalancing (quarterly basis) which could adjust the weights
in the risk premia. Second, the speech of Draghi also affected credit risk premia in a
way which is the exception rather than the rule: The credit spread tightening was more
pronounced for the Itraxx Europe Main index than for the Crossover index of the same
family. This means that risk factors collecting the credit risk premia generated negative
returns since they were wrong the long and the short risk premia portfolios in this period.
Finally, a similar remark applies to interest rate risk premia which lead to the negative
risk premia return in March.

2.9.3.1

Quality Factor Construction

We describe in some detail how the ARP Volatilty Carry for US equity can be constructed.
This index tracks the performance of rule-based volatility-selling trading strategy. The
characteristic of this strategy is the difference between implied and realized volatility in
the S&P500 index which is the underlying liquid asset.
Rationale and Strategy
Studies show that the future realized volatility for equity indices is consistently overestimated the market. One reason is that the use of derivatives is very popular for
equity hedging - there is a significant excess demand of equity hedgers. The strategy
EQ volatility aims to exploit the typically (but not always) positive difference between
implied and realized volatility on equity index by applying a derivative strategy. The
standard method is to sell on a daily basis call and put options (straddle strategy on the
index) and by buying the delta hedge of the strategy at the same time.
Implementation
The volatility selling strategy means that rules based investments are made in CBOtraded call and put options. The index implements the volatility selling strategy by
notional investing daily in different options on the S&P500. Typically, pairs of calls and
puts with a few month maturity are sold where the option strike is at the money (liquidity). On the opposite side, the index buys the delta hedge of this option exposure
which should hedge the directional exposure of the option portfolio. This means that
the index takes daily a long or short exposure to the underlying index. The index receives a notional premium for each notional investment in the relevant options and the
performance of the index depends on the difference between the premiums received from
the options and the payout on the options at expiry of the option contracts, and the
cumulative profits or losses derived from the notional daily delta hedging strategy. The
index is denominated in USD.
We consider some details of the index calculation. The index level It is calculated in
USD with the starting level I0 = 100 at the commencement date. The index level It at

2.9. FACTOR INVESTING

197

any future date t is then:


It = I0 + Casht,P&L MtMt + t

(2.128)

where the second term is the Cash P&L, the third one the portfolio mark-to-market and
the last one the delta hedge. Before we enter into the details of these terms, two issues
are first addressed. First, the index notionally invests in options which means that a
number of units U of each option are sold daily. This unit exposure U is defined as the
ratio between the index level and the option vega times and adjustment factor. The vega
means the vega implied by the Black-Scholes formula. The adjustment factor considers a
day-count number and a scaling factor. The second input are the use of time-weighted average price (TWAP) observations. The TWAP is needed to establish relevant prices and
inputs to calculate the index level on any calculation date. That for, the relevant price or
level will be recorded at the end of every 15 second interval. The TWAP process returns a
price or level which is the arithmetic average of the recorded prices or levels. The TWAP
will be applied to the call and put options, the forward price F I to an exercise data will
be calculated by in accordance with the put-call parity formula and to the cash positions.
The Casht,P&L is equal to the total accumalted premia P from the options minus
the accrued settlement values S, both multiplied by the unit exposure, plus the accrued
interest AI:
X
X
Casht,P&L =
Uj Pj
Uj Sj + AIt
(2.129)
jT O

jEO

with T O the set of all traded options and EO the set of all expired options. The premium
is defined by
Pt = TWAPOt U C

(2.130)

wiht O representing either a call or a put option, a function which is +1 (1) if U > 0
(U < 0) and zero else and C the option premium cost spread. This spread is proportional
to the option vega times a option volatility value which is floored.
The portfolio market value MtMt is equal to the sum of all option units U times
the option close prices. The delta hedge t means notionally entering into a long or
short position in a Total Return Swap (TRS) on the total return underlying index. The
following dynamics hold:
t = t1 + t,M tM t,CC t,EC .

(2.131)

The last three terms are zero at the index live date but non-zero at all index calculation
dates. The term with CC are the delta hedge costs, EC the cost at expiry and the M tM
term reflects the evolution the accumulated gross return difference of the delta positions
valued with the TRS and with an interest rate index, respectively.

198

CHAPTER 2. FUNDAMENTALS

2.10

Views and Portfolio Construction - The Black-Litterman


Model

In the mean-variance model, the CAPM, and the risk-based portfolio constructions, the
views of the investors did not matter. But most investors have some views about specific
assets and they wish to apply these views to their asset management. For example high
past returns may not be the same in the future and the asset manager would like to
correct this by implementing a prior view in the model. By doing this, the investor
hopes that the model becomes more robust - weights become more stable - and that it
generates additional returns. Investors do not want their view in an ad hoc way. Most
investors would like to incorporate their views consistently into an investment model.
The logic is as follow:
Start with a model output - this is the prior.
Add views.
Update the prior to a posterior using Bayes rule.
There are many different approaches to, and a myriad of academic papers on, how
views can be used in portfolio construction. We consider the Black-Litterman (BL) model
(BL [1990]) to be the first, and still the most popular, model used by practitioners. A
second model will generalize risk-based portfolio construction to allow for investment
views, and - finally - the so-called entropy pooling approach, which is more general than
the BL approach, will be discussed.
For further reading, in addition to BL, we cite Walters (2014), Satchell and Scowcroft
(2000), Brand (2010), Meucci (2010), Idzorek (2006), Herold (2003), and He and Litterman (1999).
We start with the so-called mixed models since all following examples fall into this
class.

2.10.1

Mixed Models Logic

We explain first the logic of a general mixed investment model. We start with the IID
normally distributed excess return vector R with mean and covariance C, that is
R M V N (, C), with MVN denoting the multi-dimensional normal distribution and
the distribution symbol. The investor considers a benchmark belief about the risk
premia - that is, is itself an MVN (the prior distribution):
M V N (, C ) .
This starting belief about the risk premia can follow from a model such as the CAPM,
any empirical analysis, or dated forecasts.

2.10. VIEWS AND PORTFOLIO CONSTRUCTION - THE BLACK-LITTERMAN MODEL199


The investors view or forecasts are modeled as follow. His has set of views about
a subset of K N linear combinations of returns P , where P is a K N matrix
selecting and combining returns into portfolios for which the investor is able to express
views. These new views are unbiased random variables described by the conditional
distribution
| M V N (P , )
with the forecast error covariance. Summarizing, the model inputs are , P , and .
Since the views possibly contradict the prior, the prior does not satisfy the views.
Therefore, a search for a new, suitable distribution Dist(|) - the posterior - is required
(Dist meaning distribution). Using Bayes theorem,
Dist(|) Dist(|)Dist() = M V N (E(|), var(|) =: M V N (, C) ,
where the posterior moments can be explicitly derived. The posterior mean is equals
to a weighted average and . This mixed estimation setup allows to input forecasts of
linear combinations of risk premia.

2.10.2

Black-Litterman Model

The two significant contributions of BL model to the asset allocation problem are:
The equilibrium market portfolio serves as a starting prior for the estimation of
asset returns.
It provides a clear way of specifying an investors views on returns and of blending
these views with prior information. There is a large degree of flexibility in forming
the views. The investor is not forced to have a view for all assets and the views
can span arbitrary combinations of assets.
The first step in the BL model is the definition of the reference model. This model
defines which variables are/are not random and defines parameters which are modeled/
not modeled. The returns of asset R are normally distributed with unknown mean
and covariance C, where M V N (, C ). The covariance of the returns CR about the
estimate is then given by
CR = C + C .
(2.132)
Summarizing, the reference BL model is given by the returns R M V N (, CR ). The mean
represents the best guess for , and the covariance C measures the uncertainty of the
guess.
How do we fix , the prior estimate of returns - that is to say, the returns before we
consider views?
BL uses a general equilibrium approach. Why? If a portfolio is at equilibrium of
supply and demand in the markets, then each sub-portfolio must be at equilibrium too.

200

CHAPTER 2. FUNDAMENTALS

Therefore, an equilibrium approach for the return estimate is independent of the size
of the portfolio under consideration. Although there is no restriction as to which of
the many equilibrium models should be used, Black-Litterman and many other use the
CAPM or any factor model generalization in the following reverse engineering way.
Using the CAPM means that all investors have a mean-variance utility function.
Without any investment constraints, the optimal strategy maximizes the expected
utility given in (2.55)

E(u) = 0 0 C ,
2
where we have replaced the expected returns by the unknown expected return estimate .
The solution gives us the optimal strategy as a function of the return and covariance: =
1 1
.
C
Given the equilibrium strategy in the CAPM - the reverse engineering part - we
immediately get the excess return estimate
= C .

(2.133)

How do we fix the risk aversion parameter? Multiplying (2.133) with the market portfolio
0 implies that
2
RM Rf = M
(2.134)
with RM the total return of the market portfolio. In other words, the risk aversion
parameter is equal to the market price of risk.
Using (2.134) in (2.133), the CAPM specifies in equilibrium the prior estimate of
returns .
How do we estimate the variance of the mean - that is, how do we fix C ? BL
assume the proportionality
CR = C
(2.135)
with the constant of proportionality factor. The uncertainty level can be chosen
proportional to the inverse investment period 1/T . The longer the investment horizon
is, the less uncertainty exists about the market mean; the higher the value of , the less
weight is attached to the CAPM. Summarizing, the prior return distribution is a normally distributed random variable with the mean given in (2.133 and variance (1 + )C.
This concludes the first step in the BL model.
We consider next the insertion of views, for which we follow Walters (2014).
A view is a statement on the market. Views can exist in an absolute or relative form.
A portfolio manager can, for example, believe that the fifth asset class will outperform
the fourth. BL assumes that views ...
apply linearly to the market mean ,
face uncertainty,

2.10. VIEWS AND PORTFOLIO CONSTRUCTION - THE BLACK-LITTERMAN MODEL201


are fully invested (the sum of weights is zero for relative views or one for absolute
views), and
do not need to exist for some assets.
More precisely, an investor with k views on N assets uses the following matrices:
The k n matrix P of the asset weights within each view.
The k 1 vector Q of the returns for each view.
The k k diagonal matrix of the covariance of the views, with nn the matrix
entries. The matrix is diagonal as the views are required to be independent and uncorrelated. The inverse matrix with the entries 1/nn are known as the confidence
in the investors views.
Example
Consider four assets and two views. The investor believes that asset 1 will outperform
asset 3 by 2 percent with confidence 11 and that asset 2 will return 3 percent with
confidence 22 . The investor has no other views. Mapping these views into the abovedefined matrices implies



 

11 0
2
1 0 1 0
, =
.
(2.136)
, Q=
P =
0 22
3
0 1 0 0

BL assumes that the conditional distribution P (View|Prior) mean and variance in


view space is normally distributed with mean Q and covariance .
Two main tasks remain: First, how is specified and second, the search for a posterior distribution of the returns that blends the above prior and conditional distribution.
There are several ways of specifying . One can assume that the variance of the views
will be proportional to the variance of the asset returns, one uses a confidence interval
or one uses the variance of residuals if a factor model is used. We refer to Walters (2014)
for details.
We consider the second task and use Bayes theorem, where a posterior distribution for the asset returns follows conditional on the views and the prior distribution.
Since the asset returns and views are normally distributed, the posterior is also normally
distributed.
The posterior normal distribution of asset returns in the BL model is then given by
the Black-Litterman master formula for the mean returns and the covariance C
= + CP 0 (P CP 0 + )1 (Q P )
1

C = (( C)

+P

P)

(2.137)

202

CHAPTER 2. FUNDAMENTALS

C is the posterior variance of the posterior mean estimate about the actual mean. It
measures uncertainty in the posterior mean estimate.
Several consistency checks can be applied to (2.137): First, if vanishes, which
means absolute certainty about the views, then the posterior mean becomes independent
or insensitive to the parameter . Next, if the investor has a view on every asset, the matrix P becomes invertible. Since the covariances are by definition invertible the posterior
mean equation simplifies to = P 1 Q. Finally, if the investor is fully uncertain about
the validity of his or her views - that is to say, the matrix entries of tend to infinity,
there is no value added by adding any views to the model since the prior and posterior
return distribution agree: = .

Example
Figure 2.41 shows two views - Canadian vs US equities and German vs European equity markets. The view in the American markets is much more diffuse than its European
counterpart because the variance of the estimate is larger or the precision of the estimate
is smaller. The figure also indicates that the precision of the prior and views impacts the
precision of the posterior distribution.
We refer to Walters (2014) for a discussion of alternative reference models, the impact
of , some extensions of the Black-Litterman model, and the sensitivity of views.
The technique developed by BL provides a framework in which more satisfactory
results are obtained from a larger set of inputs than are obtained using the mean-variance
framework. The model is usually applied to asset classes rather than single assets. In the
BL approach, the impact of the views on the asset returns is weighted by the confidence
of the investor in his or her views. Besides generating higher returns, the hope is that a
BL model leads to more stable portfolio allocations over time.

2.11

Active Risk-Based Investing

Risk-based investing often faces the criticism that it cannot allow for views. This is not
true. We extend the pure risk-based model to allow for an active investment view and
follow Jurczenko and Teiletche (2015) and Roncalli (2014).

2.11.1

Implicit Views

The mean-variance optimal portfolios, see (2.55) and (2.56), imply that for any given
portfolio and any covariance matrix C there exists a vector of implied returns given
by
= C .

2.11. ACTIVE RISK-BASED INVESTING

203

Figure 2.41: Probability distributions for the prior, view, and posterior in the BL model
in the application of He and Litterman (1999). The left panel shows the view that
Canadian equities will outperform US equities by 4 percent and the right panel the view
that German markets will outperform European markets by 5 percent (He and Litterman
[1999] and Walters [2014]).
The risk-aversion parameter is equal to the Sharpe ratio (SR) per unit of volatility risk
- that is to say, we get the implied vector of returns or views
=

SR
C .

This reads for the single asset k


k =

SRp
cov(k, p) .
p

with the covariance of the k-th asset returns with the portfolio returns and p the volatility of the portfolio. If we choose a risk-based portfolio following the generic rule (2.119),
the implicit risk-based view reads
!2
C(k )
RB,k = SR
k
(2.138)
C(
where k is the average pair-wise correlation of asset k and SR is the average Sharpe
ratio across assets. This shows that the pure risk-based investment approach incorporates
implicit views.

204

2.11.2

CHAPTER 2. FUNDAMENTALS

Active Views

To introduce active views, the authors make reference to the BL methodology. The
main change is that the reference portfolio is not an equilibrium market portfolio: the
risk-based portfolio is the reference. This (strategic) passive portfolio will be modified
to reflect a tactical or active view. The joint risk-based and active portfolio forms the
active risk-based portfolio.
In the same way as in the BL model, the individual excess returns are assumed to be
normally distributed around their implied risk-based estimates: RB and C. Also, the
investor is able to provide a complete set of absolute views on the individual expected
excess returns, which again are normally distributed, with mean V iew and (1 )C the
variance of the views.
The active risk-based expected returns ARB are then equal to the linear combination:
ARB = RB + (V iew RB ) .
(2.139)
Substituting (2.139) into the solution of the unconstrained mean-variance program and
identifying terms, the resulting active risk-based portfolio weights phiARB are equal
to a linear combination of the risk-based (passive) weights and the view weights, which
corresponds to the maximum Sharpe ratio (MSR) portfolio associated to the views (see
Jurczenko and Teiletche [2015] for further details).
We mention that the active portfolio deviations do not necessarily sum to zero, meaning that the active portfolio deviations are not cash-neutral. This could then lead to
underinvestment or leveraged positions. If this is not desired, a constant shift of the
vector of expected returns implies a cash-neutral, active, risk-based portfolio.

Example
There are three assets. Asset 1 has a volatility of 20% and an expected excess return
of 10%, asset 2 and asset 3 have both 10% volatility and 5% expected excess return.
Hence, the Sharpe ratio is the same for each asset. The correlation matrix is assumed to
be of the form +0.5 between asset 1 and 2 and 0 between all other combinations of assets.
To calculate the risk-based portfolio we consider the generic allocation formula (2.119)
with a capital-budgeting constraint adding up to one - that is,
1
1 = 0.4 .
k k

k=P
We then get the following figures:

2.11. ACTIVE RISK-BASED INVESTING


Asset
1
2
3
Portfolio

20
10
10
8

RB
20
40
40
100

0.25
0.25
0
0.17

C()
0.71
0.71
0.58
0.67

205
RB
11.25
5.63
3.75
6

SR
0.56
0.56
0.38
0.75

Risk Allocation
37.5
37.5
25.0
100

Table 2.24: Risk based allocation.

The concentration factor C() measures the lack of diversification potential where in
our case N = 3 and = 0.17.
The risk-based portfolio weights are based on the inverse of volatilities and the scale
parameter k. The risk-based implied returns are the excess returns that are consistent
with the mean-variance optimality of the risk-based portfolio. Implied Sharpe ratios are
defined as the ratio between implied returns and volatilities. Risk allocations correspond
to the percentage risk budgets.
Implied return is higher for the first asset as its high volatility/high correlation profile
necessitates large expected rewards for mean-variance investors that hold it. The implied
return is lower for the third asset, which displays both low volatility and low correlation.
The second asset constitutes a middle ground. Since assets with poorer diversifying
properties require higher returns in an optimal portfolio, the implied Sharpe ratios of
the first two assets are higher than that of the third. This shows that the optimality of
the risk-based portfolio does not necessarily mean that individual implied Sharpe ratios
must be equal. The risk allocations are above 1/3 for the first two assets due to their
above average pair-wise correlations.
The investor has the following active absolute views on the three assets:
5%, 5%, and10%. Using these views does not lead to a cash-neutral, active, risk-based
portfolio. As mentioned above, a parallel shift has to be calculated and added to the
view. The shift is 2.81% and hence the modified views 2.19%, 2.19%, and 7.19% follow.
This implies that the third asset is supposed to post a higher return than its implied
return. The reverse holds for assets 1 and 2. Table 2.25 shows the key figures for the
active views.

Asset
1
2
3

RB
11.25
5.63
3.75

Active views
2.19
2.19
7.19

Active RB
5.81
3.56
5.81

Active RB
8
30
62

Risk allocation
9.1
20.8
70.1

Table 2.25: The figures for the active risk-based portfolio. The confidence level is =
60%. The figures follow from (2.139).

206

CHAPTER 2. FUNDAMENTALS

Consistent with the views, the resulting active risk-based portfolio weights and risk
allocations are increased for the third asset at the expense of the two others.

2.12

Entropy Pooling Model

The notion of entropy pooling comes from Meucci (2010), upon whose work this section
is based. The word entropy is a fundamental concept in the natural sciences, communication technology, statistics, and - increasingly - in finance.
The goal of Meucci is to allow for arbitrary views in portfolio construction and not
only linear ones as in BL. Hence, entropy pooling generalizes all formerly discussed models.
In the first step, similar to in BL, the starting point is an arbitrary prior distribution
p0 for a set of risk drivers or risk factors. The second step is to incorporate more general
views than in the BL model. This means allowing for views not only about returns but
also about correlations, tail risk, etc. The prior could represent a regular regime in the
markets, and the views/stress test could be a regime in which some of the correlations,
or all of them, increase substantially. Therefore, views and stress tests are constraints
on the yet-to-be-defined posterior of the market. We write pv for the distribution that
satisfies the view constraints.
Since the views possibly contradict the prior, the prior does not satisfy necessarily the
views. Therefore, a search for a new, suitable distribution - the posterior - is required. To
compute the posterior, we rely on the relative entropy S(pv , p0 ) between the prior p0 and
the constrained distribution pv . The posterior distribution pp is then a view-constrained
distribution that minimizes the relative entropy.
Entropy pooling can be implemented in two ways: the non-parametric and the parametric approach. Typically, the posterior distribution cannot be calculated explicitly.
An important exception is the case where all distributions are normal. Then, the posterior distribution can be calculated explicitly using the above three-step procedure. If
the views are the same as in the BL model, the Black-Litterman master formula (2.137)
for the posterior returns follows.

2.12.1

Factor Entropy Pooling

We reconsider the BL model where we use factor entropy pooling to determine the implied returns - that is, the distribution consistent with an optimal target portfolio. We
recall that in the BL model the covariance matrix fits to the empirical observations and
the implied equilibrium returns given in (2.133) are calculated. Then, the views are
inserted and the master formula of BL is derived. This approach faces two problems.

2.13. CIO INVESTMENT PROCESS

207

First, there is no estimation error for the covariances. This is a serious problem if the
estimated covariances change suddenly due to specific events. The SNB decision in January 2015 to remove the EUR-CHF floor, for example, led - within a day - to a new
covariance matrix for these currencies, for equities compared to Swiss equities, and for
interest rates. Second, the equilibrium means (typically derived from the CAPM) can
differ substantially from the data.
Factor entropy pooling is addressed to consider these two problems. The starting
point is the linear factor model (2.93) , which implies the covariances in (2.95) - that is,
C = AIA0 +D. We define the entropy by the two normal distributions S([, C)], [p , Cp ]),
where the indexp denotes the prior of the normal distribution, and C the covariance
matrix of (2.95) . The entropy function can be calculated explicitly (see Meucci [2010])
and depends on three parameters: the returns, the factor loads A, and the residual
covariance matrix D
The posterior returns , the factors A, and the residual covariance matrix D follow
by minimizing the relative entropy function over the set of all admissible views.
Example - Factor Entropy Pooling
The following example from Meucci shows the benefits of using factor entropy pooling
instead of the BL model. We consider a market of N = 30 equities in the Dow Jones
Index, and weekly prices from January 2002 to June 2012. Meucci calculates the historical
mean and the historical covariance of the weekly returns. The market capitalization
weight is taken as of June 27, 2012 and the factor model assumes that there are three
factors.
Figure 2.42 shows the results for market capitalization, sample means, BL, and the
factor entropy approach. It follows, at first sight, that the BL and the factor entropy
models deliver qualitatively similar weights, which differ from the sample mean and
the market capitalization weights significantly. The entropy is calculated for the factor
entropy model and the BL model. This leads to an entropy value of 1.83 for the factor
entropy model and of 2.41 for the BL model. Therefore, the factor entropy pooling
parameters are more in line with the historical parameters than are the BL parameters.

2.13

CIO Investment Process

A Black-Litterman-oriented investment process would have at least the following steps


(Walters [2014]):
Determine which assets constitute the market.
Compute the historical covariance matrix for the assets.
Determine the market capitalization for each asset class.

208

CHAPTER 2. FUNDAMENTALS

Figure 2.42: Weights for the Dow Jones using, from left to right, market capitalization
weights, sample means, the BL model, and the factor entropy pooling approach (Meucci
[2010]).
Use reverse optimization to compute the CAPM equilibrium returns for the assets.
Specify views on the market.
Blend the CAPM equilibrium returns with the views using the Black-Litterman
model.
Feed the estimates (estimated returns, covariances) generated by the Black-Litterman
model into a portfolio optimizer.
Select the efficient portfolio that matches investors risk preferences.
But these steps too only define a part of the investment process of a CIO. In general, the CIO receives information from different sources as a first step in the investment
process: A macroeconomic view from research analysts, market information, chartist information and valuation information.
Assume that one output of this information is to overweight Swiss stocks - underweight European stocks.
This defines a pair-wise bet. All bets of this type form the tactical asset allocation
(TAA). Several questions follow:

2.13. CIO INVESTMENT PROCESS

209

A How strong is the bet - that is to say, how much should the two stock positions deviate
from the actual level overweight Swiss stocks - underweight European stocks ?.
B Should any possible currency risk in the bet be hedged?
C How long should this bet last?
D How confident is the CIO and his or her team about the bet?
E Is the bet implementable and what is the precision of such an implementation measured by the tracking-error?
F Will there be a stop-loss or profit-taking mechanism once the bet has been implemented?
G How does the CIO measure the performance of the bet?
The approach to question A is often based on the output of a formal model. That
is to say, a risk budgeting model, a BL model, or a mean-variance optimization model
proposes to increase Swiss stocks by 5 percent and to reduce the European stock exposure by 5 percent. It is common practice that such a proposal is overruled by the CIO,
either because it creates too much turnover for the portfolio managers or because he or
she considers, subjectively, such a change to be too strong.
Question B is - among other things - a consistency question since, on the one hand,
the +/ 5 percent increase in equities also changes the FX exposure of the whole TAA
and, on the other hand, there could be a CHF-EUR bet following from the many information sources. Typically - and pertaining to question C - bets are made for one month.
This is the standard time after which the CIO and his or her team review the TAA.
Question D is sometimes called the information risk issue. Information risk is different
from statistical risk. The most well-known statistical risk measurement in the industry
is the tracking error, which measures the volatility of alpha over a period of time. The
risk source is the market, counterparty, and liquidity risk of the assets. Bernstein (1999)
defines information risk as the quality of the information advantage of a decision-maker
under uncertainty.
Reconsider the above Swiss stock-European stock bet. This view must be driven by
our information set, as well as by the proprietary process of analyzing the information
and data. To evaluate information risks, we ask (Lee and Lam [2001]):
What is the completeness and timeliness of our information set?
Have we missed something?
Have we misinterpreted something?
How confident are we about our models and strategies?

210

CHAPTER 2. FUNDAMENTALS

These questions suggest that some information risks may be quantified with a good deal
of precision while in most cases precise measurement of information risks seems impossible, and well-informed judgment may be necessary. This may result in a final statement
on the decision-makers confidence of adding alpha. If, say, the confidence is 50 percent, we are not confident at all about the bet. The probability of adding a positive
alpha by implementing the Swiss stock-European stock bet is the equivalent of flipping a
fair coin. A standard approach to measuring the performance of bets is the hit rate (HR).
A hit rate of 60 percent means that we add alpha in 60 percent of the months in
which we make an active bet. The confidence in adding alpha can be interpreted as the
expected value of the hit rate. Information risk is then quantified by the expected hit
rates of our investment views, or strategies.

Example
We follow Lee and Lam (2001). They assume that alpha is symmetrically distributed
around its mean - that is to say, alpha is normally distributed around its mean value.
Then, there is a unique one-to-one mapping between the hit rate and the information
ratio. To derive this relation, we have for i of asset i which follows a normal distribution:
HR = P (i > 0| N (, TE)
with the arithmetic average alpha and T E the standard deviation. The last formula
reads after a change of variables:
Z
1 2
1
HR =
e 2 y dy
2 alpha
TE
with x =

i
TE .

Using the definition of the information ratio, we get:


Z
1

Hit Rate (HR) =


f (y)dy,
2 IR

(2.140)

with f the standard normal density function and IR the information ratio. Once the
expected alpha and expected tracking error, and therefore the expected information
ratio, are stated, the complete ex ante distribution of alpha is specified. The hit rate is
the area to the right of 0% alpha. Using the square-root law the following information
risks, confidence levels, and information ratios follow:
To incorporate the views in a systematic way one chooses the BL model or the more
flexible entropy pooling approach of Meucci. One is free to choose a market portfolio, a
benchmark index, or a (passive) risk budgeting portfolio as the reference portfolio, which
is then used together with the views to create the posterior distribution.

2.14. SIMPLICITY, OVER-SIMPLICITY, AND COMPLEXITY


Information risks
Low
Medium
High
Infinity

Confidence (monthly HR)


60%
56%
52%
50%

Monthly IR
0.25
0.15
0.05
0

211
Annualized IR
0.88
0.52
0.17
0

Table 2.26: Information risks, confidence levels, and information ratios (Lee and Lam
[2001]).

Confidence in the views is built into the entropy pooling approach as follows. Let P
be the prior and Q the posterior distribution. If we have full confidence in the posterior,
we end with the BL master formula or the entropy pooling model. If the confidence is less
than full, the posterior distribution Q must shrink toward the reference P . Introducing
the confidence level parameter c, which lies between 0 and 1, we write for the probability
distributions F :
F (c) = (1 c)FP + cFQ .
When the confidence is total, the full-confidence posterior is recovered. This kind of
opinion pooling can be generalized to the case in which many managers - the CIO and
his or her team - have different degrees of confidence about the posterior (see Meucci
[2010b] for details).
The CIO investment process face the same critical question as a mutual fund - how
strong are luck and skills in generating the performance?

2.14

Simplicity, Over-simplicity, and Complexity

Faber (2007) proposed a simple trend-following tactical asset allocation strategy and
showed that the strategy outperformed the market. The simplicity of the model and
the performance with no losing years from 1972 to 2007 made Fabers work well-known.
In 2015, the paper was the most downloaded paper (more than 160,000 downloads) on
the largest social science research network SSRN. But as of October 2015, only 15 other
researchers had cited the paper: Many researchers have downloaded the paper, but there
is little feedback in academic terms (number of citations). Why does the most downloaded paper on the subject of making money by investing receive only weak academic
feedback? A simple answer would be that Fabers model is the right one. Then, since
scientists cannot publish - in finance - a successful replication of anothers work, there is
simply nothing more to be said (citation bias).
Marmi et al. (2013) develop statistical tests to gain insight into whether the Faber
trading strategys success has predictive power or whether data snooping drives the performance results.

212

CHAPTER 2. FUNDAMENTALS

Example - Data Snooping


Data snooping means broadly that one finds seemingly statistically significant results
that, in fact, turn out to be spurious patterns in the data. This is a serious problem in
financial analysis.
The first data snooping example is from Andrew Lo (1994). Consider the following
mathematical proposition of Fermat regarding prime numbers. For any prime number
p, the division of 2p 1 by p always leads to a remainder of 1.
For example, dividing 213 1 by 13 implies 315 plus the remainder of 1. This holds
for all prime numbers.
But the converse is not true. If the division of 2q 1 by q leads to a remainder of
1, it does not imply that q is a prime number. But the converse is almost true: There
are very few numbers that satisfy the division property and that are not prime. In the
first 10, 000 numbers there are only seven such numbers.
Consider the following stock selection strategy based on these rare numbers: Select
those stocks with one of these seven numbers embedded in the CUSIP identifiers.
A CUSIP is a nine-character alphanumeric code that identifies a North American
financial security for the purposes of facilitating clearing and settlement. Given the
aforementioned seven numbers, there is only one CUSIP code that contains such a
number: CUSIP 03110510, where the bold number - 1, 105 - is one of the seven rare
non-prime numbers.
The stock Ametek had exhibited, by the time of Los writing, extraordinary
performance: a Sharpe ratio of 0.86, a Jensen alpha of 5.15, a monthly return of 0.017,
and so on.
The problem with this strategy for selecting a stock is that there is no reason why it
should work. But to understand why it does work is essential when it is not possible to
test hypotheses by running repeated controlled experiments as is possible in the natural
sciences. This example shows that there can be highly non-linear effects - here the prime
number property, which can lead to spurious return patterns.
A second example considers order statistics. Assume that there are N securities with
annual returns that - for simplicity - are assumed to be equal standard normal distributed
with a mean of 10 percent and standard deviation of 20 percent and that are mutually
independent.
The probability that the return of security k exceeds 50 percent is then 2.3 percent.
It is unlikely that security k will show this strong return.

2.14. SIMPLICITY, OVER-SIMPLICITY, AND COMPLEXITY

213

Consider the next question where we ask for the winner return - that is to say, the
probability that the maximum return will exceed 50 percent. This probability can again
be calculated and for N = 100 securities it is 90 percent.
Therefore, the probability that a given stock earns more than 50 percent is close to
zero but there will always be a winner if the number of stocks is large enough. Does
winning tell us anything about the nature of the winning stock? Since the stocks are
IID, no. Nothing can be inferred about the future return if one knows at a given date
which stock is the winner.
This example indicates that data snooping in investment is related to a focus on past
performance as a guide to future performance when one associates past success with
significant investment skills while it is merely luck that drives past performance.

2.14.1

The Faber Model

The simple moving average trading rule proposed by Faber (2007) between risky assets
and a risk-free asset reads: if the monthly closing price of the risky asset is higher than
its past 10-month average, buy the risky asset; otherwise buy the risk-less asset.
This timing model is applied to each asset of a diversified portfolio including US
stocks, the MS Capital International EAFE Index (MSCI EAFE), the GS Commodity Index (GSCI), the National Association of Real Estate Investment Trusts Index (NAREIT),
and US government 10-year treasury bonds. It leads to the following impressive results:
The portfolio has a better risk-adjusted performance than a reference portfolio of
equally weighted, yearly rebalanced assets.
Maximum drawdown is strongly reduced.
The performacne is positive for thirty-five consecutive years.
If these results are confirmed, the efficient market hypothesis - see Chapter 3 - has
been violated. This means that returns of liquid assets are predictable. Fabers 2007
paper was updated in 2013 with the inclusion of GFC period data until 2012. As Faber
states: Overall, we find that the models have performed well in real-time, achieving equity
like returns with bond like volatility and drawdowns.

2.14.2

Statistical Significance

While Faber (2013) published impressive updated figures he did not test for the statistical
significance of his model. This is the question Marmi et al. (2013) ask: Is Fabers strategy
violating the risk-return trade-off in a statistically significant way? The authors perform
different bootstrapping experiments from January 1950 to June 2009 (713 months). They
analyze the behavior of each asset class by investing in each asset or in the risk-free asset

214

CHAPTER 2. FUNDAMENTALS

according to Fabers trading rule.

Example - Bootstrapping Experiment


We have just one data set of the past. Computing a statistic using this data, we
only know that statistic and do not see how variable the result is. Bootstrap then
create a large number of data sets copies that we could have been also observed and we
compute the statistic for each data set. This gives a distribution of the statistic figures.
Bootstrapping therefore allows measurements of accuracy of sample estimates. It is a
resampling method, see Sullivan et al. (1999) for a reference.
As a first result, the authors provide the summary statistics given in Table 2.27.
Asset
S&P500
3M T-bills
Faber

Mean [%]
6.71
4.99
8.26

Standard deviation [%]


14.60
0.89
10.54

Maximum drawdown [%]


-50
-2
-22

Table 2.27: Summary of annualized statistics for the Faber model from January 1950 to
June 2009 using log returns (Marmi et al. [2013]).
These statistics confirms prima viste the superiority of Fabers strategy: Sbstantially
higher mean and a lower standard deviation than the S&P 500 portfolio and significantly
lower maximum drawdowns. Marmi et al. (2013) carry out the following bootstrapping
experiments.
Simple bootstrap. They replicate the S&P 500 returns by drawing (with replacement) 500 simulated time series from the observed returns. The T-bill rates are
unchanged.
Historical simulation. This allows the introduction of heteroscedasticity into the
data by using a GARCH(1,1) process for stocks.
Bivariate historical simulation. The GARCH(1,1) model is also estimated for the
T-bill rate increments.
The exercise is then repeated by considering not only the S&P500 but also the other asset
classes used in Fabers work. The authors then compare the risk-return profile cloud of
the simulation with the risk-return position of the Faber model given in the summary
statistics in Table 2.27. They find that besides the simple bootstrap for the S&P500 only
the summary statistic risk-return data point lies in the cloud of the simulated portfolios.
Figure 2.43 shows the result for the bivariate historical simulation.
The results indicate that the Faber models over-performance is not statistically significant. Given that the simulation methods are simple one should not make any final

2.14. SIMPLICITY, OVER-SIMPLICITY, AND COMPLEXITY

215

Figure 2.43: Efficient frontier, the original Faber portfolio, and bootstrapped Faber portfolios from the above describe bivariate historical simulation. The Faber model is well
above the mean-variance efficient frontier but also lies inside the cloud of simulated data
(Marmi et al. [2013]).
conclusion about Fabers work, but it is nevertheless appropriate to state that one should
be cautious with regards to attributing statistical significance to it.

216

CHAPTER 2. FUNDAMENTALS

Chapter 3

Investment Theory Synthesis


We discussed in Chapter 2 several empirical approaches to portfolio construction. This
chapter starts is based on the economic theory or asset pricing point of view.

3.1

Modern Asset Pricing and Portfolio Theory

There are different economic approaches to deriving the pricing of assets and optimal
investment portfolios.
Absolute asset pricing where the optimal investment portfolio, the optimal consumption paths and the asset prices follow as an equilibrium result from an optimization of the investors preferences and market clearing.
Relative asset pricing (no arbitrage theory). This prices derivatives.
This chapter draws heavily on Cochrane (2005), Back (2010), Campbell and Viceira
(2002), Cochrane (2011), Culp and Cochrane (2003), Merton (1971, 1973), Martellini
and Milhau (2015), Schaefer (2015) and Shiller (2013).

3.1.1

Absolute Pricing

Absolute pricing means that investors solve a full-fledged economic model: they choose
their optimal consumption and investment portfolios over time to maximize their expected utility function by considering investment and consumption constraints. The
chosen strategy of consumption and investment is in equilibrium if utility is optimal for
all investors for this strategy and if all goods and financial markets clear.
The first condition means that no investor has an incentive to deviate from his or
her decisions. Market clearing means that demand meets supply in all consumption and
financial markets. Therefore, if an investors optimal strategy is to short an asset, there
is another investor who is optimally willing to buy the same asset. We always assume
that investors are rational: all investors act in their own best interests as they perceive
217

218

CHAPTER 3.

INVESTMENT THEORY SYNTHESIS

it, they have full information about the alternatives and they are not limited in their
cognitive ability.
Why is the equilibrium concept important for asset management? An asset management strategy that is not supported by an economic equilibrium is likely to die out
quickly. The prices of assets are for example determined by the behavior of the agents
in the economy - their preferences, impatience, endowments, and beliefs about the future. Therefore, considering equilibrium seriously means differentiating between realistic
strategies or products and pure fantasies. Unfortunately, relating investment models such
as factor models to the equilibrium concept is far from a simple task. It is fair to state
that, today, a lot of effort is made by academics to bridge the gap between equilibrium
financial economics and investment strategies.
The lack of persistence of proposed investment strategies is often seen in practice
when back-tests regarding the strategies are shown. Experienced managers then often
become suspicious: Ive seen so many strategies with convincing back-tests in the past
but the strategies failed afterward in practice. Using back-tests without an equilibrium
concept makes it difficult to control for data mining effects.

3.1.2

Simple General Equilibrium Model

We consider a very simple economy of two investors. The investors consider consumption
in a two-period world. Both derive utility from consumption at the two dates with the
same logarithmic utility function: their happiness to consume is valued in the same way
for each of them.
The two investors also possess the same endowment in the two periods - that is to
say, they earn the same salary. But the two investors differ in their impatience: The
time discount rates b1 and b2 are different. The smaller the discount rate, the more an
investor prefers consuming today to postponing consumption to the next period. If the
time preference rates are 1, there is no motivation for an investor to prefer early consumption to postponed consumption. The only asset to invest in the financial market is
a risk-free bond S, which they can exchange. If the two investors would also have the
same time preference rate, then they would be identical in all possible dimensions and
there would exist no interesting equilibrium since markets cannot clear.
A strategy in this economy consists of the consumption levels at the two dates and
the investment in the bond at the first date. Carrying out the individual optimization
determines the optimal consumption and investment (S) for the two investors. These
policies depend on the as yet exogenous given bond price S. Inserting these strategies
in the market clearing condition determines the endogenous price of the financial assets:
the risk-free interest rate follows from the interaction of the investors. This completes
the endogenous pricing of the bond.

3.1. MODERN ASSET PRICING AND PORTFOLIO THEORY

219

The optimal policies can be calculated explicitly in this model. If k is the number
of bonds that investor k buys at time 0 and keeps until time 1, market clearing means
that 1 + 2 = 0: what 1 sells (buys) must 2 buy (sell). Inserting the optimal investment
strategy functions, which depend on the unknown risk-free interest rate Rf , the market
clearing condition determines this equilibrium risk-free interest rate
Rf =

2(1 b1 b2 )
.
b1 + b2 + 2b1 b2

This shows that the time value of money is driven by the impatience of investors regarding their consumption. The following remarks can be derived from the above formula.
If impatience is zero, the two discount factors b are both equal to 1. Then the risk-free
rate is zero. The time value of money remains constant over time since no investor values
the present higher than the future. If the discount rates b approach zero, the risk-free
interest rate becomes unbounded. Such discount rates mean that equalizing the utility
of eating an apple today requires consuming an unlimited amount of apples at a future
date. To finance such an explosion in future consumption of goods, the price of a zero
bond that pays 1 at a future date must be arbitrarily small. If the two agents differ in
their preferences or if they face different endowments, then both the endowments and
the consumption preferences enter into the above equilibrium rate formula too.
The main insights from this example are: Asset prices and asset price returns are the
result of an economic interaction between investors.

Example
For every investor who buys one proposed stock there must be another investor
who sells the stock. Portfolio advice cannot apply to everyone. Who are the other
investors ? Consider the Chief Investment Officers (CIOs) that propose bets against a
passive investment in holding the market - this is the search for alpha.
The other investors should then be compensated for an opposite strategy. These
strategies cannot pay out well if markets do well. Such opposite strategies need for
example to compensate the other investors for heavy losses if markets perform badly.
They therefore act like insurance contracts. But do these other investors exist, and if so - what is their strategy and where can one find the marketing of the strategy? If
these other investors simply dont exist, the investment advice of the CIOs will not be
sustainable.

Example - SNB Policy

220

CHAPTER 3.

INVESTMENT THEORY SYNTHESIS

In January 2015 the Swiss National Bank (SNB) removed the floor value between
the euro and the Swiss franc. This floor had been introduced in August 2011 since
EUR/CHF had moved from more than 1.6 to close to parity value in three years. This
had proved to be a significant burden for the Swiss export industry since two-thirds
of exports are denominated in euros. In 2011, the floor was set to 1.2 up from around
1.1. When the floor was removed, the exchange rate fell within minutes from 1.2 to 0.9
and stabilized over the following days at around 1.05. If we consider the non-regulated
exchange rate to represent the equilibrium rate, the first intervention forced the rate to
move out of equilibrium, and then removing the floor the rate was allowed to return to
its equilibrium value.
Whatever the utility function of the SNB is, the role of market clearing conditions
shows their importance in this episode. If an agent in the economy - the SNB - wants
to move a value out of equilibrium, that agent then has to change the demand or supply
side. By buying euros, the SNB accepted that its balance sheet would grow, as it did from CHF 100 bn to almost CHF 500 bn in maintaining the floor.

Example - Logarithmic Utility


We assumed in the above equilibrium model that the preferences of the investors in
the above economy are logarithmic. Such preferences are often used since they facilitate
many calculations, but they are also specific from an investment behavior point of view.
Log investors always act myopically (one-period view). Their demand for hedging longterm risks is zero. What is the intuition for this particular behavior of log investors? A
log investor maximizes, by definition, log returns. Assuming normality of the returns,
the log return over a long time horizon is equal to the sum of one-step returns. Hence,
the long-term return is maximized if the sum over the one-period returns is maximized.
This shows that a log investor is always a short-run investor.

3.1.3

Relative Pricing

What we can learn about one assets value given the prices of assets under very weak
assumptions about markets, information and preferences? We take the underlying prices
as given, use as little information, and investors preferences matter only in the sense
that they prefer more to less money. The first such model was the Modigliani - Miller
(1958) approach to firm valuation.
But the true revolution in relative pricing was initiated by the option-pricing work of
Black and Scholes, published in 1973. Today, there is no option pricing method in the
markets that is not based on the no arbitrage paradigm. Contrary to absolute pricing
models, the weak preference structure define few data requirements for model calibration

3.2. ABSOLUTE PRICING: OPTIMAL ASSET PRICING EQUATION

221

compared to the absolute pricing models. The rest of the pricing theory is mathematical
logic and intuition about the modelling of asset dynamics.
Absolute pricing is general, requires a lot of data to be applicable, and often fails to
be precise. Relative pricing is much simpler, but it is limited in terms of the cases to
which it can be applied. Therefore, both approaches are often not applied in their pure
forms. In the CAPM absolute pricing approach the market risk premium is, for example,
not explained within the model. In relative pricing one often has to add some absolute
pricing elements if the no arbitrage principle is too weak to forecast a unique asset price.

3.2

Absolute Pricing: Optimal Asset Pricing Equation

The optimization problem of the rational investors read:


The investors derives expected utility from two-period consumption at the present
date t and a future date t + 1.
The investors chooses the investment to maximize expected utility. There is only
a single risky asset.
The investor faces two budget constraints at time t and t + 1. Consumption at time
t equals its endowment minus the amount of assets with price St . At time t + 1 the
same logic applies but the asset price is replaced by the future asset value - that is
to say, the payoff Xt+1 .
Markets clear.
The payoff is, for example, the expected return in the case of stocks, any option payoff
or the value of a stock including dividends. We note that in the definition of a factor
model, factors are not given as pay-offs. However, we discuss below that it is always
possible to replace a given set of pricing factors by a set of pay-offs that carries the same
pricing information.
Solving, mathematically, the investment problem, the fundamental asset pricing equation for asset S at time t follows.
St = Et (Mt+1 Xt+1 )

(3.1)

where M is the stochastic discount factor (SDF) and expectation is made based on
information at vista time t. Hence, price is expected discounted payoff.
What makes (3.1) an asset pricing theory is the underlying general equilibrium model,
which ensures that a single SDF exists, which can be used to price all assets by discounting the payoffs. (3.1) describes a market in equilibrium, after the investor has reached
his or her optimum consumption level.

222

CHAPTER 3.

INVESTMENT THEORY SYNTHESIS

The stochastic discount factor equals


Mt+1 = b

u0 (ct+1 )
u0 (ct )

(3.2)

with b the time preference rate and u0 (c) marginal utility of consumption.
This relationship between asset prices and consumption defines the primary goal
of asset management: investments proposed by asset managers should protect investors
optimal consumption in the short and long run. This is a difficult task. First, investments
based on consumption data of investors often underperform. Second, the knowledge of
investors preferences is still limited and static. New technologies are already helping to
overcome these difficulties and will be key to fulfilling the stated primary goal of asset
management in the future.
3.2.0.1

Good and Bad Times

Since consumption at time t + 1 is stochastic from vista time t, the discount factor Mt+1
is stochastic. The SDF is high if time t + 1 turns out to be a bad time - that is to say,
consumption is low in a specific future state, see Figure 3.1. Then future asset prices
are discounted weakly. Hence, the pricing equation (3.1) attributes to assets that pay
off well in bad times a high price. Contrarily, future payoffs are discounted heavily if
consumption is high in a future state. Then the SDF is small. In other words, the SDF
relates future payoffs to changes in consumption level by valuing the assets appropriately.
The ratio of marginal utilities entering the SDF reflects that investors value money
more when they need it - in bad times - than in good times. Investors therefore often
consider marginal utility as an index of bad times and the SDF, which describes the
substitution between present and future consumption, is then seen as an index of growth
in bad times.
One may derive the existence of an SDF without referring to individual optimization
of the consumption path. In general, the law of one price and the more restrictive notion
of no arbitrage is equivalent to the existence of a positive SDF. These existence theorems
in reduced relative pricing models - see Section 3.10 - are very general. But they do not
provide an explicit construction of the SDF, unlike a consumption-based model.
3.2.0.2

Examples

Example
Consider a simple discrete market model with n = 1, 2, ..., N future states. The
market is complete, that is, for each states n there exists a financial product (contingent
claim) which pays $1 in this state n and zero else. In such a market all risks can be
shared over time and in the cross section of states between all investors. We write Sc (n)

3.2. ABSOLUTE PRICING: OPTIMAL ASSET PRICING EQUATION

223

Figure 3.1: Utility and marginal utility of consumption.Since marginal utility u0 is a


decreasing function of consumption, in bad times where consumption is lower at the
future t + 1 than today at t, the ratio of marginal utilities in (3.2) is larger than one.
In good times, the opposite is true. Therefore, the SDF in bad times is larger than the
SDF in good times.

for the price today of this claim. The price S(X) of any payoff X is the simply equal to
the sum of payoff Xs value in all states times the price of the contingent claim in the
states:
N
X
S(X) =
Sc (n)X(n) .
n=1

Multiplying and dividing by the probability p(n) for each state:


S(X) =

N
X
p(n)
n=1

p(n)

Sc (n)X(n) =

N
X

p(n)M (n)X(n) = E[M X]

n=1

where the SDF M is given as the ratio of state price to probability for state. Hence, using
the complete, discrete market model we arrive at the same fundamental asset pricing
equation (3.1) but with a different definition of the SDF. But in fact, a mathematical
proof shows that the above definition of the SDF and the former one based on the
marginal utility of consumption ratio are the same. In such a complete market where all
investors agree about the probabilities, the ratio of SDF realizations across states is the

224

CHAPTER 3.

INVESTMENT THEORY SYNTHESIS

ratio of marginal utilities across states. In such a setup, there exists only one SDF which
is sufficient to price all assets.
If we consider a risk-less asset S0 that is, the payoff X(n) = 1 in all states, then
S0 = E(M ) .
Therefore, the risk-less rate Rf satisfies
1 + Rf =

1
1
=
.
S0
E(M )

If we define the so-called risk neutral probabilities


q(n) := (1 + Rf )Sc (n) =

M (n)
p(n).
E(M )

The fundamental asset pricing equation reads


S(X) =

N
X
1
1
q(k)X(k) =
E Q (X)
1 + Rf
1 + Rf
k=1

where the last expected value is with respect to the risk neutral probability. This
formula states that the price of any asset is equal to the expected discounted value of
the payoff using risk neutral probabilities. This representation is the essence of relative
pricing used mainly for derivatives. There, one constructs the risk neutral probabilities
such that the discounted prices are fair games which are then equivalent to the absence
of arbitrage, see Section 3.10 for details.
In an incomplete market where investors can still trade in all existing assets but
there are some risks which cannot be spanned by the assets, in the fundamental pricing
equation (3.1) the SDF is replaced by the orthogonal projections of the SDF on the
space of payoffs.
Summarizing, there exists a market structure setup which leads to the same fundamental asset pricing formula (3.1). The differences occur between complete and incomplete markets which are reflected in the consumption-investment model between equal
or different ratios of marginal utilities for different investors.

Example
We have shown that for the risk-less rate one pays USD 1 and one gets 1+Rf USD, i.e.
1
1
1 + Rf = E(M
) . Assuming a constant relative risk aversion utility function u(c) = c

3.2. ABSOLUTE PRICING: OPTIMAL ASSET PRICING EQUATION

225

with 0 < < 1, the SDF is given by





c
ct+1
ln t+1
ct
M =b
b(1 ct+1 )
= be
ct
using a Taylor approximation up to the first order where ct+1 = ln
expanding up to first order:
1 + Rf =

ct+1
ct

. Again

1
1
(1 + Et (ct+1 )) .
E(M )
b

This shows that interest rates are higher if people are impatient (low b) or if expected
consumption growth is high. Since high consumption growth means people get richer
in the future one has to offer high risk free rate such that they consume less now and save.
How much does Rf vary over time is the same to ask how much must one offer to
individuals to postpone consumption? This variation is given by the risk aversion factor
. More precisely, expanding the risk-free rate relation up to second order we get:
1
1
1 + Rf (1 + Et (ct+1 ) 2 t2 (ct+1 ) .
b
2
Therefore, higher consumption growth volatility lower interest rates which motivates
investors to save more in uncertain times.

Example
Consider zero-coupon bonds where St,t+1 is the price of a zero-coupon bond with one
year maturity. Since the bond pays $ 1 at maturity, X = 1 follows and the fundamental
pricing equation becomes
St,t+1 = Et (Mt+1 ) .
This shows that bond pricing is essentially constructing of discount factors and calculating the expected values. The more complex the model of the bond - liquidity risk,
counter party risk, etc. - the more complicated will the SDF be.

Example
We use the fundamental asset pricing equation to derive the cost of carry formula
for forward or futures contracts. The payoff X is the difference between the spot S and
futures f price. We have at maturity T :
XT = ST ft,T

226

CHAPTER 3.

INVESTMENT THEORY SYNTHESIS

where ft,T is the future price negotiated at time t for the delivery of one unit of some
asset (stock, bond, corn, gold, etc.) at time T . Therefore, (3.1) reads
St = Et (MT XT ) = Et (MT (ST ft,T )] .
The true price of the forward at-market when it is negotiated at time t is zero. Using
this and the fact that the future price is non-stochastic, we get:
Et (MT ST ) = Et (MT )ft,T .
At this stage we specify the type of asset S. We assume that the asset pays in each
period a quantity q per unit of asset and that it costs c per one unit of asset to store
the asset in each period. The fundamental pricing equation extension to many periods
N states:
T t
X
St =
Et (Mt+k (qt+k ct+k )) + Et (MT ST ) .
k=1

If we use the risk free rate rt,t+k in the period between t and t + k, then the SDF satisfies
Et (Mt+k ) =

1
1 + rt,t+k

and we arrive after some algebra at the result of the carry model:
ft,T = St (1 + rt,T qt,T + ct,T ),
where qt,T , ct,T are weighted rates.

3.2.0.3

Changes in Asset Prices

The price changes of S in the fundamental pricing equation (3.1) can be due to three
causes: Either the probability p changes or the discount factor M changes or the payoff
X changes.
While in the past the view was that expected returns were constant over time, it has
become clear now that expected return variation over time and across assets is much
larger than anyone anticipated. We also know that asset valuation moves far more on
news affecting the discount factor than on news of expected cash flows, that is, the payoff
X.
The changes in expectations (probabilities) are the main source of behavioral asset
pricing. If investors for example set their subjective probabilities equal to the objective
ones in the fundamental asset pricing formula, then investors expectations are wrong.

3.2. ABSOLUTE PRICING: OPTIMAL ASSET PRICING EQUATION


3.2.0.4

227

SDF without Consumption

If we neglect consumption, the conditional mean of the SDF becomes the inverse of the
gross, risk-less interest rate. Replacing the future payoff by the future asset price, the
fundamental equation then reads
St = Et (Dt+1 St+1 )
with D the discount factor. This states that the best guess of future discounted asset
price S, given the present information, is equal to the present discounted asset price
(note that we could insert Dt = 1 on the left-hand side). Therefore, the equation states
that the best guess of the future value stock vector Dt+1 St+1 is its present value.
3.2.0.5

Equivalent Formulation of the Fundamental Asset Pricing Equation

Equation (3.1) can be equivalently rewritten to derive the factor pricing models, such
as the CAPM, in the traditional form. In investment applications, one prefers to think
about rates of return instead of prices. Dividing (3.1) by the price, we get the equivalent
return formula
1 = Et (Mt+1 Rt+1 )
with R the gross return on the asset. Similarly, Re , the excess return over the risk-free
rate, we get
e
0 = Et (Mt+1 Rt+1
)
(3.3)
This equation also holds if we consider the gross returns of two risk assets. This equation
states that the excess return and the SDF are orhtogonal to each other. Therefore, the
expected return is the orthogonal projection (beta) of the return on the SDF or the beta
pricing model:
e
e
Et (Rt+1
) = PMt+1 (Rt+1
).
(3.4)
We next use the fact that the expected value of the two random variables M and R
is equal to the individual expectations plus the covariance term correction. Defining the
regression coefficient i = cov(M, Ri )/var(M ) and the variable = var(M )/E(M ), we
get the next equivalent equation to (3.1)
Et (Rie ) = i

(3.5)

The derivation does not need the assumptions of the CAPM; it holds generally. Note
that the beta is calculated in general relative to the SDF. In concrete models such as the
CAPM, the market return replaces the SDF in the beta calculation. The risky assets
risk premium is proportional to the covariance between its returns and the SDF (its systematic risk). All factor models such as the CAPM are particular cases of (3.5) where one
substitutes a series of factors for the general SDF. If the asset payoff is uncorrelated with
consumption (i = 0 in (3.5)), then the asset does not pay a risk premium, irrespective
of how volatile its returns are.

228

CHAPTER 3.

INVESTMENT THEORY SYNTHESIS

Rewriting equation (3.1), we can define systematic and idiosyncratic risk


St = Et (Mt+1 )Et (Xt+1 ) + covt (Mt+1 , Xt+1 ) .

(3.6)

We can decompose any payoff in a systematic and idiosyncratic component by running


a regression of the payoff on the SDF,
Xt+1 = Mt+1 +
| {z }
Systematic

t+1
|{z}

(3.7)

Idiosyncratic

Hence, asset prices are equal to a expected discounted cash flow plus a risk premium. Idiosyncratic risk is by definition the part that is not correlated with the SDF
and hence does not generate any premium - is not only what is commonly understood
as firm-specific risk. All equations hold true under the assumptions that an investor has
already chosen his or her portfolio and that the statements apply for an additional small
investment. For big asset purchases, however, portfolio variance can matter a lot. The
variance of the payoff will affect - in equilibrium - via the marginal utility, the SDF and
finally the risk premium.

Example
We reconsider the case with utility function u(c) = c1 with 0 < < 1. Inserting
the explicit utility function up to first order we get in (3.4)
e
e
Et (Rt+1
) = cov(Rt+1
, ct+1 ) = t2 (ct+1 )
|
{z
}
=

e , c
cov(Rt+1
t+1 )
.
2
t (ct+1 )
|
{z
}

(3.8)

If assets covary positively with consumption growth or equivalently negatively with the
SDF then they must pay a higher average return. High expected returns are equivalent to
low asset prices. From a risk perspective, the above equations state that average returns
are high if beta on the SDF or on consumption growth c is large. This is the above
bad times - low consumption growth - high SDF - high returns or high asset prices story.
Using the fundamental equation (3.1) with a risk free rate and using the approximation for the SDF we get:
St = Et (Mt+1 Xt+1 )

Et (Xt+1 )
cov(Xt+1 , ct+1 ) .
Rf

(3.9)

Again, price is higher if the asset payoff is a good hedge against consumption growth
(negative correlation).

3.2. ABSOLUTE PRICING: OPTIMAL ASSET PRICING EQUATION

229

Example
We reconsider the above zero-coupon bond pricing problem with the one-period pricing St,t+1 = Et (Mt+1 ). To value to the bond one needs a model for the discount factor
M. The simplest model is the discrete Vasicek model:
ln Mt+1 = zt + t+1 , zt+1 = (1 a)d + azt + 0t+1 .
The second equation states that the interest rate is a mean-reverting process with d the
long-term mean. Using this model in the ricing equation and calculating the expectations,
the yield y for the two-period zero coupon bond follows:
1
yt,t+2 = c1 + c1 yt,t+1 + cov(t+1 , 0t+1 ).
2
The last term represents the risk premium between the discount factor and the interest
rate shocks. This is the premium which the zero-coupon bond must pay since it payoff
moves up or down with the interest rate.

Example
The price of a forward satisfies:
0 = Et (MT XT ) = Et (MT (ST ft,T )) .
But this the usual orthogonality equation such that the forward rate is given by the
orthogonal projection:
ft,T = Et (ST ) + covt (MT , ST )Rf .
The forward price is therefore equal to the expected future spot price at time T plus a
risk premium. The risk free rate just discounts back the risk premium to time t.
What can be said about the sign of the covariance? Since the SDF is an indicator of
bad times but assets pay off well in good times, the covariance between them is typically
negative - in this case,
St < Et (Mt+1 )Et (Xt+1 ) .
(3.10)
This generates a risk premium and allows risky assets to pay more than the interest rate.
Setting X equal to the stock price S and writing St = St /Mt , (3.10) becomes
St < Et (St+1 ) .

(3.11)

Investors expect positive gross asset returns. Therefore, the asset price dynamics is not a
fair coin toss, where the best guess of tomorrows discounted asset price is todays price

230

CHAPTER 3.

INVESTMENT THEORY SYNTHESIS

- that is to say, St = Et (St+1 ). If asset price dynamics would be a fair coin toss then
returns would not be predictable and the price process would be a random walk. Contrarily, to generate risk premia, asset prices have to be predictable in the statistical sense.
Which assets are predictable? We consider this question below in the Efficent Market
Hypothesis (EMH) section.
Insurance investments show the opposite behavior to financial assets in equation (3.6):
A financial investments return is positive in good times and negative in bad times. Contrary, an insurance investments return is negative in good times but pays off well in bad
times. The covariance in equation (3.6) is positive. Therefore the value of the insurance
- that is to say, the left-hand side of (3.10), is larger than right-hand side.

3.2.1

Equivalence: Discount Factors, Risk Factors, and Mean-Variance


Model

We relate the general theory to the CAPM and the Markowitz mean-variance model.
3.2.1.1

CAPM

In the CAPM the SDF is linearly related to the market return RM :


Mt+1 = a + bRM,t+1 ,

(3.12)

with a, b some constants. Using this SDF, the usual CAPM formulation follows
E(Rj ) = Rf + j,M (E(RM ) Rf )

(3.13)

if the parameters a and b in the SDF are appropriately chosen. Hence, to derive the
CAPM, an affine function of the market return is sufficient to describe the SDF.
How good is this single indicator? If market returns go down, the SDF also falls
since the ratio of marginal utilities in (3.2) declines. This is equivalent to future consumption falling relative to present consumption. The other direction of this logic also
holds. Therefore, the specification (8) of the SDF leads to the right economic relationship
between consumption and market return. But there are other investors behaviors that
the single factor fails to capture. An investor would for example not spend money on
holidays if markets go down since in the CAPM no one can think that market fluctuations are temporary. Finally,all investors in a CAPM world hold the market portfolio.
Therefore, they all discount future cash flows by the same amount.
3.2.1.2

Markowitz Model

We set for the Markowitz model


Mt+1 = a + bRmv,t+1 ,

3.2. ABSOLUTE PRICING: OPTIMAL ASSET PRICING EQUATION

231

where Rmv,t+1 is any mean-variance efficient return. As for the CAPM, given any Rmv,t+1
and a risk-free rate, we find a SDF that prices all assets and vice versa.
3.2.1.3

Relationship between Factor Models and Beta Representations

It is worth to express the relationship between factor models and beta representations in
general since the expression of a risk premium given in (3.5) is of limited practical use
because it involves the unobservable SDF. The idea is to start with investable factors
and then derive the beta representation which will be equivalent to the SDF approach.
Definition 3.2.1. A K-factor model is quantified by M = a + b0 F where F is the Kdimensional vector of factors, a is a number and b is a vector of numbers. A factor Fk
that has a non-zero loading bk is said a pricing factor.
The equivalence between factor models and beta pricing models is given in the next
proposition.
Proposition 3.2.2. A scalar a and a vector b exist such that M = a + b0 F prices all
assets if and only if a scalar and a vector exist such the expected return of each asset
j is given by
E(Rj ) = + 0 j
(3.14)
where
=

1
1
cov(M, F ), =
.
E(M )
E(M ) 1

The K 1 vector j is the vector of multivariate regression coefficients of the return of


asset j on the risk factor vector F .
The vector is called the factor risk premia. The constant is the same for all
assets and it is equal to the risk-free rate if such a rate exists. We mentioned above that
factor models often are not given as pay-offs nor as returns, but the fundamental pricing
equation is expressed using pay-offs. It possible to replace a given set of pricing factors by
a set of pay-offs that carries the same information. The following proposition summarizes:
Proposition 3.2.3. Starting with a SDF in the factor model format M = a + b0 F , we
can always construct a new SDF M = a + b0 F where a and F are the constant a
mimicking and the factor F mimicking payoffs. These mimicking expressions depend on
the original factors and the payoff X as follow:
a = E(X)0 E(XX 0 )1 X , fk = E(Fk X)0 E(XX 0 )1 X, k = 1, ..., K.
Mimickingmeans that the new SDF is as close as possible chosen to match the payoff. Summarizing, there is no loss of generality from searching for pricing factors among
pay-offs.

232

CHAPTER 3.

INVESTMENT THEORY SYNTHESIS

Cochrane (2013) distinguishes between pricing factors and priced factors. Consider M = a + b0 F and the factor risk premia of Proposition 3.2.2. The coefficient b in
the SDF is the multivariate regression factor of the SDF on the factors. Each component
of the factor risk premia is proportional to the univariate beta of the SDF with respect
the corresponding factor. If b is non-zero for a given factor means that the factor adds
value in pricing the assets given all other factors - a pricing factor. If the component
of the factor risk premia is non-zero, then the factor is rewarded - a priced factor. The
two concepts are not equivalent except in the case where all factors are independent.
If the factors are themselves portfolio (excess) returns, then the factor risk premia
itself can be expressed as expected returns. If there is additionally a risk-free asset, then
the factor risk premium becomes the same as the risk premium in (3.5). Else, the factor
premium is the difference between the expected factor return and the zero-beta return.
If the factor portfolio is dollar-neutral, that is the price of the portfolio is initially zero,
then the factor risk premia is equal to the expected value of the factor. Dollar-neutral
factor portfolios are common if factors are constructed as long-short portfolios of asset.
Summarizing, theory shows that the three representations - discount factors, meanvariance frontiers, and beta representation - are all equivalent (see Cochrane [2007]).
They all carry the same information. Given one representation, the others can be found.
Economist prefer to use discount factors, finance academics prefer the mean-variance
language, and practitioners the beta or factor model expressions.
But there is bad news. Factors are related to consumption data entering the SDF.
While multi-factor models try to identify variables that are good indicators of bad vs
good times - such as market return, price/earnings ratios, the level of interest rates, or
the value of housing - the performance of these models often varies over time. The overall
difficulty is that the construction of the SDF by empirical risk factors is more an art than
a science. There is no constructive method that explains which risk factors approximate
the SDF in all possible future events reasonably well. From a practitioners perspective
this discussion might seem irrelevant since a factor model that performs well will do the
job. But thinking in this way would put the reputation of the asset management firm at
risk when the performance of the model was weak in a future period and no explanation
for such a weakness existed.
The following issues are discussed in the exercises: When is a risk factor a tautology,
when is this not the case? Given a factor f . How can one measure its risk premium? In
particular, how can we estimate whether the premium is different from zero and therefore,
f is a priced factor?
3.2.1.4

Choice of Risk Factors

This section discusses some theoretical recommendations for the choice of risk factors.
First, factors should explain common time variation in returns.

3.2. ABSOLUTE PRICING: OPTIMAL ASSET PRICING EQUATION

233

Assuming that there exist a risk-free rate rf and M = a + b0 F , then the definition of the
SDF implies for any asset k return rk :
b0 cov(rk , F ) = 1

E(rk )
.
1 + rf

For all assets earning a different expected return than the risk-free rate, the vector of
covariances between the risk factor and the assets return must be non-zero. Hence, regressing the returns on the candidate pricing factors, all assets should have a statistically
significant loading on at least one factor. This choice recommendation is model independent.
The next recommendation is based on the APT model. APT not only requires that
factors explain common variation in returns but the theory suggests that these factors
should also explain the time variation in individual returns. This ensures that the payoff and hence the price of an asset can be approximated as the pay-off of a portfolio of
factors. Therefore, the idiosyncratic terms should be as small as possible. Performing
a PCA, the largest eigenvalues follows and hence the main factors. But why is it a
meaningful approach to consider the largest eigenvalue only? The Eckart and Young
Theorem (1936) states:
Proposition 3.2.4. The best approximation of a N N positive definite symmetric
matrix by a matrix of lower rank K is obtained by keeping the largest K eigenvalues and
setting the N K other ones to zero.
Empirical work in the last two decades reveals that regardless of the exact method
used, a single factor is not sufficient to describe the movements of all individual stocks
and that the number of factors to describe the cross section of expected returns of the
assets is in the single digit or in a low two digit region.
These statements are represent to my knowledge what is at the moment widely accepted.

3.2.2

Multi-Period Asset Pricing and Multi-Risk-Factors Models

Two natural extensions of the above setup are models with many periods and models
with many risk factors.
3.2.2.1

Multi-Period Asset Pricing

We consider the extension to many periods first. (3.1) is then replaced by the expectation
over all future cash flows. If we consider equity with D = X the dividends, we get the
well-known dividend discount model of corporate finance replacing (3.1)
St =

X
j=1

Et (

1
Dt+j ),
(1 + R)j

(3.15)

234

CHAPTER 3.

INVESTMENT THEORY SYNTHESIS

with R the internal rate of return on expected dividends. Why is (3.15) true? One
rewrites the one-period equation (3.1) for many periods. Using the probabilistic law that
expectations of future expectations are simply todays expectations (the law of iterated
expectations), (3.15) follows.
Equation (3.15) is the fundamental value equation. For two stocks with the same
expected dividends but different prices, the stock with the lower price has to have a
higher expected return.
3.2.2.2

Multi-Factor Models

We consider the extension to several factors in many periods. The first model is Mertons
(1973) multi-factor inter-temporal CAPM (ICAPM). This model assumes:
Investors choose an optimal consumption path and an optimal investment portfolio
to maximize their lifetime expected utility.
Investors care about two types of risk factors - the market return RM and innovations Y .
Innovation factors describe changes in the investment opportunity or environment.
Such factors include changing volatilities, changing interest rates, or labor income.
An investment opportunity set is by definition equal to the set of all attainable portfolios. In the Markowitz model, the investment opportunity set consists of all efficient and
inefficient portfolios.
If the investment opportunity set changes over time, then variables Y other than
the market returns will drive the returns. The Fama and French factors are then variables which describe how investment opportunities defined by the CAPM market factor
change over time. Hence, innovation risk factors are key if one wants to improve the
cross-sectional return predictions of the CAPM.
How important are innovation risk factors in practice? Working without these factors trivializes human behavior and needs. All investors are for example jobless since
no labor income exists. Investors can handle the different risk sources that matter to
them by investing only in market risk. This is clearly an ineffective hedge. This, on the
one hand, leaves the investors with many un-hedged risks and on the other hand, the investors cannot participate in innovation factor investment to improve investment returns.
In other words, optimal investment decisions (see the next section) depend on the
details of the environment and an investors preferences. Intuitively, the possible change
of the investment opportunity set for investors is more important for longer-term investment horizons than for shorter ones since the deviations from a static opportunity set

3.2. ABSOLUTE PRICING: OPTIMAL ASSET PRICING EQUATION

235

can become larger if one considers longer time horizons. The solution of the ICAPM
model generalizes (3.5) to
St (Re ) = bM M + bI I = cov(Re , RM ) cov(Re , RI )

(3.16)

where is the average relative risk aversion of all investors and is the average aversion
to innovation risk. The mean excess returns are driven by covariance with the market
portfolio and covariance with each innovation risk factor. Only the first term in (3.16) is
mean-variance efficient, the total portfolio is no longer mean-variance efficient due to the
second term. Economically, the average investor is willing to give up some mean-variance
efficiency for a portfolio that better hedges innovation risk. The mutual fund theorem of
the Markowitz model, where only two efficient funds are needed to generate any other
efficient portfolio, generalizes to a K + 2 fund theorem if there are K innovation risk
sources. Investors will split their wealth between the tangency portfolio and K portfolios
for innovation risk. This result is the source of much portfolio advice from multi-factor
models, including the FF three-factor model.
Example
The empirical FF three-factor model (there is no theory for this model) is an example
with K = 2 innovation risk factors - SMB and HMB. Consider the FF equation (2.74)
in terms of returns. Comparing this with equation (3.16), the first term corresponds
to the market beta times the market excess returns. The other two terms in represent
the corrections of the market return and are summarized in (3.16) by the aversion-toinnovation-risk expression.

3.2.3

Low Volatility Strategies

Low-beta stocks outperform in many empirical studies high beta stocks and volatility negatively predicts equity returns (negative leverage effect), see Haugen and Heins (1975),
Ang et al. (2006), Baker et al. (2011), Frazzini and Pedersen (2014), Schneider et al.
(2016). These are the so-called beta- and volatility-based low risk anomalies.
Is there an explanation for these anomalies? Schneider et al. (2016) argue that taking
equity return skewness into consideration rationalize these anomalies. The model setup
generalizes the CAPM as follows. The SDF M in the CAPM is affine in the single risk
factor the market risk return, see equation (3.12). The model of Schneider et al. (2016)
uses the CAPM as an approximation and also allows for higher moments of the return
distribution. This leads to skew-adjusted betas which rationalize the anomalies.
The authors explicitly use the credit worthiness of the firms as the source for the
skewness in returns. Therefore, skewness is endogenous by incorporating credit risk. The
higher a firms credit risk, the more the CAPM overestimates the firms market risk, because it ignores the impact of skewness on asset prices (Schneider et al. (2016)). If one

236

CHAPTER 3.

INVESTMENT THEORY SYNTHESIS

benchmarks such returns against the CAPM they appear to be too low since the CAPM
fails to capture the skewness effect.
To motivate the model, we start with the general formula (3.5) next use the fact
that the expected value of the two random variables M and R is equal to the individual
expectations plus the covariance term correction. Defining the regression coefficient i =
cov(M, Ri )/var(M ) and the variable = var(M )/E(M ), we get the next equivalent
equation to (3.1)
cov(M, Ri ) (M )
Et (Rie ) =
.
(3.17)
(M ) E(M )
Schneider (2015), Kraus and Litzenberger (1976) and Harvey and Siddique (2000) define
the risk premium as the difference between the expected value of a derivative X under
the historical probability P and the expected value under a risk-neutral probability Q:

\text{Risk Premium} = E^P_t(X_T) - E^Q_t(X_T) .   (3.18)

The two probabilities P, Q which define the derivative risk premium can be related to
each other via the Radon-Nikodym derivative L (mathematics), the state price density (economics),
or the likelihood ratio (econometrics), formally written as

L = \frac{dQ}{dP} .   (3.19)

The density has the expected value E^P[L] = 1.


Example State Price Density
Consider two states with probabilities P = (1/2, 1/2) and Q = (1/3, 2/3). Then L in
state 1 is L_1 = (1/3)/(1/2) = 2/3, and similarly L_2 = (2/3)/(1/2) = 4/3 in the second state. Therefore,

E^Q(X) = q_1 X_1 + q_2 X_2 = E^P[LX] = p_1 L_1 X_1 + p_2 L_2 X_2 = \frac{1}{3} X_1 + \frac{2}{3} X_2 .

This technique can be used for all diffusion price processes and jump processes used to
model the dynamics of asset prices and hence derivatives.
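A minimal numerical check of this change of measure in Python (the payoff values X are arbitrary):

# Verify E^Q[X] = E^P[L X] for the two-state example above.
P = [0.5, 0.5]                      # historical probabilities
Q = [1/3, 2/3]                      # risk-neutral probabilities
L = [q / p for q, p in zip(Q, P)]   # Radon-Nikodym derivative L = dQ/dP
X = [120.0, 80.0]                   # arbitrary payoffs in the two states

E_P_L = sum(p * l for p, l in zip(P, L))             # = 1, density property
E_Q_X = sum(q * x for q, x in zip(Q, X))             # direct Q-expectation
E_P_LX = sum(p * l * x for p, l, x in zip(P, L, X))  # P-expectation of L*X

assert abs(E_P_L - 1.0) < 1e-12
assert abs(E_Q_X - E_P_LX) < 1e-12
print(E_Q_X, E_P_LX)                # both 93.33...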
Using M = L in (3.17) and the risk premium for the market return we get:

E_t(R^e_i) = \frac{\mathrm{cov}(L, R_i)}{\mathrm{cov}(L, R_M)}\, E_t(R^e_M) .   (3.20)
The expected return on asset i is proportional to the expected excess return on the market, scaled by the asset's covariation ratio with the pricing kernel - the true beta. The state
price density L is not observable. The goal is to approximate L(R) := E^P(L|R) as a
power series in R. How is this achieved? First, any L can be written as an infinite sum


where each term in the expansion is of the form coefficient times basis vector and the basis
vectors form an orthonormal basis. The coefficients depend on P, Q, i.e. the price
dynamics of the assets, and the risk aversion of the investor. Geometrically, the representation of L is equivalent to orthogonal projections of L on the powers of R. Using
a linear or a quadratic representation of L in (3.20) turns the true beta into a CAPM
beta (linear case) or a skew-adjusted beta (quadratic case). In other words, 'a firm's
market risk also explicitly depends on how its stock reacts to extreme market situations
... and whether its reaction is disproportionally strong or weak compared to the market
itself. A firm that performs comparably well ... in such extreme market situations has
a skew-adjusted beta that is lower relative to its CAPM beta. ... as emphasized by Kraus
and Litzenberger (1976) and Harvey and Siddique (2000), investors require comparably
lower expected equity returns for firms that are less coskewed with the market.' (Schneider
et al. (2016))
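The projection idea can be illustrated with a least-squares sketch. The pricing kernel below is simulated - it is not observable in practice - and its exponential form and all parameters are assumptions made purely for illustration:

import numpy as np

rng = np.random.default_rng(0)
R = rng.normal(0.05, 0.2, 100_000)       # simulated market returns
L = np.exp(-3.0 * R); L /= L.mean()      # stylized pricing kernel, E[L] = 1

# Linear projection of L on R: recovers a CAPM-type representation.
b1 = np.polyfit(R, L, 1)
# Quadratic projection: the R^2 term is what drives skew-adjusted betas.
b2 = np.polyfit(R, L, 2)

print("linear coefficients   :", b1)
print("quadratic coefficients:", b2)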
To incorporate time-varying skewness in stock returns, the authors consider corporate credit risk by using the Merton (1974) model. In this model, the firm's asset value
follows, as in the Black-Scholes model, a geometric Brownian motion,
and the equity value at the maturity date is a European call option on the firm value with
strike equal to the debt (a zero-coupon bond). For firms with high credit risk, the
increased probability of default is reflected in a strongly negative skew of the return distribution. The forward value of equity is then given by the expected value of the call
option discounted with the SDF M = L under P. This forward value, together with
the call option value, defines firm i's excess equity return R^e_i. The expected gross return
is then given by (3.20), where the aforementioned linear and quadratic
approximations are used for the SDF. In the linear case, the CAPM, the betas increase with credit
risk - asset volatility or leverage - and with the firm's correlation to the market. Comparing
this beta with the skew-adjusted one, the latter is in general lower
than the CAPM beta. The difference increases with credit risk: the stronger the skew, the more
the firm becomes an idiosyncratic risk factor and hence the less it is connected
to the market. In this sense the CAPM approximation leads to
an overestimation of expected equity returns, and the overestimation grows with deteriorating
credit quality.
Schneider et al. (2016) consider their model's implications for low risk anomalies.
The first is the so-called Betting-Against-Beta (BAB) strategy, see Frazzini and
Pedersen (2014), which is based on the empirical observation that stocks with low CAPM betas
outperform high-beta stocks. The BAB strategy therefore goes long a portfolio of low-beta
stocks and short a portfolio of high-beta stocks. To reach an overall beta of zero, the strategy
takes a larger long position than short position; the strategy is financed with riskless
borrowing. Frazzini and Pedersen (2014) document that the BAB strategy produces
significant profits across a variety of asset markets. Indeed, the SML for US stocks is
too flat relative to the standard CAPM, while


under a CAPM with restricted borrowing the deviation is smaller. Using a model
and empirical evidence from 20 international stock markets, Treasury bond markets,
credit markets, and futures markets, Frazzini and Pedersen (2014) tackle the following
questions:
How can an unconstrained arbitrageur exploit this effect, i.e., how do you bet against
beta?
What is the magnitude of this characteristic relative to the size, value, and momentum effects?
Is betting against beta rewarded in other countries and asset classes?
How does the return premium vary over time and in the cross section? Who bets
against beta? (Frazzini and Pedersen (2014))
They find that for all asset classes, alphas and Sharpe ratios decline almost monotonically
in beta. Alphas decrease from low-beta to high-beta portfolios for US
equities, international equities, Treasuries, credit indices by maturity, commodities, and
foreign exchange rates.
Constructing the BAB factors within 20 stock markets, they find for the US a Sharpe
ratio of 0.78 between 1926 and March 2012, which is twice as much as the value effect
and still 40% larger than momentum. The results for international assets are similar.
Furthermore, the authors state that BAB returns are consistent across countries and
time, within deciles sorted by size, and within deciles sorted by idiosyncratic risk, and are
robust to a number of specifications. These consistent results suggest that coincidence or
data mining are unlikely explanations.
The BAB strategy is rationalized in the model of Schneider et al. (2016) as follows.
The CAPM betas increase, for fixed credit risk (fixed volatilities and leverage), with the
firm's correlation to the market: buy stocks with low and sell stocks with high correlation
to the market. The alpha of this strategy, the excess expected return relative to market
covariance risk, is given by the firm's expected return compensation for skewness. These typically
positive alphas increase with increasing credit risk. Summarizing, the BAB returns can
be directly related to the return skewness induced by credit risk. The authors then
perform an empirical analysis to support the theoretical findings.
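A stylized sketch of the BAB construction in Python. This is not the exact Frazzini-Pedersen rank-weighting scheme: betas are assumed to be estimated beforehand, and each leg is simply levered by its inverse beta so that the overall beta is zero:

import numpy as np

def bab_weights(betas: np.ndarray) -> np.ndarray:
    """Long the low-beta half, short the high-beta half, levered to zero beta."""
    order = np.argsort(betas)
    n = len(betas) // 2
    low, high = order[:n], order[n:]

    w = np.zeros_like(betas, dtype=float)
    w[low] = 1.0 / n       # equal-weighted long leg
    w[high] = -1.0 / n     # equal-weighted short leg

    # Lever each leg by its inverse average beta so the portfolio beta is zero.
    w[low] /= betas[low].mean()
    w[high] /= betas[high].mean()
    return w

betas = np.array([0.5, 0.7, 0.9, 1.1, 1.3, 1.5])
w = bab_weights(betas)
print(w, "portfolio beta:", w @ betas)   # beta ~ 0; long leg larger than short leg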

3.2.4 What Happens if an Investment Strategy is Known to Everyone?

We follow Asness (2015), who considers the value risk factor - that is to say, bets that
cheap stock investments will beat expensive investments. Although it is obviously best
if you are the only one to know a strategy, it is not clear what happens to a successful
strategy once it becomes more and more widely known. Intuitively, at the beginning of a
strategy one faces true alpha. Once the strategy becomes more and more widely known


it may continue to work, but possibly no longer in its pure form but rather - for example
- by tilting the strategy. The pure alpha strategy then moves toward a beta strategy,
along with all its possible transition states, including alternative risk premia, smart beta,
and many more.
A strategy can continue working after it has become known to the public for one
of two different reasons. The first reason is that the investor is receiving a rational risk
premium: the strategy exists in equilibrium. If the long (cheaper) stocks are more risky
than the short and more expensive stocks on a portfolio level, in a way that cannot be diversified
away, then it is rational that there is a persistent risk premium.
The second reason takes the form of a possible behavioral explanation: investors
make - from a rational point of view - errors. The long stocks have a higher expected
return not because they are riskier, but because of these errors - the stocks are too cheap
and one earns a return when they return to their rational value. The relative impact of
these explanations can vary over time. During the tech bubble of 1999-2000, cheap
value stocks - which typically are cheaper because they are riskier - were cheaper because
investors were making errors.
The two explanations behave differently when a strategy becomes known. In the
rational model explanation, the value strategy will still work even if it becomes known.
There is no reason that, in equilibrium, a strategy should disappear once it is known.
But the extent of the risk premium can indeed change if knowledge of a strategy changes,
simply because the supply and demand for the assets change. In equilibrium, risk remains. Since the factor is linked to the index of bad times, the risk of the strategy should
not be considered primarily as a measure of small variations in the returns, but rather
as a measure of the pain in bad times. The equilibrium property conserves both the
expected return and the risk of the strategy.
In the behavioral explanation, the risk source is not systematically linked to the return in equilibrium. It is therefore very difficult to be convinced that this risk remains
stable over time. There is no systematic component - demand and supply - as in the
equilibrium model to guarantee that the risk premium will not go away.
Asness (2015) compares these different views using historical data. He works with
the Sharpe ratio. If a strategy's becoming more common has an impact on the risk
premium, the Sharpe ratio is expected to fall, either because the excess return diminishes
or because the risk increases. With regard to the returns, one could argue that if the
value strategy becomes more popular, then the value spread between the long and short
sides of the strategy gets smaller. This spread measures how cheap the long portfolio
is versus the short portfolio. If more and more investors invest in this strategy,
which means buying the long side and selling the short side, then both sides face a price
movement - the long side is bid up and the short side is bid down. This reduces the value spread.


Asness (2015) provides empirical evidence for the value spread by using the FF approach to value factor construction. He calculates the ratio of the summed book-to-market
ratio (BE/ME) of the cheapest one-third of large stocks over the BE/ME of the most expensive one-third of large stocks. Clearly, the cheaper stocks will always have a higher BE/ME than
the expensive stocks, but the interesting point is to compare how the ratio of large-cheap
over large-expensive changes over time as an approximation of the attractiveness of the
value strategy. The result, taking the last 60 years into consideration, is that the ratio
was very stable, with a 60-year median value of 4. There is no downward or upward
trend. The only two periods during which the ratio grew significantly - reaching a value
of 10 - correspond to the dot-com bubble and the oil crisis of 1973. This measurement
shows little evidence that the simple value strategy was arbitraged away in the last 60
years.
To analyze the risk dimension, the annualized, rolling, 60-month realized volatility
of the value strategy over the last 56 years is considered. Again, the technology-driven
dot-com bubble is the strongest outlier, followed by the GFC and the 1973 oil crisis. There
is again little evidence that the volatility of the strategy is steadily rising or falling. But
the attractiveness of a strategy is best measured by the in- and outflows of investment
in the strategy. Increasing inflows should, on a longer time scale, increase the return of
a strategy, and the opposite holds if large outflows occur. This was not observed in the
above return analysis.

3.3 Absolute Pricing: Optimal Investment Strategy and Rebalancing

Investors are not only interested in the asset price dynamics in equilibrium; they are also
interested in the optimal investment strategy for their portfolio. Merton laid the foundations in
his works from 1969 and 1971 (Merton [1969, 1971]). The rational agents, as in the last
section, optimize their lifetime expected utility of consumption by choosing their optimal
consumption path and optimal investment portfolio.
The work of Merton triggered a myriad of academic papers, and more continue to
appear even today. These papers differ from one another in many respects, including:
Which innovation risk sources are considered.
How the agents differ in their preferences.
How long the investor's time horizon for optimal investment is.
How much investors are allowed to differ in their preferences.
Whether or not uncertainty matters.



Fortunately, for many models the optimal investment strategy weights are of the same
structural form. To state the structural form we first define some investment strategies:
Definition 3.3.1. A static strategy (buy and hold) is the choice of a portfolio at
initiation without changing the portfolio weights in the future. A rebalancing strategy
is a constant proportion trading strategy where the portfolio weights in the assets do not
vary. Myopic strategies are strategies that are independent of returns that are ahead
more than one period.
At time t, the optimal strategy \alpha(t), for most models, consists of two parts:

\alpha(t) = \text{Short-Term Weight} + \text{Opportunistic Weight}   (3.21)

The short-term weight is also called the myopic investment demand, and the opportunistic weight the hedging demand or long-term weight.
The underlying intuition for the structure of equation (3.21) is due to the Principle
of Optimality of R. Bellman, see Section 2.7.1.

3.3.1 General Rebalancing Facts

While the two terms in (3.21) differ substantially in their detailed form across
models, certain facts hold in general. We write (3.21) in the more explicit form:

\alpha(t) = \mathrm{MPR}\cdot\mathrm{RRA}^{-1} + (1 - \mathrm{RRA}^{-1})\, Y\, \mathrm{RIRA}^{-1}   (3.22)

where:
\mathrm{MPR} = \frac{\mu_t - r_t}{\sigma_t^2} is the Market Price of Risk.
\mathrm{RRA}^{-1} is the inverse relative risk aversion - the investor's risk tolerance.
\mathrm{RIRA}^{-1} is the inverse relative innovation risk aversion.
Y is the hedge of the innovation risk factors.
If the opportunity set is constant, Y = 0, and optimal investment is always equal
to short-term or myopic investment. The myopic component \frac{\mu_t - r_t}{\sigma_t^2}\,\mathrm{RRA}^{-1} reflects the
demand for the risky asset due to its risk premium and is directly proportional to the investor's risk tolerance. This component equals the optimal solution of a one-period
model, which motivates calling it the myopic component. If the expected return is larger
than the risk-free rate, the investor will be invested in the risky asset. The
second component, (1 - \mathrm{RRA}^{-1})\,Y\,\mathrm{RIRA}^{-1}, also called the intertemporal hedging demand, represents the desire to hedge against future changes in the opportunity set. The
investor's preference for this hedging demand and the way the investor
uses information to form expectations about the evolution of the investment opportunity set


lead to substantially different forms of this investment component.


Equation (3.22) shows how the asset allocation should be tactically managed through
time. In particular, the allocation should change over time if some parameters change.
We therefore define
Definition 3.3.2. The expression in equation (3.22) defines the theoretical TAA.
The myopic part of the optimal investment rule is the one-period TAA which is
frequently used in practice, see Section 2.8.5. The main reason for not considering the
intertemporal hedging demand part is complexity and uncertainty.
We comment on this basic optimal strategy formula (3.22).
First, the optimal investment strategy is time-varying. In general, rebalancing of
the portfolio is optimal. Do investors rebalance (enough)?
Second, rebalancing is countercyclical if an investor wants to maintain fixed portfolio
weights. Consider a single risky and a single risk-free asset and an investor who wants to
keep a fixed (60/40) portfolio. Hence, \alpha(t) is the fraction of wealth invested in the risky
asset and 1 - \alpha(t) the fraction invested in the risk-free one. Assume that the investor's
preferences and environment only lead to a short-term weight in (3.21). This weight is
proportional to the market price of risk. The investor starts a period with the
required (60/40) portfolio. If the risky asset performed well in the previous period, the
market price of risk takes a large value (large returns; low volatility). Hence the risky
asset's weight increases and the risk-free asset's weight decreases before rebalancing. Rebalancing then means that the investor sells the risky asset and buys the risk-free one.
Third, even if transaction costs are considered, rebalancing remains optimal. Only
the frequency and the strength of the rebalancing change. While in the past transaction
costs were significant even for liquid instruments, these days the potential economic
loss incurred by not rebalancing a position outweighs the transaction costs.
Fourth, which of the two components in (3.21) is more important for investors? Some
academic authors state that the opportunistic weight can even be twice as large as the
short-term weight. Others deliver much smaller estimates for the hedging demand. The
size of the opportunistic weight is driven by two factors: predictability and investment
opportunity. The closer asset returns are to being unpredictable, or the less investors
consider their stochastic opportunity set variations, the lower is the opportunistic component. In the extreme case where returns are not predictable, or stochastic opportunities
are not changing over time, or the investor has a logarithmic utility function (academic
case), the long-term investment strategy part in (3.21) vanishes and it is optimal to
invest myopically. If investment opportunity sets vary over time, a repeated application
of one-period optimal portfolios differs from the optimal portfolio which considers



multiple periods: the long-term optimal weights aim not only at mean-variance
efficiency but also at providing a hedge against the changing investment opportunity set.
Fifth, the parameters in the MPR are in general time dependent. If the expected
return is larger than the risk-free rate or if the risky asset's volatility decreases, invest
more in the risky asset. If the risky asset's expected return is low enough or even negative,
go short on the risky asset and use the money raised to invest more than 100 percent
of the capital in the risk-free asset. If there is more than one risky asset, the MPR keeps its
form, but the division by the variance is replaced by a multiplication with the information
matrix C^{-1}, the inverse of the covariance matrix:

\mathrm{MPR} = C^{-1}(\mu_t - r_t) .   (3.23)

Comparing this with the solution of the Markowitz problem (2.56), which is also proportional
to C^{-1}\mu when there is no risk-free asset, shows that the first component of the optimal investment
strategy (3.21) defines a mean-variance efficient portfolio.
The MPR is proportional to the Sharpe ratio; the difference is the risk measuring
stick (volatility versus variance). In this sense, the portfolio theory of Merton's seminal
work, without innovation risk factors, rationalizes the Sharpe ratio and extends the Markowitz
model to many-period investing.
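A minimal sketch of the myopic component in the multi-asset case; all inputs (expected returns, covariance matrix, risk aversion) are made-up numbers:

import numpy as np

mu = np.array([0.06, 0.04])          # expected asset returns (assumed)
r = 0.02                             # risk-free rate
C = np.array([[0.0324, 0.0054],      # covariance matrix (assumed):
              [0.0054, 0.0100]])     # vols 18% and 10%, correlation 0.3
rra = 2.0                            # relative risk aversion

mpr = np.linalg.solve(C, mu - r)     # MPR = C^{-1}(mu - r), cf. (3.23)
w_myopic = mpr / rra                 # myopic weight = MPR * RRA^{-1}
print(w_myopic, "risk-free weight:", 1 - w_myopic.sum())  # can be levered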
Sixth, the inverse relative risk aversion measures the curvature of the utility function as a function of wealth: for a log-utility investor, \mathrm{RRA}^{-1} equals 1. The more
risk averse an investor is, the lower \mathrm{RRA}^{-1} is and the more is optimally invested in the
risk-free asset. The notion of relative risk aversion raises two delicate issues. First, there
is a calibration result by Rabin (2000) showing that expected-utility theory is 'an utterly implausible explanation for appreciable risk aversion over modest stakes'. That is,
explaining risk aversion solely by the curvature of the utility
function leads to unreasonable results. Second, the measurement of RRA is, in itself,
a delicate matter.
Seventh, the opportunistic weight consists of three different terms. First, the expression 1 - \mathrm{RRA}^{-1} is meant literally in the sense that in some models, if the investor
gets more risk averse and \mathrm{RRA}^{-1} decreases, the myopic component in the optimal
portfolio becomes less important whereas the long-term or opportunistic weight is attributed more weight. Second, the aversion to innovation risk sources. Third, a hedging
demand against innovation risk. The latter is proportional to \mathrm{cov}(R^e, R_I) in (3.16) - that is to
say, the hedging demand follows from the correlation pattern of the innovation portfolio
return with the overall portfolio return.
Investors will increase their holding of the risky asset if it
covaries negatively with state variables that matter in the investor's value function.
A bond is such a hedge against falling interest rates.

3.3.2 Convex and Concave Strategies

We compare three strategies:
Do nothing (buy-and-hold) [assuming that in (3.22) all parameters are constant];
Buy falling stocks, sell rising ones (constant-mix strategies) [contrarian view to the myopic part of (3.22)];
Sell falling stocks, buy rising ones (portfolio insurance strategies) [assuming that the myopic part of (3.22) holds].
We follow Perold and Sharpe (1988) and Dangel et al. (2015). They consider buy-and-hold, constant mix (say 60/40 strategies), constant-proportion portfolio insurance, and
option-based portfolio insurance. We focus first on the first two strategies, considering a risky stock (market) S and a risk-free asset B. In the payoff diagrams, the value
of the assets is shown as a function of the value of the stock; in the exposure diagrams, the
ratio of dollars invested in stocks to total assets is shown. This illustrates
the underlying decision rule.
The payoff diagram for the 60/40 rule is a straight line with a slope of 0.6: the
maximum loss is 60% of the initial investment (what remains is the investment in the risk-free asset,
i.e. 40% of the inception value of total assets, if the stocks become worthless) and the upside is unlimited,
see Figure 3.2. The exposure diagram is also a straight line in the space where the value
of the assets is related to the position or weight in stocks. For a buy-and-hold
strategy, the slope is 1 and the line intersects the asset-value axis at 40 USD: if the
portfolio is worth less than 40 USD, the demand for the stock is zero. For a constant mix strategy,
the slope in the exposure profile is 0.6 and the line passes through the origin. Hence, investors with a
constant mix strategy will hold stocks at all asset levels.
If there is no volatility in the market, i.e. stocks either rise or fall forever, then the buy-and-hold payoff always dominates the constant mix portfolio. With volatile markets,
the statement is no longer true; the success of the strategy depends on the paths of asset
prices. Since rebalancing is the same as a short volatility strategy, see Section 3.3.4, a
constant mix portfolio tends to be the superior strategy if markets show reversal behavior
rather than trends. If trends dominate, then buy-and-hold is superior.
Figure 3.2: Payoff and exposure diagrams for constant mix and buy-and-hold strategies
(adapted from Perold and Sharpe [1988]). The left panels show the payoff diagram for
the 60/40 buy-and-hold strategy and the exposure diagrams for the 60/40 strategy, once
buy-and-hold and once dynamic, that is, constant mix. The upper right panel shows
the superiority of the buy-and-hold strategy when there are only trends, and the lower
panel shows that the constant mix strategy can dominate the buy-and-hold one under
volatility, depending on the stock price path.

The performance of rebalancing depends on the investment environment: different
economic and financial market periods lead to different results. Ang (2013) compares the
period 1926-1940 with the period 1990-2011. He compares buy-and-hold investments in
US equities / US Treasury bonds and pure investments in the two asset classes with the
rebalanced (60/40) investment portfolio in the two assets. As a result, the countercyclical
behavior of rebalancing smoothes the individual asset returns. It leads to much lower
losses after the stock market crash in 1929, but it was not able to follow the strong stock
markets before the crash when compared to the static strategy. The rebalancing strategy
also leads to much less volatile performance than the pure equity or bond strategy.
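A small simulation contrasting the two decision rules; the AR(1) return dynamics and all parameters below are chosen for illustration only:

import numpy as np

def terminal_wealth(returns, w=0.6, rebalance=True, rf=0.0):
    """Terminal wealth of a 60/40 portfolio, rebalanced each period or not."""
    stock, bond = w, 1 - w
    for r in returns:
        stock *= 1 + r
        bond *= 1 + rf
        if rebalance:                      # reset to the 60/40 mix
            total = stock + bond
            stock, bond = w * total, (1 - w) * total
    return stock + bond

rng = np.random.default_rng(1)
for phi, label in [(0.5, "trending"), (-0.5, "mean-reverting")]:
    rb, bh = [], []
    for _ in range(2000):
        eps = rng.normal(0, 0.05, 120)
        r = np.zeros(120)
        for t in range(1, 120):            # AR(1) monthly returns
            r[t] = phi * r[t - 1] + eps[t]
        rb.append(terminal_wealth(r, rebalance=True))
        bh.append(terminal_wealth(r, rebalance=False))
    # Constant mix tends to win under reversals, buy-and-hold under trends.
    print(label, "constant mix:", round(float(np.mean(rb)), 3),
          "buy-and-hold:", round(float(np.mean(bh)), 3))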

We consider next the third alternative - portfolio insurance. Maximizing expected
return with constant absolute risk aversion implies that optimal static sharing rules are
linear in the investment's payoff: it is optimal to hold a certain fraction of a risky investment rather than negotiating contracts with nonlinear payoffs. This also holds in
some dynamic models such as the basic Merton model (1971). If investment opportunity
sets are not changing, the proportions of risky and risk-free assets are kept unchanged
over time. But this requires portfolio rebalancing: buying the risky asset when
it decreases in value and selling it when prices increase - the constant mix
strategy. Theoretically, with this strategy investors invest in risky assets even in
stressed markets. In practice, on the contrary, there is a strong demand for portfolio


insurance, since investors have a considerable downside-risk aversion. Therefore, a rebalancing method opposite to the constant mix is required: selling stocks as they fall.
Returning to the three alternatives - do nothing, buy (sell) stocks as they fall (rise),
sell (buy) stocks as they fall (rise) - the payoffs of the strategies are linear, concave, or
convex. The last strategy is called convex since the payoff function increases at an
increasing rate as the stock value increases. The kind of rebalancing therefore has itself
an impact on the payoff, without reference to a specific decision rule. Concave
strategies such as the constant mix strategies are the mirror image of convex strategies such as portfolio insurance: the buyer of one strategy is the seller of the
other.
Summarizing, buying stocks as they fall leads to concave payoff curves, which are good
strategies in markets with no clear trend since the principle 'buy low, sell high' applies. In
markets under stress, the losses are aggravated since more and more assets are bought.
The convex payoff of portfolio insurance strategies limits the losses in stressed markets
while keeping the upside intact. If markets oscillate, their performance is poor.
There are many ways to construct convex payoffs:
Stop-loss strategies. The investor sets a minimum wealth target or floor that must
be exceeded by the portfolio value at the investment horizon. This strategy is
simple, but once the loss is triggered the portfolio is no longer invested in the
risky asset and hence participation in a recovery of the risky asset is not possible.
In the option-based approach one buys a protective put option. While simple,
this strategy has several drawbacks. First, buying insurance when it is cheap - when
stock markets boom - acts against many investors' behavior.
Second, buying an at-the-money option is expensive compared to the expected
risky asset return, and since one has to roll the strategy the costs multiply. Therefore,
such option-based strategies are often used in long-short combinations (buying an
out-of-the-money put and selling an out-of-the-money call).
Constant Proportion Portfolio Insurance (CPPI). This strategy is a simpler version
of the protective put strategy.

3.3.3 Do Investors Rebalance (Enough)?

Roughly, the studies that consider this question report that most investors rebalance too
infrequently, and - if they do rebalance - the rebalancing amount can be far from optimal.
Further, there seem to be cultural as well as investor segmentation differences.
Brunnermeier and Nagel (2008) report that for US households the dominant behavior is inertia, not rebalancing. Calvet et al. (2009) found, on the other hand,
that Swedish households show a strong propensity to rebalance. But institutional investors too can fail to rebalance optimally. CalPERS, the California Public Employees'



Retirement System, invested - during the Great Financial Crisis (GFC), 2008 to 2009 -
pro-cyclically rather than anti-cyclically. As a result, its equity portfolio of USD 100 bn
in 2007 had lost 62 percent of its value by 2009.

3.3.4 Rebalancing = Short Volatility Strategy

We show that rebalancing is the same as a short volatility trading strategy. A short
volatility strategy here means that the investor sells out-of-the-money call and put options. Since the price of an option is in a 1:1 relation with volatility, shorting a call
is the same as shorting volatility. We follow Ang (2013). The example considers a single
risky asset and a risk-free bond that pays 10 percent each period in a two-period binomial
model. The stock starts with a value of 1 and can go up or down in each period with the
same probability of 50 percent (see the data in Figure 3.3). If an up state is realized, the
stock value doubles; otherwise the stock loses half of its value.
Using these assumptions, wealth projections for the buy-and-hold strategy follow at
once. The value in the node up-up, that is, 2.884, follows from

2.884 = 1.64\,(0.7317 \cdot 2 + 0.2683 \cdot 1.1),

where 1.64 is the wealth level of the previous node; 2 and 1.1 are the gross returns of the
risky asset (up) and the risk-free asset, respectively; and 0.7317 = 0.6 \cdot 2/1.64 is the
equity weight after the first period. The rebalancing dynamics are calculated in the
same way but with fixed proportions in the two assets.
The payoffs after period 2 show that rebalancing adds more value on the sideways
paths but less value at the extremes (up-up or down-down) compared to the buy-and-hold strategy. This transforms the linear buy-and-hold strategy - payoff as a
linear function of the stock value - in a non-linear way. Precisely, consider a European
call option with strike 3.676 at time 2 and a European put option with strike
0.466. The option prices at date 0 and date 1 follow from no-arbitrage pricing.
Consider the following two strategies:
A rebalancing strategy.
A short call + short put + long bond + long buy-and-hold strategy. The first two
positions are the short volatility strategy.
A calculation - see Ang (2013) - shows that:
Both strategies start with the same value 1 at time 0.
Both strategies attain the same values in all 3 states at time 2.
Therefore the two strategies are identical. This shows that a short volatility strategy,
financed by bonds and the buy-and-hold strategy, is the same as a rebalancing strategy.

Figure 3.3: Rebalancing as a short volatility strategy in a binomial tree model (panels:
stock dynamics, buy-and-hold wealth dynamics, rebalancing wealth dynamics). Left are
the risky asset's dynamics, in the middle are the wealth values if a buy-and-hold strategy
(60/40) is used, and right are the wealth levels for a strategy rebalanced to fixed (60/40)
weights. Note that up-down and down-up lead to the same stock value; both paths result
in a stock value of 1 after period 2 (Ang [2013]).
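The wealth trees in Figure 3.3 can be reproduced in a few lines (a sketch of the Ang (2013) example with u = 2, d = 0.5, bond return 10% and 60/40 weights):

u, d, rf, w = 2.0, 0.5, 0.10, 0.6

def buy_and_hold(path):
    stock, bond = w, 1 - w          # let the weights drift
    for s in path:
        stock *= s
        bond *= 1 + rf
    return stock + bond

def rebalanced(path):
    wealth = 1.0                    # reset to 60/40 every period
    for s in path:
        wealth *= w * s + (1 - w) * (1 + rf)
    return wealth

for path, name in [((u, u), "up-up"), ((u, d), "up-down"), ((d, d), "down-down")]:
    print(name, round(buy_and_hold(path), 3), round(rebalanced(path), 3))
# up-up: 2.884 vs 2.690; up-down: 1.084 vs 1.214; down-down: 0.634 vs 0.548.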

Since volatility is a risk factor and rebalancing means being short volatility, the investor
automatically earns the volatility risk premium. The short volatility strategy makes the
payoff in the center of the probability distribution larger at the cost of the extreme
payoffs. Short volatility, or rebalancing, underperforms buy-and-hold strategies if markets
are either booming or crashing, but it performs well if markets show time reversals.

3.3.5 Rebalancing: A Source for Portfolio Return?

Does portfolio rebalancing generate alpha? There exists considerable confusion about the
possible answers to this question. One reason is the difference between geometric and arithmetic returns. We refer to Hallerbach (2014), Blitz (2015), Hayley



(2015), White (2015) and Qian (2014). Consider a risk-free asset which for simplicity pays zero interest and a risky asset S_t which follows a lognormal diffusion process (the
Black-Scholes setup) with periodic drift \mu and constant variance \sigma^2, so that the solution
of dS_t/S_t = \mu\,dt + \sigma\,dW_t, S_0 = s, with W a standard Brownian motion, is
S_t = s\,e^{(\mu - \frac{1}{2}\sigma^2)t + \sigma W_t}. Compare the
buy-and-hold strategy (BH) with the rebalancing strategy (RB) where the weights are
kept fixed at each date. We write \alpha for the fixed proportion of wealth invested in the
risky asset and 1 - \alpha for the risk-free asset. Calculating the terminal wealth at time T
for the two strategies implies:

V^{RB}_T = (S_T)^{\alpha}\, e^{\frac{1}{2}\alpha(1-\alpha)\sigma^2 T}, \qquad V^{BH}_T = (1-\alpha) + \alpha S_T .   (3.24)

The exponential correction factor of the rebalanced terminal wealth is maximized for \alpha = 1/2. This implies

\ln \frac{V^{RB}_T}{V^{BH}_T} = \frac{1}{2}\alpha(1-\alpha)\sigma^2 T + \ln \frac{(S_T)^{\alpha}}{(1-\alpha) + \alpha S_T} .   (3.25)

This shows that the relative wealth depends positively on volatility, on the length of the time
horizon, and on the stochastic path of the risky asset prices. The fact that volatility positively affects the terminal wealth value under rebalancing leads to so-called volatility
harvesting strategies. We stress that trading strategies which generate growth through
rebalancing require specific market dynamics to persist. They are conceptually not
different from a simple directional trade, since we bet on market dynamics rather than
market direction. If we fail to bet on the right dynamics, losses follow, which means that
volatility harvesting is not an arbitrage; see the remarks at the end of the last section.
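A quick Monte Carlo check of (3.24) (a sketch; the parameters are arbitrary and daily rebalancing approximates the continuous-time strategy):

import numpy as np

mu, sigma, T, n, alpha = 0.05, 0.25, 5.0, 5 * 252, 0.5
dt = T / n
rng = np.random.default_rng(2)

z = rng.normal(size=n)
rets = mu * dt + sigma * np.sqrt(dt) * z   # discretized GBM returns
S_T = np.prod(1 + rets)                    # terminal stock value, S_0 = 1

V_bh = (1 - alpha) + alpha * S_T           # buy-and-hold
V_rb = np.prod(1 + alpha * rets)           # rebalanced to constant alpha
V_rb_formula = S_T**alpha * np.exp(0.5 * alpha * (1 - alpha) * sigma**2 * T)

print(V_bh, V_rb, V_rb_formula)            # V_rb ~ formula up to discretization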

3.3.5.1 Rebalancing, Volatility Drag

The reason for the wealth growth difference is the so-called volatility drag, which can
be understood via the relation between arithmetic means (AM) and geometric means (GM), or
between simple and compounded rates. Consider a one-period investment of USD 1
with final value 1 + r, with r the stochastic growth rate. The expected geometric mean
(GM) is approximated by the expected arithmetic mean (AM) minus half of the average


variance:

E(\mathrm{GM}) \approx E(\mathrm{AM}) - \frac{\sigma^2}{2} .   (3.26)

To see this, write the GM over T periods as \mathrm{GM} = \left( \prod_{k=0}^{T-1}(1+R_k) \right)^{1/T} - 1, so that
\log(1+\mathrm{GM}) = \frac{1}{T}\sum_{k=0}^{T-1}\log(1+R_k). Expanding the logarithm around 1 up to second order gives

\log(1+\mathrm{GM}) \approx \frac{1}{T}\sum_{k=0}^{T-1}\left( R_k - \frac{R_k^2}{2} \right) = E(R) - \frac{\mathrm{var}(R) + (E(R))^2}{2} ,

where the first and third terms are the first two terms of the Taylor series of \log(1+E(R)), which proves the claim.
Thus the expected GM decreases with increasing asset volatility, since GM is a concave function of terminal wealth. This effect is weaker for BH portfolios. But portfolio rebalancing
boosts expected terminal wealth if the autocorrelation in relative asset returns is negative
- buy low, sell high. There is a third effect if several assets are considered - the return
effect due to differences in asset returns.

Portfolio rebalancing or volatility harvesting is an answer to the question of what the
impact of periodic rebalancing on the growth rate of a portfolio is. There exists
confusion about the terminology. Terms such as 'diversification return' (Booth and Fama
(1992)) or 'rebalancing premium' (Bouchey et al. (2012)) are used interchangeably for
the growth rate that a rebalanced portfolio can earn in excess of a buy-and-hold portfolio,
and also refer to causes such as diversification which are neither necessary nor sufficient
to describe the growth rate. Finally, Hallerbach (2014): '..., the literature is also confused
in specifying this excess growth rate from rebalancing.'
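The volatility drag in (3.26) is visible in a two-line example with made-up returns:

import numpy as np

rets = np.array([0.50, -0.50, 0.50, -0.50])    # AM = 0, but wealth shrinks
am = rets.mean()                               # arithmetic mean: 0.0
gm = np.prod(1 + rets) ** (1 / len(rets)) - 1  # geometric mean: -0.134
drag = rets.var() / 2                          # sigma^2 / 2 = 0.125
print(am, gm, am - drag)                       # gm ~ am - drag (approximation)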
3.3.5.2 Rebalancing, Volatility and Dispersion Return

We follow Hallerbach to analyze the full return from rebalancing and its decomposition
into a volatility return and a dispersion discount.
Consider a portfolio with value V_p(t) at time t, weight \omega_i(t) = V_i(t)/V_p(t) for asset i, and
asset i return r_i(t) in period t. In a rebalanced portfolio, the weight in period t+1 before
the rebalancing trade is

\omega_i(t+1) = \frac{1 + r_i(t)}{1 + r_p(t)}\, \omega_i(t) .

Hence, RB to the initial weights implies selling (buying) assets that realized returns above
(below) the portfolio return. When we do not rebalance, the weights are

\omega_i(t+1) = \left( \frac{1 + g_i(t)}{1 + g_p(t)} \right)^{t+1} \omega_i(0)



where g_i is the growth rate of asset i and g_p of the BH portfolio, since V_i(t) = V_i(0)(1+g_i)^t
holds for a buy-and-hold portfolio with constant growth rate. Hence, if there is cross-sectional
variation in growth rates, the security with the highest growth rate will dominate
the portfolio. A BH portfolio's growth rate is therefore driven by portfolio concentration,
whereas RB periodically counteracts this concentration force. This reduces the portfolio's growth rate - the dispersion discount.
Consider the portfolio return r_p(t) = \sum_i \omega_i(t) r_i(t) in period t, where the sum is over all
N assets and the weights add up to one and are all non-negative (no short selling). When
the rebalancing period matches the return measurement period and the portfolio is
rebalanced in each period to its starting weights, the arithmetic mean rebalanced portfolio
return and its variance read

E(r_{p,RB}) = \sum_i \omega_i(0) E(r_i) , \qquad \sigma^2_{p,RB} = \langle \omega_0, C\, \omega_0 \rangle .   (3.27)

Inserting these expressions into the approximation of the geometric mean (3.26), which
holds for a single asset as well as for a portfolio, we get the approximation for the growth rate of
the rebalanced portfolio g_{p,RB}:

g_{p,RB} \approx \sum_i \omega_i(0) g_i + \frac{1}{2} \left( \sum_i \omega_i(0) \sigma_i^2 - \sigma^2_{p,RB} \right) .   (3.28)

The volatility return is defined as the difference between g_{p,RB} and the weighted average
of the securities' growth rates, \bar{g} := \sum_i \omega_i(0) g_i, which implies

\text{vol return} := g_{p,RB} - \bar{g} \approx \frac{1}{2} \left( \sum_i \omega_i(0) \sigma_i^2 - \sigma^2_{p,RB} \right) ,   (3.29)

where this difference generates the additional growth rate due to the volatility pumping in
the rebalanced portfolio. The following statements are immediate or follow from Jensen's
inequality.
Proposition 3.3.3. Consider the volatility return in (3.29). If there is risk and not all
volatilities are the same, then the volatility return is positive. Ceteris paribus, a decrease in
pairwise correlations (greater diversification), an increase in the rebalancing frequency, or a
negative return autocorrelation all increase the volatility return.
If the portfolio weights are chosen to be equal and the pairwise correlation \rho is the same
across assets, the volatility return reads

\text{vol return} \approx \frac{1}{2}\, \frac{N-1}{N}\, (1-\rho) \left( \bar{\sigma}^2 + \sigma^2_{CS} \right)   (3.30)

where \bar{\sigma} is the securities' average volatility and \sigma^2_{CS} is the cross-sectional variance of
these volatilities. The statements of the last proposition can be directly read off from


this expression.
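A numerical illustration of (3.29) and the equal-weight approximation (3.30); the volatilities and the common correlation are made-up inputs:

import numpy as np

N = 4
sig = np.array([0.10, 0.15, 0.20, 0.25])   # assumed asset volatilities
rho = 0.3                                  # common pairwise correlation
w = np.full(N, 1 / N)                      # equal weights

C = rho * np.outer(sig, sig)
np.fill_diagonal(C, sig**2)                # covariance matrix
var_p = w @ C @ w

vol_return_exact = 0.5 * (w @ sig**2 - var_p)          # equation (3.29)
sig_bar, sig_cs2 = sig.mean(), sig.var()               # average vol, CS variance
vol_return_approx = 0.5 * (N - 1) / N * (1 - rho) * (sig_bar**2 + sig_cs2)  # (3.30)
print(vol_return_exact, vol_return_approx)             # ~0.0093 vs ~0.0089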
To derive the dispersion discount, one starts with a BH portfolio and its compounded
return over T periods:

(1 + g_{p,BH})^T = \sum_i \omega_i(0)(1 + g_i)^T \geq \left( \sum_i \omega_i(0)(1 + g_i) \right)^T = (1 + \bar{g})^T ,   (3.31)

where we used Jensen's inequality. Therefore, the BH growth rate is never lower than
the weighted average of the securities' growth rates. The difference

\text{disp discount} := g_{p,BH} - \bar{g} \approx \frac{1}{2}(T-1)\, \sigma^2_{\omega_0}(g)   (3.32)

defines the dispersion discount. \sigma^2_{\omega_0}(g) is the weighted variance of the securities' growth
rates around their weighted average, with the initial portfolio weights \omega_i(0) as weights.
Using this expression, the following statements are immediate.
Proposition 3.3.4. Consider the dispersion discount in (3.32). The discount is always
positive; the larger T or the larger the cross-sectional variance \sigma^2_{\omega_0}(g), the larger is the
discount.
The rebalancing return is then defined as the difference between the volatility return
and the dispersion discount. Since both terms are positive, the sign of the rebalancing
return is ambiguous. In other words, volatility harvesting is a risky strategy and there
is, as we stated at the beginning, no free lunch, since volatility harvesting is a bet on the
dynamics of the portfolio.
Empirical results confirm the theoretical results. The rebalancing return is sometimes
positive and sometimes negative. We conclude that rebalancing is neither theoretically
nor empirically a reliable source of return.
3.3.5.3 Rebalancing, Leverage

We considered in Section 2.1.3 the impact of leverage on returns. Formula (2.16) shows that the expected return of a levered portfolio also contains a covariance reduction
term between the random leverage ratio and the excess return. A different return impact comes from trading costs. Summing all factors in a multi-period investment, three
factors matter:
The covariance correction, which is only present in levered portfolios.
The volatility drag, which is present in any multi-period investment.
Transaction costs.


Anderson et al. (2014) consider these three factors in a 60/40 target volatility investment
as follows. There are two assets, US equity and US Treasury bonds. The authors consider monthly returns from Jan 1929 to Dec 2012. The target volatility is set equal to the
fixed 11.59% realized volatility in the observation period. Since volatility is not known ex
ante, the leverage ratio is a random variable. The borrowing for the leverage is done at
the 3m Eurodollar deposit rate, and trading costs are proportional to the traded volume.
The authors find that the magnified source return in equation (2.16), i.e. the leveraged
return without the covariance correction, dominates all other portfolios. This portfolio is
not realizable in practice. The gross return of the source
portfolio - the risk parity portfolio with 60/40 target (gross of trading costs) and 3m
Eurodollar financing (net of trading costs) - is 5.75% in the period. The magnified source
term amounts to 9.72%, which implies that 3.97% is due to the leverage and excess
borrowing return. The total levered arithmetic return is 6.84%, which means that the
covariance correction of -1.84% and the trading costs of -1.04% are subtracted. Finally,
the variance drag is -0.40%, which implies a total geometric levered return of
6.44%. Summarizing, the three effects - transaction costs, covariance correction and
variance drag - reduced the positive leverage return impact of 3.97% by 82% to 0.69%.
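The reported decomposition can be verified with simple arithmetic (figures as reported above):

# Decomposition of the levered 60/40 return (Anderson et al. (2014), as reported).
source_return  = 5.75   # % p.a., unlevered source portfolio
leverage_gain  = 3.97   # magnified source minus source: 9.72 - 5.75
cov_correction = -1.84  # covariance between leverage ratio and excess return
trading_costs  = -1.04
variance_drag  = -0.40

arithmetic = source_return + leverage_gain + cov_correction + trading_costs
geometric = arithmetic + variance_drag
net_leverage_effect = geometric - source_return

print(arithmetic)           # 6.84
print(geometric)            # 6.44
print(net_leverage_effect)  # 0.69: ~82% of the 3.97% leverage gain is lost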

3.4 Short-Term versus Long-Term Investment Horizons

This section draws on Campbell and Viceira (2002).

3.4.1 Questions and Observations

The theoretical setup allows us to discuss the following relevant practical questions and
observations:
Financial planners often recommend that investors with a long investment horizon
take more risk than older investors. Is this always rational advice?
Conservative investors are advised to hold more bonds relative to stocks than
aggressive investors. This contrasts with the constant bond-stock ratio in the tangency
portfolio of the CAPM model. This is called the asset allocation puzzle.
The judgement of risk may differ between long-term and short-term investors. Cash
- risk free in the short term - becomes riskier in the longer term since it must, at
some point, be reinvested, at an uncertain level of real interest rates.

3.4.2 Short-Term versus Long-Term Investments in the Great Financial Crisis (GFC)

Consider an investor with a relative risk aversion of 2, a normal market return of 6% in
stocks, a risk-free rate of 2%, and a volatility of 18%. The investor assumes that returns
are IID, i.e. he is a myopic investor. Then the optimal portfolio formula (3.22) consists
only of the first term: \alpha = \frac{0.06 - 0.02}{2 \cdot 0.18^2} \approx 0.6. Therefore, the investor holds 60% in equities
and the other 40% in a risk-less asset. In the GFC, volatility (both the realized and
the option-implied one) increased to 70%. Then the optimal myopic formula implies
\alpha = 0.04. This means a 4% equity position, or a reduction by 93% from the pre-crisis
investment. But stock markets were not down by 93%, and since the average investor
holds the market, the average investor did not show the same panic as our investor above
does. More importantly, the assumption of IID returns is not helpful if one considers such
volatility jumps. If one allows for non-IID returns, the second term in the optimal
investment formula matters. Then stocks turn out to be a good hedge against their own
stock state variable, see below.
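A two-line check of these numbers (the exact pre-crisis weight is 0.62, rounded to 0.6 in the text):

def myopic_weight(mu, r, sigma, rra):
    """Myopic demand (mu - r) / (rra * sigma^2), the first term of (3.22)."""
    return (mu - r) / (rra * sigma**2)

print(round(myopic_weight(0.06, 0.02, 0.18, 2), 2))   # 0.62, pre-crisis
print(round(myopic_weight(0.06, 0.02, 0.70, 2), 2))   # 0.04, GFC volatility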

3.4.3 Time-Varying Investment Opportunities

When investment opportunities vary, optimal long-term portfolio choice differs from
myopic portfolio choice. Investment opportunities can vary because market factors do so (interest rates, volatility, and risk premia) or because non-market factors vary
(labor income).
We consider the case of time-varying short-term interest rates. The investor with
constant relative risk aversion maximizes his or her consumption path by investing in a
single risky equity asset and the risky short-term rate asset. The model assumes that the
time-varying short-term interest rate shapes the opportunity set. The optimal investment
in the risky asset as given in (3.21) takes the form

\alpha(t) = \frac{\mu_t - r_t}{\sigma_t^2}\, \mathrm{RRA}^{-1} + (1 - \mathrm{RRA}^{-1})\, \frac{\mathrm{cov}(IR_{t+1}, \mathrm{ExpIR})}{\sigma_t^2}   (3.33)

with IR_{t+1} the short-term interest rate at time t+1 and ExpIR the expected future
interest rates. The myopic term is the risk premium of the short-term interest rate and
the second term represents the dynamic intertemporal hedging demand.
If interest rate returns are IID, the covariance with expected future interest rates is zero, and the
optimal strategy is equal to the myopic one. Assume that returns are not IID. If
the investor becomes more risk averse, \mathrm{RRA}^{-1} tends to zero. Therefore a conservative
investor will not invest in the risky asset for its risk premium, but rather will fully hedge
the future risk of the risky asset. Hence, short-term market funds are not a risk-less asset
for a long-term investor. Campbell and Viceira (2002) show that the risk-less asset is in
this case an inflation-indexed perpetuity or consol.
These authors also consider the two-asset innovation case (equity and interest rate)
and they calibrate the model to US data, using data on nominal interest rates, inflation,
and equities. The sample 1y and 10y nominal bond premia, the equity premium, and the
corresponding Sharpe ratios for the period 1952-1999 are given in Table 3.1.

                                    1y      10y
Nominal bond premium                0.4     1.24
Nominal bond standard deviation     1.57    11.22
Bond Sharpe ratio                   0.26    0.11
Equity premium                      7.6     -
Equity standard deviation           16.03   -
Equity Sharpe ratio                 0.48    -

Table 3.1: Yearly premia, standard deviations, and Sharpe ratios for the 1y and 10y
US term structure and US equities (Campbell and Viceira [2002]).
Using these data and the optimal portfolio rule, the optimal portfolio weights can be
calculated from the multi-dimensional generalization of (3.33) for different degrees of risk
aversion. We summarize the results for unconstrained investors:
Investors with a low degree of risk aversion invest leveraged both in equity and in the
indexed bonds. If risk aversion increases, investment in equity reduces faster than
does investment in bonds.
For the most risk-averse investor, optimal investment almost equals bond investment. Cash does not play a significant role - the indexed bonds are the appropriate
safe assets when investment opportunities in interest rates are time varying. Since
money market instruments need to be rolled, they are not risk-less for long-term
investors.
The results are inconsistent with the mutual fund theorem of the Markowitz model:
there, risk aversion only affects the ratio between cash and the risky assets but not
the relative weights of the risky assets, whereas in the Campbell and Viceira model
it affects both.
These results, and the following ones in this section, face the limitation that the
analysis is optimal from an individual investor's viewpoint but the equilibrium is not
considered. Possible equilibrium feedback effects on asset prices and returns are
missing.
We have seen that predictable interest rate returns lead to a hedging demand. The
same holds true for any other asset. Consider equity. If equity is predictable, as
mean-reverting dynamics imply, then there will be an intertemporal hedging demand
for stocks. Campbell and Viceira (2002) extend the model such that long-term investors
face an opportunity set that is time varying due to changing interest rates or changing
equity risk premia. A striking result is that even a conservative investor will hold
stocks when the expected excess return of the stock is negative. This conflicts with
the traditional short-term view that an investor only accepts risk if he or she is compensated for doing so. The intuition is as follows. First, we assume that the covariance
between risky asset returns at two consecutive future dates is negative. This captures
the notion that equity returns are mean-reverting: an unexpectedly high return today
reduces expected returns in the future. This describes how the investment opportunities
related to equity vary over time. If the average expected return is positive, the investor
will typically be long stocks. Given the negative correlation, a high return today means
future returns will be low, and hence the investment opportunity set deteriorates. The
conservative investor wants to hedge this deterioration. Stocks are just such an asset:
they deliver increased wealth when investment opportunities turn poor. Figure 3.4
illustrates, for a conservative investor, three alternative portfolio rules.

Figure 3.4: Portfolio allocation to stocks for a long-term investor, a myopic investor, and
for a CIO choosing the TAA (Campbell and Viceira [2002]).

The horizontal line represents the optimal investment rule if the expected excess
stock return is constant and equal to the unconditional average expected excess stock
return. The TAA is the optimal strategy for an investor who observes, in each period, the
conditional expected stock return. The myopic strategy and the TAA cross at the point at
which the conditional and unconditional returns are the same. The TAA investor is still myopic,
with a one-period horizon. The SAA line represents the optimal investment of
a long-term investor. As stated above, there is a positive demand for stocks even if the
expected return is negative. This reveals that the whole discussion in this section can
be seen as describing the structure of strategic asset allocation (SAA). In fact, formula
(3.21) can be transformed as follows:

\alpha(t) = \text{Short-Term Weight} + \text{Opportunistic Weight}
         = \text{Short-Term Weight} - \text{Long Run Myopic Weight}
           + \text{Long Run Myopic Weight} + \text{Opportunistic Weight}   (3.34)

The long-term investor should hold long-term, inflation-indexed bonds and increase
the average allocation to equities in response to the mean-reverting stock returns (time-varying investment opportunities). Empirical tests suggest that the response to changing
investment opportunities occurs with a higher frequency for stocks than for the interest
rate risk factor. Therefore, this long-term weight or SAA should be periodically reviewed
and the weights should be reset.

3.4.4 Practice of Long-Term Investment

Whether or not investors use long-term investments as described in the last sections
depends on the following constraints, taken from WEF (2011):

1. Liability profile - the degree to which the investor must service short-term obligations, such as upcoming payments to beneficiaries.
2. Investment beliefs - whether the institution believes long-term investing can produce
superior returns.
3. Risk appetite - the ability and willingness of the institution to accept potentially
sizable losses.
4. Decision-making structure - the ability of the investment team and trustees to execute a long-term investment strategy.

Comparing this with the optimal investment formula (3.22): point 3 is captured by
risk aversion, point 2 defines the asset universe selection of the model, and point 1 is part of the
utility function, meaning an asset-liability utility function is used.
Which assets are appropriate for long-term investment? While any asset can be
used for long-term investment, only liquid assets can be used for short-term investments.
Therefore, infrastructure, venture capital, and private equity are typical long-term assets.
The WEF (2011) report then considers the question of who the long-term investors are.
It builds the following five categories: family offices with USD 1.2 trillion AuM, endowments or foundations with USD 1.3 trillion AuM, SWFs with USD 3.1 trillion AuM,
DB pension funds with USD 11 trillion AuM, and insurers with USD 11 trillion AuM.
Matching these different types of investors to the four constraints listed above leads
to the following long-term investment table (source: WEF (2011) and the
many sources cited therein):

Investor            Liability constraint   Risk appetite   Decision   Estimated
Family offices      In perpetuity          High            Low        35%
Endowments          In perpetuity          High            Low        20%
SWFs                In perpetuity          Moderate        Moderate   10%
DB pension funds    D 2-15 yrs             Low             High       9%
Insurers            D 5-15 yrs             Low             High       4%

Table 3.2: Decision represents the decision-making structure, D the average duration,
and Estimated the estimated allocation to illiquid investments (WEF [2011]).

3.4.5 Fallacies

When asset returns are IID, the variance of a cumulative risky return is proportional
to the time horizon, implying that the standard deviation is proportional to the square
root of the time horizon (the square-root rule). Since the Sharpe ratio uses the standard
deviation, the ratio grows with the square root of the time horizon. It is therefore
tempting to increase the investment time horizon to increase the Sharpe ratio. This is a
pseudo risk-return improvement, since Sharpe ratios must always be measured over the
same time interval.
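The scaling is immediate numerically (a sketch under the IID assumption, with made-up annual figures):

import math

mu, r, sigma = 0.06, 0.02, 0.18      # annual figures (assumed)
sr_1y = (mu - r) / sigma             # annual Sharpe ratio, ~0.22

for T in [1, 5, 10, 20]:
    # Over horizon T: excess return scales with T, volatility with sqrt(T).
    sr_T = (mu - r) * T / (sigma * math.sqrt(T))
    print(T, round(sr_T, 2))         # equals sr_1y * sqrt(T)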

3.4.5.1 Equities are Less Risky than Bonds in the Long Run

Siegel states, in his 1994 work (Siegel [1994]):

'It is widely known that stock returns, on average, exceed bonds in the long run. But it
is little known that in the long run, the risks in stocks are less than those found in bonds
or even bills! [...] But as the horizon increases, the range of stock returns narrows far
more quickly than for fixed-income assets [...] Stocks, in contrast to bonds or bills, have
never offered investors a negative real holding period return yield over 20 years or more.
Although it might appear riskier to hold stocks than bonds, precisely the opposite is true:
the safest long-term investment has clearly been stocks, not bonds.'
Siegel measures risk by the standard deviation, and his advice is that long-term
investors should buy and hold equities due to the reduced risk of stock returns at long
maturities. But such a risk reduction only holds if stock returns are mean reverting -
hence, returns are not IID. As the discussion in the last section showed, a long-term
buy-and-hold strategy is then not optimal for an investor; the optimal strategy is a
strategic market timing strategy with a mixture of myopic and hedging demand parts.
If one follows Siegel's advice, the resulting buy-and-hold investment strategy is not optimal.
The other logical direction also holds: an optimal long-term investment strategy does
not produce the portfolio weights suggested by Siegel.

3.4.5.2 Growth Optimal Portfolios

The next example, regarding the growth optimal portfolio (GOP), has led to a great deal
of research, which started in the 1960s and has been paralleled by an intensive debate
(see Christensen [2005]). The GOP is a portfolio that has a maximal expected growth
rate over any time horizon. As a consequence, this portfolio is sure to outperform any
other strategy as the time horizon goes to infinity. The GOP strategy has the following
properties:
The fractions of wealth invested in each asset are independent of the level of total
wealth.
The fraction of wealth invested in asset i is proportional to the return on asset i.
The strategy is myopic - that is to say, it is independent of the time horizon
(Christensen [2005]).
From a classic economic point of view, the GOP strategy follows for investors with
logarithmic preferences. This portfolio outperforms any other portfolio with increasing
probability as the time horizon increases. This fact follows from the statistical properties
of the return process. Strategies which dominate other strategies in the long run are
attractive for long-term investors. This is where the debate began in the 1960s. Besides
some theoretical concerns (see Samuelson [1963]), from a practical point of view the
crucial point is the time it takes for a GOP to dominate any other portfolio with high
probability. Calculations - see Christensen (2005) - show that it takes almost 30
years to beat the risk-free rate with a 90 percent probability, even for a Sharpe ratio of 0.5.
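The order of magnitude of this number can be reproduced with a small sketch. Under
the standard IID lognormal assumption, the GOP's log excess wealth grows at θ²/2 per
year with volatility θ, where θ is the Sharpe ratio, so the probability of beating the
risk-free rate over T years is Φ(θ√T/2); the exact figures in Christensen (2005) may
differ from this stylized calculation.

from scipy.stats import norm

# Solve Phi(theta * sqrt(T) / 2) = 0.90 for T, i.e. T = (2 z_{0.90} / theta)^2.
theta = 0.5                          # Sharpe ratio from the text
z = norm.ppf(0.90)                   # 90% normal quantile, about 1.28
T = (2 * z / theta) ** 2
print(f"T = {T:.1f} years")          # roughly 26 years - 'almost 30 years'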
GOPs are in sharp contradiction to the optimal investment rules (3.21) for long-term
investors. First, GOPs are rationalized by log utility investors. But log utility investors
have a long-term hedging demand of zero. Second, GOPs dominate other investments in
the long run - that is, they are designed for long-term investors. But in general, long-term
investors care about long-term hedging demand.

3.5 Risk Factors

3.5.1 Returns and Risk Factors Sorting

The Fama-French approach revealed on the structural level that one can extract asset
returns by sorting the underlying assets, as is done for the value factor, for example.
All the variance and mean (pricing) information in the 25 size and book/market portfolio
sorts of Fama and French (5 sizes × 5 valuations) can be summed up in the means and
variances of the three factors: the 25 portfolios are just a repackaging of the three
factors. To improve the FF model, one has to do better on the characteristics in the
portfolios than Fama and French did. The FF factors, for example, do not explain the
momentum characteristic, which led Carhart to introduce the non-tautological momentum
factor in the same portfolios. We discuss in the next section that many variables producing
expected excess returns have been published in recent years. Which of these factors are
truly independent, i.e. generate risk premia, and which of them are subsumed by
other variables? This question is sensible since many seemingly different sorts are just
versions of the same economic content, such as the price/dividend ratio, book/market ratio
and price/earnings ratio. The traditional approach of Fama and French is limited in its
applicability. Their approach is to
consider a variable such as book-to-market,
form a number of portfolios based on that variable (say, the 25 portfolios),
make a list of mean returns, betas and alphas of the portfolios,
check whether the returns and betas line up and whether the alphas are small,
if necessary, form coarser portfolio sorts, such as considering the top 10 percent, the
bottom 10 percent, etc., and use these as additional factors.
Today, this type of portfolio formation cannot be extrapolated to other characteristics,
simply because the search for new characteristics is likely to be a multivariate problem:
the univariate traditional forecasters are correlated with each other, such as a single
factor which can, for example, be used to predict bond and stock returns jointly.
See Cochrane (2013), who introduces the concept of a characteristic such that the
univariate sorting mechanism can be generalized to the multivariate case.
So far we have always considered risk factor sorting as the approach to risk premia
construction. But there are others, based on risk taking, such as constructing a
premium on realized-minus-implied volatility or on durations.

3.5.2 Sustainability of Risk Factors

We already mentioned that there is a zoo of risk factors and that there are serious
doubts as to the persistence of these factors. To our knowledge, the most recent and most
comprehensive study of this issue to date is that of Harvey et al. (2015), on whose paper
this section is based. The objective of their paper is to define a statistical framework
suitable for testing the statistical significance of the whole body of academic and
practitioner work explaining the cross section of expected returns. It turns out that the
standard criterion of using a t-ratio greater than 2.0 as a hurdle is no longer adequate.
There are three main reasons for this.
First, given the many papers which attempt to explain the same cross section of expected
returns, statistical inference should not be based on a single-test perspective; an
appropriate multiple-test procedure should be applied. To understand this, assume that
each factor is tested individually with a possible Type I and Type II error.³ If we
perform a multiple test for the same factor, then there are many possible ways to combine
the Type I and Type II errors, and the probability of a Type I error grows with the
number of tests. Therefore the traditional t-ratio of 2 for single tests has to be
increased if multiple tests are used.
Second, there must be a huge number of papers that did not find any significant
explanation for the cross section of expected returns. These papers were never published,
and hence their information content did not enter the traditional statistical setup.
There are two reasons for these non-publications: one does not make an academic career
in finance by publishing non-results, and it is also difficult to publish a replication
of a successful argument. There is a bias toward publishing papers that establish new
factors.
Third, Lewellen et al. (2010) show that the explanatory power of many documented
factors is spurious when cross-sectional R-squared and pricing errors are used to judge
the success of new factors. The three-factor model explains more than 90% of the
time-series variation in the returns of the Fama-French 25 size-B/M portfolios (and 75%
of the cross-sectional variation in their average returns). Any new factor added to this
model which is correlated with size and value but not with the residuals will produce a
large cross-sectional R-squared.
Harvey et al. (2015) apply a multiple testing framework to provide guidance for an
appropriate significance level for risk factors. They use 313 published works and selected
working papers and catalogue 316 (yes, three hundred and sixteen) risk factors. The
authors motivate the multiple testing approach by considering the following example (see
Table 3.3).
Harvey et al. (2015) state:
Panel A assumes 100 published factors (the discoveries denoted as R). We suppose
that 50 are false discoveries and the rest are real ones. In addition, researchers have tried
600 other factors but none of them were found to be significant. Among them, 500 are
truly insignificant but the other 100 are true factors. The total number of tests (M ) is
700. Two types of mistakes are made in this process: 50 factors are falsely discovered to
be true and 100 true factors are buried in unpublished work. Usual statistical control in
a multiple testing context aims at reducing 50 or 50/100, the absolute or proportionate
occurrence of false discoveries, respectively. Of course, we only observe published factors
because factors that are tried and found to be insignificant rarely make it to publication.
That is, all quantities except the total number of tests, a and r are unobserved. This poses
a challenge since the usual statistical techniques only handle the case where all test results
are observable.
³ Type I errors in single tests refer to the probability of finding a factor to be significant while it is not.

Panel A: Example
                       Unpublished   Published   Total
Truly insignificant            500          50     550
Truly significant              100          50     150
Total                          600         100     700

Panel B: Testing framework
                       Accepted (a)   Rejected (r)   Total
H0 True                        N0|a           N0|r      M0
H0 False                       N1|a           N1|r      M1

Table 3.3: Contingency table in testing M hypotheses. Panel A shows a hypothetical
example for factor testing. Panel B presents the corresponding notation in a standard
multiple testing framework. Using 0 (1) to indicate the null is true (false) and a (r)
to indicate acceptance (rejection), we can easily summarize Panel A. For instance, N0|r
measures the number of rejections when the null is true (i.e. the number of false
discoveries) and N1|a measures the number of acceptances when the null is false (i.e.
the number of missed discoveries). In a factor testing exercise, the typical null
hypothesis is that a factor is not significant. (Harvey et al. [2015]).

Example
We discuss why we cannot apply standard significance tests if we face a selection
bias. Consider 10,000 simulation paths of an investment strategy, and choose the
best-performing path. This path will show a very high Sharpe ratio and low risk. But to
test the significance of this strategy we cannot apply a standard test, since we have
chosen the best of all paths: how would we know how it had been arrived at? What about
all the strategies that don't work and that we never hear about?
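A minimal simulation, with assumed numbers (120 months of pure-noise monthly returns),
illustrates the point: the best of 10,000 skill-free strategies looks highly significant
to a naive single test.

import numpy as np

# 10,000 strategies with true mean return zero; pick the best in-sample path.
rng = np.random.default_rng(0)
n_paths, n_months = 10_000, 120
returns = rng.normal(0.0, 0.05, size=(n_paths, n_months))

sharpes = returns.mean(axis=1) / returns.std(axis=1) * np.sqrt(12)  # annualized
best = sharpes.max()
t_naive = best * np.sqrt(n_months / 12)   # t-ratio of the best strategy
print(f"best Sharpe = {best:.2f}, naive t-ratio = {t_naive:.2f}")   # far above 2

The winner's t-ratio typically comes out close to 4, although every strategy is pure
noise - exactly the selection bias that a multiple testing framework must correct for.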
In multiple testing the goal is to control for type I and type II errors. In a multiple
testing framework, restricting each individual test's type I error rate at alpha is not
enough to control the overall probability of false discoveries. One needs a measurement
of the type I error that simultaneously evaluates the outcomes of many individual tests
- tests for joint occurrences are needed. The type I error in multiple hypotheses is
related to false discoveries - concluding that a factor is significant when it is not.
Therefore, plausible definitions of the type I error should take into account the joint
occurrence of false discoveries. In the above example - see Table 3.3 - N0|r counts
false discoveries and N1|a counts missed discoveries.
We discuss the false discovery proportion (FDP) and the false discovery rate (FDR),
which take into account joint occurrences. These statistical tests are used in fields as
diverse as computational biology and astronomy. In effect, the method is designed to
simultaneously avoid false positives and false negatives - in other words, conclusions
that something is statistically significant when it is entirely random, and the reverse
(Hulbert [2008]).
The FDP is the proportion of type I errors, defined by

FDP = N0|r / R (the fraction of false discoveries) if R > 0, and FDP = 0 if R = 0.  (3.35)

The FDR measures the expected proportion of false discoveries among all discoveries -
that is to say, FDR = E[FDP]. In the above example, FDP = 50/100 = 50%, which is a
high rate.
Type II errors - the mistake of missing true factors - are also important in multiple
hypothesis testing. In analogy with the type I measures, the absolute number of missed
discoveries N1|a and the proportion N1|a/(M − R) of missed discoveries among the M − R
acceptances are used to measure severity.
As with single tests, one cannot simultaneously minimize type I and type II errors: a
decrease in one type increases the error of the other. To find a balance between the
two types, one specifies a significance level for the type I error rate and derives
testing procedures that aim to minimize the type II error rate. FDR control offers a way
to increase the power while maintaining a set bound on the error. The idea of the control
is based on the assessment that 4 false discoveries out of 10 rejected null hypotheses is
a more serious error than 20 false discoveries out of 100 rejected null hypotheses.
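A minimal sketch of the Benjamini-Hochberg step behind such FDR control (the 'BH' part
of the BHY method referred to below) may help: reject the k smallest p-values, where k
is the largest index with p_(k) ≤ (k/M)q for a target FDR level q. The p-values below
are illustrative numbers, not values from the study.

import numpy as np

def benjamini_hochberg(p_values, q=0.05):
    # Returns the p-value cutoff; hypotheses with p <= cutoff are rejected.
    p = np.sort(np.asarray(p_values))
    m = len(p)
    ok = np.nonzero(p <= (np.arange(1, m + 1) / m) * q)[0]
    return p[ok[-1]] if ok.size else 0.0

p_vals = [0.0001, 0.0004, 0.02, 0.03, 0.2, 0.5, 0.8]
print(benjamini_hochberg(p_vals))   # 0.02: the three smallest are discoveries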
The statistics literature has developed many methods for controlling the FDR; we refer
to Harvey et al. (2015) for details about the following three methods for transforming
t-ratios into p-values: Bonferroni; Holm; and Benjamini, Hochberg, and Yekutieli (BHY).
The authors derive the following results. Between 1980 and 1991, only one factor was
discovered per year, growing to around five factors per year in the period 1991-2003. In
the last nine years, the annual factor discovery rate has increased sharply to around 18:
164 factors were discovered in the last nine years, almost doubling the 84 factors
discovered in all previous years. They calculate t-ratios for each of the 316 factors
discovered, including those in working papers. The vast majority of t-ratios exceed the
1.96 benchmark, and the non-significant factors typically belong to papers that propose a
number of factors.
The authors then apply their method first to the case in which all tests of factor
cross-section returns are published. This false assumption defines a lower bound for
the true t-ratio benchmark. They obtain three benchmark t-ratios, two of which are
described by:
Factor-related sorting results in cross-sectional return patterns that are not explained
by standard risk factors. The t-ratio for the intercept of the long/short strategy
returns regressed on common risk factors is usually reported.
Factor loadings as explanatory variables: they are related to the cross section of
expected returns after controlling for standard risk factors. Individual stocks or
stylized portfolios (e.g., the FF 25 portfolios) are used as dependent variables. The
t-ratio for the factor risk premium is taken as the t-ratio for the factor. (Harvey et
al. [2015])
They then transform the calculated t-ratios into p-values for all three methods. These
p-values are then transformed back into t-ratios, assuming that the standard normal
distribution accurately approximates the t-distribution; see Figure 3.5.
Figure 3.5 presents the benchmark t-ratios for the three methods. Using Bonferroni, the
benchmark t-ratio starts at 1.96, increases to 3.78 by 2012, and will reach 4.00 in 2032.
The corresponding p-value for 3.78 is, for example, 0.02 percent, which is much lower
than the starting level of 5 percent. Since Bonferroni detects fewer discoveries than
Holm, the t-ratios of the latter are lower. BHY t-ratio benchmarks are not monotonic but
fluctuate before the year 2000 and stabilize at 3.39 after 2010.
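The Bonferroni numbers can be reproduced directly: with M tests and an overall two-sided
significance level of 5 percent, the per-test cutoff is α/M, which maps into a benchmark
t-ratio via the normal quantile. A sketch:

from scipy.stats import norm

alpha = 0.05
for M in [1, 100, 316]:                       # number of factor tests
    t_bench = norm.ppf(1 - alpha / (2 * M))   # two-sided Bonferroni benchmark
    print(f"M = {M:3d}: t-ratio benchmark = {t_bench:.2f}")
# M = 1 gives 1.96; M = 316 gives roughly 3.8, in line with the text.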
Figure 3.5 also shows the t-ratios of a few prominent factors - the main result in this
section:
Result 3.5.1. Book-to-market, momentum, durable consumption goods, short-run volatility
and market beta are significant across all types of t-ratio adjustments; consumption
volatility, the earnings-price ratio and liquidity are sometimes significant; and the
rest are never significant.
The authors extend the analysis by testing, for example, for robustness and allowing for
correlation between the factors. The above results do not change notably. The analysis
suggests that a newly discovered factor today should have a t-ratio that exceeds 3.0,
which corresponds to a p-value of 0.27 percent. The authors argue that the value of
3.0 should not be applied uniformly: for factors derived from first principles, the value
should be lower.
Harvey et al. (2015) conclude that many of the factors discovered in the field of finance
are likely false discoveries: of the 296 published significant factors, 158 would be
considered false discoveries under Bonferroni, 142 under Holm, 132 under BHY (1%) and 80
under BHY (5%). In addition, the idea that there are so many factors is inconsistent with
principal component analysis, where perhaps five statistical common factors drive the
time-series variation in equity returns (Ahn, Horenstein and Wang (2012)).

3.6 Optimal Investment - The Herding of Pension Funds

Pension funds consider, by their very definition, an infinite time horizon in their
investments, since each year there are new entrants to the pension scheme. As long-term
investors, one would expect pension funds to focus on their long-term investment
strategies. They should therefore behave differently than typical short-term asset-only
managers. But there is a different investment motivation, which may counteract long-term
investment behavior: the fear of underperforming relative to their peer group, which
defines such funds' incentive to herd.


Figure 3.5: The green solid curve shows the historical cumulative number of factors
discovered, excluding those from working papers. Forecasts (dotted green line) are based
on a linear extrapolation. The dark crosses mark selected factors proposed by the
literature. They are MRT (market beta; Fama and MacBeth [1973]), EP (earnings-price
ratio; Basu [1983]), SMB and HML (size and book-to-market; Fama and French [1992]),
MOM (momentum; Carhart [1997]), LIQ (liquidity; Pastor and Stambaugh [2003]), DEF
(default likelihood; Vassalou and Xing [2004]), IVOL (idiosyncratic volatility; Ang,
Hodrick, Xing, and Zhang [2006]), DCG (durable consumption goods; Yogo [2006]), SRV
and LRV (short-run and long-run volatility; Adrian and Rosenberg [2008]), and CVOL
(consumption volatility; Boguth and Kuehn [2012]). T-ratios over 4.9 are truncated at
4.9 (Harvey et al. [2015]).

Such herding may be stronger for institutional investors than for private investors.
First, there is more trade transparency between institutional investors than between
private investors. Second, the trading signals that reach institutional investors are
more correlated and hence increase the likelihood of eliciting similar reactions.
Finally, because of the size of the investments, institutional herding is more likely to
result in stronger price impacts than is the herding of private investors. Therefore, for
an institutional investor, adopting a position outside the herd will have a stronger
return impact than the same position adopted by private clients.
Blake et al. (2015) study the investment behavior of pension funds in the UK, analyzing
- on an asset-class level - to what extent herding occurs. Their data set covers UK
private-sector and public-sector defined-benefit (DB) pension funds' monthly asset
allocations over the past 25 years. They present information on the funds' total
portfolios and asset-class holdings, and are also able to decompose changes in portfolio
weights into valuation effects and flow effects.
These authors find robust evidence of reputational herding in subgroups of pension
funds: similar pension funds follow each other. Public-sector funds, for example, follow
other public-sector funds of a similar size. This follows from a positive relationship
between the cross-sectional variation in pension funds' net asset demands in a given
month and their net demands in the preceding month. A second result is that pension funds
seem to use strong short-term portfolio rebalancing: funds rebalance their long-term
portfolios such that they match their liabilities. Since the maturity of pension fund
liabilities has increased, pension funds have systematically switched from UK equities to
conventional and index-linked bonds.
The authors also find that pension funds mechanically rebalance their short-term
portfolios if restrictions in their mandates are breached. They therefore, on average,
buy in falling markets on a monthly basis and sell in rising markets. This is suboptimal
given the optimal investment rule (3.16). Therefore, pension funds' investments fail to
move asset prices toward their fundamental values, and hence do not stabilize financial
markets. The market exposure of the average pension fund and the peer-group benchmark
returns match very closely the returns on the relevant external asset-class market index.
This is evidence that pension fund managers herd around the average fund manager: they
could simply invest in the index without paying any investment fees.
As a final result, the pension funds studied failed to capture a positive liquidity
premium, contrary to the expectation that these long-term investors should be able to
provide liquidity to the markets and earn a risk premium in return.

3.7 Alternatives to Rational Models - Behavioral Approaches

Behavioral economics, which connects economics, psychology and other social sciences,
began in the 1980s, but failed to attract public attention until the 1990s (see the
surveys of the behavioral finance literature in Baker and Wurgler [2011], Barberis
[2003], Shefrin [2008], Shiller [2003], and Shleifer [2000]). This section is short. It
does not, and cannot, do justice to the importance of behavioral finance, merely
reflecting, in general terms, the lack of knowledge of the author. The interested reader
is strongly encouraged to study the indicated literature.
Keynes already considered speculative markets in his 1936 book The General Theory of
Employment, Interest and Money. His view is known today as Keynes's beauty contest
theory of the stock market.
Consider the following hypothetical contest advertised in a newspaper. Each reader can
submit, from a sample of 100 photos of pretty faces, a list of the six he finds the
prettiest. The winner will be the one whose list most closely corresponds to the most
popular faces among all the lists of six that readers send in. To win this contest, a
rational person picks the six faces that all others will think the prettiest, or better,
that the others will think that the others will think the prettiest, and so on.
A key Keynesian idea is that the valuation of long-term speculative assets is a matter
of convention. Whatever price people accept as the conventional value, and that is
embedded in the collective consciousness, will be the true value for a long time, even
if returns fail to be in line with expectations for some time.
There are many theoretical models of speculative markets, similar to Keynes's beauty
contest theory, which stress the expectation of selling to other people with optimistic
beliefs. There are also various models representing bubbles, noise-trader behavior, or
alternatives to rational, expected-utility-maximizing agents. The prospect theory of
Kahneman and Tversky (1979) is a well-known example.
Psychology identifies many patterns of human behavior which are relevant for evaluating
the EMH. There is, for example, evidence that the human tendency towards overconfidence
causes investors to trade too frequently (Odean [2000]), that CEOs allocate internal
resources inefficiently to pet projects (Malmendier and Tate [2005]), that investors are
oversensitive to news stories (Barber and Odean [2008]) and that they overreact to cash
dividends (Shefrin and Statman [1984]).
Richard Roll responded, in 1992, to Robert Shiller, who had stressed the importance of
inefficiencies in the pricing of stocks:
I have personally tried to invest money, my clients' money and my own, in every single
anomaly and predictive device that academics have dreamed up. [...] I have attempted
to exploit the so-called year-end anomalies and a whole variety of strategies supposedly
documented by academic research. And I have yet to make a nickel on any of these
supposed market inefficiencies [...] a true market inefficiency ought to be an exploitable
opportunity. If there's nothing investors can exploit in a systematic way, time in and
time out, then it's very hard to say that information is not being properly incorporated
into stock prices.

3.8 Real-Estate Risk

The market for real estate is larger in valuation than the entire stock market. In the
US, the value of real estate owned by households and non-profit organizations in 2013 was
USD 21.6 trillion, while corporate equity shares had a market value of only USD 20.3
trillion. In Switzerland, the value of real estate in 2014 was about 4 to 5 times larger
than the value of all companies listed on the SIX exchange. Turnover on the stock market
is around 100 percent per year; the equivalent liquidity in the real-estate market is
approximately 5 percent per year.
Despite the size of the real-estate market, various initiatives to significantly increase
the liquidity of the market's risk premium have so far failed. This should not be
confused with the liquidity of firms in the real-estate sector - the real-estate market's
risk premium reflects true real-estate risk. Summarizing, the liquidity of an asset class
requires that the assets themselves are liquid. We consider the canton of Zurich as an
example of the liquidity of the market in 2011.
Number of houses in the canton       690,000
Of which property                    210,000 (30%)
New constructions in 2011             11,000 (1.6%)
Of which property                      4,300 (40%)
Arm's-length transactions              7,110
Resales                                3,700 (1.7%)

Table 3.4: Liquidity for the canton of Zurich.


This indicates that for Switzerland, where the median holding period of private persons'
homes is 25 years, the construction of a repeated sales index is not possible. The
liquidity of repeated sales is at 1.7%, whereas the liquidity of the SPI stock index
varies between 80% and 120% per year.

3.8.1 US Market: Repeated Sales Index versus Constant Quality Index

Parallel to this illiquidity, there is also much less interest from academics in working
and publishing on real estate as compared to the equity or fixed-income markets. There
might be three reasons for this. First, the data required for designing empirical
analyses are much less readily available for real estate. Second, the job market for
academics may lead many to prefer to work in areas where many other researchers are
contributing and where funds are more readily available. Third, the lack of good-quality
and liquid home price indices poses problems for empirical work. Case and Shiller (1994)
tested the efficiency of the US market for single-family homes. Since the resale of
houses can occur over time periods of decades, the usual tests that work for equity could
not be applied. The available home price indices had serious problems: such indices often
appeared to jump around erratically, or strong seasonalities were present in the data.

Other indices, such as the Halifax indices in the UK, are not based on repeated house
sales but correct for different quality factors. That is, the price of a house is a
weighted sum of factors (square feet of floor space, view, shopping facilities, number
of bedrooms, location, etc.) where each factor is priced. Such indices are called hedonic
indices and contain between 20 and 30 different factors. At the time of the EMH test,
the quarterly published Constant Quality Index produced by the US Census Bureau was
based on where homes had been built. To offer an alternative, Case and Shiller
constructed the repeat sales home price index (Case and Shiller [1987, 1989, 1990]).

Figure 3.6: Two indices of US home prices divided by the Consumer Price Index (CPI-U),
both scaled to 1987 = 100. Monthly observations in the period 1987-2013 are considered
(Shiller [2014]).
Figure 3.6 shows the two indices. Both indices are typically very smooth over time - for
real-estate risk, price momentum dominates the volatility of prices. Furthermore, the
huge boom in home prices after 2000 is visible in the Case-Shiller index but not in the
Census Constant Quality Index. Why is there this difference? New homes are built where
it is possible and profitable to build them. This is often not in the expensive areas of
a city but outside. Therefore, the constant quality index level through time is more
accurately determined by simple construction costs in a country like the US, where there
is a huge reservoir of cheap land. The data of the Case-Shiller index show strong
predictability. Basically, real-estate prices are mostly driven by a drift, whereas
short-term volatility plays almost no role. Therefore, volatility strategies are not
useful for investors in real estate. They should only go long or short and try to
identify the turning points where the drift changes sign.


The inefficiency leading to predictability must be related to market conditions;
changing the market conditions should improve efficiency. One source of inefficiency is
high trading costs - trading in and out of real-estate markets is much more costly than
in stock markets. Furthermore, it is almost impossible to short-sell overpriced homes.
Finally, there are other frictions, such as high carrying costs, low rental income, the
moral hazard of renters and the difficulty of keeping up with all the local factors that
might change the demand for individual houses.

3.8.2 Constant Quality Index: Greater London and Zurich Area

Figure 3.7 shows the evolution of house prices in the Greater London and Zurich areas.
Both series are measured using a hedonic model - the Halifax and the ZWEX indices,
respectively. The ZWEX, for example, is based on more than 20,000 arm's-length
transactions, which include condominiums and single-family houses in the canton of
Zurich.

Figure 3.7: Left Panel: The Halifax Greater London price index and the Zurich price
index (ZWEX) (ZKB and Lloyds Banking Group). Right Panel: Halifax Greater London
price index and forwards on the index (Syz and Vanini (2008)).
Figure 3.7, left panel, shows that in the mid-1990s house prices in Zurich and London
started to grow at different rates. The explosion in the London area is in line with the
rise of London as the world's major financial center. The Zurich index grows at a much
lower rate, but - contrary to the Halifax index - the GFC was not observable in Zurich
house prices, while London prices dropped sharply, only to rebound in the same manner.
The right panel in Figure 3.7 illustrates the behavior of forwards on the Halifax index
at different time periods and the realization of the index after the GFC. The forwards
in May 2007 still forecasted an increasing value of the house price index; market
participants evidently had no power to identify this turning point of the momentum.
During the downturn in the GFC, forward levels of the index were sharply corrected
downwards from month to month. The culmination point came in October 2008, when the
forward levels predicted too low a future value but identified the turning point of the
index almost perfectly. Summarizing, the market failed to foresee the fall of the index
at the beginning of the GFC but did pretty well in predicting the future price increases
at the end of the GFC.
The EMH requires that markets are free of frictions. In housing markets there are many
sources of friction, which is one argument used to explain why house prices are so
predictable. Figure 3.8 shows friction sources for different types of real-estate
investments in Switzerland. Direct means that investors buy houses, indirect means
investing in stocks that are related to housing, and derivative refers to the synthetic
wrapping of the risk premia into indices such as the IPD, Case-Shiller, or ZWEX index.
Comparing this list, one might wonder why it is so difficult to develop liquid, synthetic
real-estate asset markets given the many frictions direct investments face. Real-estate
markets remain wildly inefficient all over the world. To achieve improvements in
efficiency it is most helpful to understand the causes of market inefficiency.

3.8.3 Investment

Figure 3.9 provides an overview of investments and consumption in the real estate asset
class.
The derivative market for real estate has had difficulty taking off, not only in the US
but also in the UK and other countries. We consider as an example the case of derivatives
on the ZWEX. In 2006, the idea was to launch simple warrants - calls and puts on the
ZWEX - both to allow homeowners to protect their capital against falling future house
prices and to let investors take a direct view on real-estate market risk. Assume that a
homeowner has a protection motivation. The homeowner, for example, not only bought a
traditional 5-year fixed-rate mortgage but additionally a put option on the ZWEX. The
combination is called by Salvi et al. (2008) an Index Mortgage ZWEX. The rationale of
the put option is to finance possible forced amortizations at maturity of the mortgage.
Such forced amortizations occur if house prices fall sharply during the lifetime of the
mortgage such that the homeowner crosses the hypothecation level.


Figure 3.8: Frictions for investment in real-estate markets in Switzerland. Lex Koller
is a federal law which restricts the purchase of property by foreigners (Syz and Vanini
[2008]).

Figure 3.9: Different uses of the real-estate asset class (Zürcher Kantonalbank (2015)).


Consider, for example, a house price of CHF 1 million and a maximum hypothecation level
of 80%, which is standard in Switzerland. This means that the homeowner has to inject
CHF 200,000 when he purchases the house. Suppose that house prices fall such that after
five years the house is estimated to be worth only CHF 800,000. Then, 80% of this value
is CHF 640,000. Therefore, the homeowner is forced to amortize CHF 160,000 at the end of
five years. This is a large amount, and the rationale of the put option is exactly to
finance this amount. How effective is such a hedge? Figure 3.10 shows the effectiveness
of the hedge for three different real-estate price evolutions. It follows that the put
option protects the equity of the homeowner.
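The arithmetic of the example, together with a stylized index-linked put payoff (the
payoff scaling below is an assumption for illustration, not the contract specification
of Salvi et al.), is:

# Forced amortization after a 20% fall in the house price.
house_0  = 1_000_000                 # purchase price (CHF)
ltv_max  = 0.80                      # maximum hypothecation level
mortgage = house_0 * ltv_max         # CHF 800,000 loan, CHF 200,000 equity

house_T  = 800_000                   # estimated value after five years
amortization = mortgage - house_T * ltv_max       # CHF 160,000

# Stylized put on the ZWEX struck at the initial level with notional
# ltv_max * house_0, assuming the house tracks the index one-for-one:
index_decline = (house_0 - house_T) / house_0     # 20%
put_payoff = ltv_max * house_0 * index_decline    # CHF 160,000, covers the gap
print(amortization, put_payoff)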

Figure 3.10: Effectiveness of the put option hedge for a 5-year mortgage under three
different real-estate price scenarios (Syz and Vanini (2008)).
The scenarios show that the put option price is 50 bps per annum. Given the low interest
rate environment, the price for this protection should be acceptable to many homeowners.
The facts about the product's success are different - it was withdrawn from the market
since there was no demand.

3.9 Relative Pricing - No Arbitrage

3.9.1 Main Idea

In absolute pricing models using the SDF, one can in principle price any asset. If one
only wants to value an asset relative to another asset, the following relative pricing
method can be used, for which much less information about investor behavior is needed.
The relative pricing approach uses the prices of other assets to price a focus asset.
The purest relative pricing approach is arbitrage pricing. When it works, it ends
discussions over what the true risk factors are, the market price of risk, and so on.
Every investor who trades at a different price will be exploited by all other investors.
The only assumption about an investor's preferences is that he or she prefers more money
to less - hence consumption does not enter into the approach.
The only general assumption is that there exists some discount factor that generates
the price of the focus asset (say, an option) and the basis assets (a stock or bond). The
existence of a discount factor requires that the Law of One Price or the no arbitrage condition holds true. The Law of One Price states that two portfolios that have the same
payoff at a future date in all possible states must have the same price today. No arbitrage
states that in a market with risk, there is always a positive probability of earning more
than a risk-free investment, and a positive probability of earning less. If arbitrage is
possible, money machines are possible and, hence, there is no sense in assuming financial
markets that systematically allow for arbitrage opportunities. We note that the absence
of arbitrage is a necessary condition for a financial equilibrium to exist in the absolute
pricing model.
As an example, consider two banks which value USD 1 tomorrow at 80 cents and at 90
cents, respectively. This violates the Law of One Price, and an arbitrage opportunity is
simply to borrow as much as possible at the low price and invest it at the other bank for
the higher price. As a second example, consider a stock with a price of USD 100 today.
At a future date, the stock can be worth USD 120 or USD 110. Assume that a risk-free
asset pays, over the same period, a 5 percent interest rate. This market is not free from
arbitrage, since borrowing as much as possible using the risk-free asset and investing in
the risky one always leads to a certain gain in an environment with risk. Such an
arbitrage opportunity is not sustainable: market participants exploit it and thus make it
vanish. Consider the following minimal model - the world consists of a stock S, a
risk-free asset B, and a call option C. There is only one time period, with two dates 0
(today) and T (tomorrow). Figure 3.11 shows the payoffs of the model.
What is the price of the call option at time 0? The reader, as the buyer of the call,
should guess the price of the call.
Your price guess:
No arbitrage states that there is only one price that is fair in the sense that any other
price allows for arbitrary risk-free gains in this risky environment. To show this, the
reader should compare his or her guess with 13.64 and perform the following calculation.
If the guess is larger than 13.64, invest 50 in the risky asset and short 36.36 of the
risk-free asset. The value of the portfolio at time zero is then 13.64. This portfolio is
a perfect hedge since it replicates all possible payoffs at time T exactly. Therefore,
the surplus (your guess minus 13.64) is superfluous for hedging the call and represents a
risk-free gain to the issuer in a risky environment. If the guess is smaller than 13.64,
the issuer will buy the call from you and sell it at the fair price of 13.64 in the
market. The difference is again a risk-free profit for the issuer.

Figure 3.11: Payoffs of the financial market with a risky asset, S, a risk-free asset,
B, and a call option with a strike of 100, C.

Summarizing, any price different from 13.64 represents a price that is either too high
from the buyer's perspective (the issuer needs less to generate the liability value of
the call at T) or too low, such that the issuer becomes the buyer of the option and sells
the option at the fair price in the market.
How is the fair price of 13.64 derived? One approach is to consider the buyer and the
seller in the trade. The seller of the option wants to set up a hedge portfolio
consisting of S and B - there are no other assets in this world - at time 0 such that,
whatever the future state is, the portfolio value will not be lower than the option
payoff. The buyer, on the other hand, wants to pay at time 0 an amount such that, if the
seller uses this amount to buy the hedge portfolio, the value of this portfolio at time T
is not strictly larger than the option payoff. The condition that satisfies both agents'
needs is that the hedge portfolio equals the option payoff at time T in all possible
states. This defines a linear system of two equations. Solving this system and using the
fact that if two assets have the same value tomorrow, then they must have the same price
today (no arbitrage), the price 13.64 follows. This defines the replication approach to
pricing options.

The calculation of the price shows that the probabilities of the risky asset moving up
or down are irrelevant for option pricing - no beliefs of the investor are needed. But
the view on whether Google's stock will double in the coming years has an impact on
Google's stock price via the fundamental pricing equation. Although no arbitrage seems
to be unrelated to absolute pricing in equilibrium, there is in fact a strong
relationship: the absence of arbitrage is necessary for an equilibrium to exist. If money
machines exist, an economy cannot be in an equilibrium where markets clear.

Example
The example is from Papanikolaeou (2014). Consider an economy with two factors -
inflation and an interest rate. The factors can only be in one of two states - high or
low - and we know exactly how four securities, A, B, C, and D, will perform in these
states. The current price of each security is USD 100. The following table summarizes
the state returns together with the expected returns and standard deviations.

                          High interest rates             Low interest rates
State / stock         High inflation  Low inflation   High inflation  Low inflation
Interest rate                    5%             5%              0%             0%
Inflation rate                  10%             0%             10%             0%
Probability of state           0.25           0.25            0.25           0.25
Return A                        -20             20              40             60
Return B                          0             70              30            -20
Return C                         90            -20             -10             70
Return D                         15             23              15             36

        Expected return [%]   Standard deviation [%]
A                     25                    28.58
B                     20                    33.91
C                     32.5                  48.15
D                     22.25                  8.58

Table 3.5: Description of the economy (Papanikolaeou [2014]).

We consider the return of an equally weighted (EW) portfolio of A, B, and C, and compare
it with the return of D; see Table 3.6. It follows that the return of the EW portfolio
dominates D in all states. Hence, there is an arbitrage opportunity.

                             High interest rates             Low interest rates
State / stock            High inflation  Low inflation   High inflation  Low inflation
Equally weighted A, B, C          23.33          23.33              20          36.67
D                                    15             23              15             36

Table 3.6: Arbitrage opportunity (Papanikolaeou [2014]).
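The dominance is easy to verify numerically from the state returns in Table 3.5:

import numpy as np

# Rows A, B, C, D; columns: (hi,hi), (hi,lo), (lo,hi), (lo,lo) states.
returns = np.array([
    [-20, 20, 40, 60],    # A
    [  0, 70, 30, -20],   # B
    [ 90, -20, -10, 70],  # C
    [ 15, 23, 15, 36],    # D
], dtype=float)

ew = returns[:3].mean(axis=0)        # equally weighted A, B, C
print(ew)                            # [23.33 23.33 20. 36.67]
print(ew - returns[3])               # positive in every state

Since all securities cost USD 100, buying the EW portfolio and shorting D costs nothing
today and pays a strictly positive amount in every state - an arbitrage.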

3.9.2 Theory

The insights of the above simple one-period model carry over to more realistic models.
The most famous is the Black-Scholes model, where derivative pricing takes place in
continuous time. The economic logic is the same as above. Using no arbitrage, a unique
price for the option follows from the replication approach: the payoff of the option must
be the same as the payoff of the hedge portfolio at any future date. To avoid arbitrage,
the price of the option and the hedge must also agree at time zero, which fixes the
option premium.
This approach to pricing options is equivalent to so-called risk-neutral pricing. If the
market is free of arbitrage, the price of any option C at time t is given by the
following present value:

C_t = E^Q(D(t,T) X_T)     (3.36)

where X_T is the payoff of the asset at maturity and D(t,T) is the (stochastic) discount
factor. As in the absolute pricing equation (3.1), the price in (3.36) equals an expected
payoff. The risk-neutral probability Q is not the empirical probability; it is the
probability that makes discounted risky asset prices martingales. The existence of such a
probability is equivalent to the absence of arbitrage opportunities. This is the content
of the Fundamental Theorem of Finance. Therefore, constructing Q and calculating (3.36)
means that option prices are free of arbitrage.
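For the one-period market above, Q is found by making the discounted stock price a
martingale; the tree parameters are the same assumption as in the replication sketch,
and the empirical up/down probabilities never enter:

S0, S_up, S_down, r, K = 100.0, 120.0, 80.0, 0.10, 100.0
q = (S0 * (1 + r) - S_down) / (S_up - S_down)   # risk-neutral up-probability
C0 = (q * max(S_up - K, 0) + (1 - q) * max(S_down - K, 0)) / (1 + r)
print(f"q = {q:.2f}, price = {C0:.2f}")          # q = 0.75, price = 13.64

Both routes - replication and the risk-neutral expectation (3.36) - give the same price
of 13.64, as they must.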
In the Black-Scholes world, markets are complete, or - in other words - the probability
Q is unique and follows from the parameters within the model. Arbitrage is then
sufficient to price all options in this setup.
But often markets are incomplete. In the replication language, there are, for example,
underlyings that are not tradable but need to be approximated by other assets. Then
there is no perfect hedge, or, in the risk-neutral view, Q is no longer unique. The
unique price of complete markets is then replaced by an interval, where all prices in
the interval are arbitrage free. How, then, is a unique price fixed? Another criterion
is needed to fix the price among all arbitrage-free prices. But this adds preferences to
option pricing; this is the point where absolute pricing enters relative pricing.
Arbitrage pricing is technically challenging, both in complete and in incomplete
markets. But such a mathematical challenge is simpler to handle than the challenges in
absolute pricing, where macroeconomics, behavior and introspection all affect the
discount function.

3.9.3 CAPM and No Arbitrage

We start with the return R_k of a risky asset k, which is assumed to be driven by a
single source of market risk F - that is to say,

R_k = a_k + b_k F,

where E(F) = 0. We construct a portfolio of two assets, with a proportion θ invested in
asset i and 1 − θ in asset j. The portfolio return

R_p = θ(a_i + b_i F) + (1 − θ)(a_j + b_j F)

becomes risk free (the random component in F is zero) if we choose the weight

θ = b_j / (b_j − b_i).

Then the absence of arbitrage requires that the portfolio return equals the risk-free
return. Equating these two returns, after some manipulations we get:

(a_j − R_f) / b_j = (a_i − R_f) / b_i =: λ.

The equality of the two ratios for arbitrary assets means that the ratios have to be
equal to a constant value λ - the expected excess return per unit of risk must be equal
for all assets. The same analysis holds if we start with the CAPM equation.
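A numerical check with hypothetical intercepts and loadings (chosen here so that the
market is arbitrage free):

a_i, b_i = 0.08, 1.2                 # assumed asset i
a_j, b_j = 0.06, 0.8                 # assumed asset j
Rf = 0.02

theta = b_j / (b_j - b_i)            # weight that kills the factor exposure
exposure = theta * b_i + (1 - theta) * b_j       # = 0, portfolio is riskless
Rp = theta * a_i + (1 - theta) * a_j             # equals Rf here

lam_i = (a_i - Rf) / b_i             # 0.05
lam_j = (a_j - Rf) / b_j             # 0.05: same price of factor risk
print(exposure, Rp, lam_i, lam_j)

If lam_i differed from lam_j, the zero-exposure portfolio would return more (or less)
than the risk-free rate, which is an arbitrage.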

3.9.4 Arbitrage Pricing Theory (APT)

Ross's (1976b) arbitrage pricing theory (APT) is a mixture of an absolute and a relative
pricing problem.
APT focuses on the major forces that move aggregates of assets in large portfolios and
not on the idiosyncratic risks. It is based on the assumption that a few major
macroeconomic factors influence security returns. The influence of these factors cannot
be diversified away, and therefore investors price these factors. For example, most
mutual funds' returns can be approximated quite well once we know the fund's style in
terms of value, market, size, and a few industry groupings.
The general assumption of APT is that the number of assets N is much larger than the
number of factors. APT postulates the existence of an empirical factor structure in
returns and the existence of many assets. But APT assumes neither an economic
equilibrium nor the existence of risk factors driving the opportunity set for
investments. CAPM and ICAPM both represent the SDF in terms of an affine combination of
factors, whereas APT decomposes returns into factors. CAPM explains the risk premia;
APT leaves the risk premia unspecified. Unlike the CAPM, APT does not assume that all
investors have the same preferences and that the tangency portfolio is the only risky
asset that will be held.


The idea of the APT factor model is that common exposure to systematic risk sources
causes asset returns to be correlated. The risk of each asset is assumed to consist of a
systematic component and an idiosyncratic one, where the idiosyncratic risks are assumed
to be uncorrelated across assets. In a large and diversified portfolio the idiosyncratic
risk contributions should be negligible due to the law of large numbers - investors
holding such a portfolio would require compensation (a risk premium) only for the
systematic part. Therefore an assumption about asset correlation implies a conclusion
about asset pricing. Specifically, the assumptions underlying the APT are:
security returns can be described by a linear factor model;
there are sufficiently many securities available to diversify away any idiosyncratic
risk;
arbitrage opportunities do not exist.
Assume that there are k factors F_k with a non-singular covariance matrix C_F, and
consider N returns R_i. Projecting the returns orthogonally on the set generated by the
factors plus a constant, we can write for the returns:

R_i = E(R_i) + cov(F, R_i)' C_F^{-1} F̃ + ε_i     (3.37)

where F̃_k = F_k − E(F_k) is the centered random value of factor k and the idiosyncratic
risks ε_i satisfy E(ε_j) = cov(F_k, ε_j) = 0; the residuals are assumed to be
uncorrelated across the assets (E(ε_j ε_k) = 0 for all different indices j and k). The
second term in (3.37) is the systematic part and the third term is the residual part.
The restriction that the residuals should be uncorrelated across assets implies the
following decomposition of the covariance matrix (the same as in (2.95)):

C = β' C_F β + C_ε     (3.38)

where C_ε is a diagonal matrix whose non-zero elements are the variances of the
idiosyncratic risks, C_F is the factor covariance matrix and β is a k × N matrix of
betas.
Definition 3.9.1. Consider the return equation (3.37). The returns have a factor
structure with factors F_1, ..., F_k if the residuals are all uncorrelated.
APT theory then states that when returns have a factor structure, there is an
approximate beta pricing model with factors F_1, ..., F_k. Therefore, systematic risk
factors are beta pricing factors.
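A simulation sketch of the factor structure, with assumed loadings and a diagonal
residual covariance, confirms the decomposition (3.38) numerically:

import numpy as np

rng = np.random.default_rng(1)
k, N, T = 2, 5, 200_000              # 2 factors, 5 assets, many observations

beta = rng.normal(size=(k, N))       # k x N matrix of betas (assumed)
C_F = np.array([[0.04, 0.01],
                [0.01, 0.09]])       # factor covariance (assumed)
F = rng.multivariate_normal(np.zeros(k), C_F, size=T)   # centered factors
eps = rng.normal(0.0, 0.05, size=(T, N))                # uncorrelated residuals

R = F @ beta + eps                   # centered returns, as in (3.37)
C_emp = np.cov(R, rowvar=False)
C_model = beta.T @ C_F @ beta + np.diag(np.full(N, 0.05 ** 2))
print(np.abs(C_emp - C_model).max())  # sampling error only, close to zero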
To understand APT, first assume that the idiosyncratic risks in the return decomposition
(3.37) are zero. To derive an exact beta pricing model in this case, we use the
fundamental asset pricing equation E(M R_i) = 1. Writing the expectation of the product
in terms of single expectations plus the covariance term, inserting (3.37) for the
return and rearranging implies the beta pricing equation (3.14) in Proposition 3.2.2:

E(R_j) = γ + λ' β_j     (3.39)

where λ = −(1/E(M)) cov(M, F) and γ = 1/E(M). If the residuals are not zero, we get

E(R_j) = γ + λ' β_j − E(M ε_j)/E(M)     (3.40)

with the last term the pricing error. The main idea is then that E(M ε_j) should be
zero, since the residual risks can be diversified away because they are uncorrelated
with each other and with the factors - using the same argument as in Proposition 2.6.2,
where by adding more and more uncorrelated assets portfolio risk can be made arbitrarily
small. There are two problems with this argument. First, if there is only a finite
number of assets, then residual risk will not be exactly equal to zero. Second, even if
there is an infinite number of assets, it might not be possible for all investors to hold
well-diversified portfolios, for example if the market portfolio is not well-diversified.
The conclusion of the APT theorem is that if there are enough assets, then the beta
pricing equation - that is, zero pricing error - is approximately true for most assets.

Example
Consider the stock of a gold mining company with a factor loading of 1.5 on a US
manufacturing index and a factor loading of 0.6 on inflation. If the manufacturing index
increases by 4% and inflation increases by 5%, we expect the return on the stock to
increase by 1.5 · 4% + 0.6 · 5% = 9%.

Example
Consider two assets with two different factor loadings on the same factor. What should
be the relationship between their expected returns under the assumption of no arbitrage?
Let θ be the weight of the first asset in a portfolio and 1 − θ the weight of the
second. The portfolio return then reads (we set the idiosyncratic risk components, for
simplicity, to zero)

R_P = θ(R̄_1 + b_1 F̃) + (1 − θ)(R̄_2 + b_2 F̃).

Choosing θ = b_2/(b_2 − b_1), the portfolio return becomes

R_P = (R̄_1 − R̄_2) b_2/(b_2 − b_1) + R̄_2.

This is a risk-free portfolio. Therefore, the return must be equal to the risk-free rate
λ_0. Rearranging, this implies

(R̄_1 − λ_0)/b_1 = (R̄_2 − λ_0)/b_2 =: λ.


Since the two ratios are the same for any asset, they must be equal to a constant value
λ, the factor risk premium, since it represents the expected excess return above the
risk-free rate per unit of risk (as quantified by F̃). The two assets have the same
factor risk premium; otherwise, arbitrage is possible. This implied equality can be
rewritten as

R̄ = λ_0 + bλ.     (3.41)

Hence, no arbitrage implies that the factor model satisfies the expected return relation
(3.41).

3.10 Four Asset Pricing Formulae

There are four asset pricing formulae:
The stochastic discount factor (M) pricing in (3.1).
The martingale pricing (no-arbitrage relative pricing) in (3.36) under the risk-neutral
probability Q.
The single-beta pricing model in (3.13), with R* the return which covaries with the
risky asset's return.
The state pricing model, which we have not considered.
We also stressed that pricing can be absolute or relative. Figure 3.12 relates the
different concepts. The figure shows that the no-arbitrage condition is the key concept
in both absolute and relative pricing: if there is no arbitrage, then a state price
density, an SDF, or a martingale measure exists - these are equivalent concepts. These
concepts are then used to price the existing or new assets. One could also use the law of
one price in this chart with some modifications. What happens in relative pricing if the
markets are not complete? Then, in addition to the NAC, one needs a second criterion,
which means one has to specify some preferences such that new derivative assets can be
uniquely priced.
The following proposition summarizes the relationships between the four concepts.
Proposition 3.10.1. Consider a financial market in finite discrete time with a finite
number of states.
1. The law of one price holds if and only if there is at least one SDF.
2. There are no arbitrage opportunities if and only if there exists at least one strictly
positive SDF (or an R*).
3. There are no arbitrage opportunities if and only if there exists a risk-neutral
probability Q.
4. The SDF (or M or R* or Q) is unique if and only if the financial market is complete.

Figure 3.12: Absolute and relative pricing overview. NAC means the no-arbitrage
condition. Absolute pricing starts from a specification of preferences, technologies and
the market structure; relative pricing starts from the observation of asset prices, the
preference for more money over less, and a specification of the market structure. In
both cases the NAC implies the existence of a state price density, an SDF and a
martingale measure, which are then used for the pricing of all assets in the market
(absolute pricing) or the pricing of new derivative assets if the market structure is
complete (relative pricing).
The proof of the proposition and its applications are discussed in the exercises. Part 3
is known as the First Fundamental Theorem of Finance and part 4 as the Second
Fundamental Theorem of Finance.

Chapter 4

Global Asset Management


The asset management industry faces turbulent years ahead. More fundamentally important than regulation for current and future challenges to the industry are economic
growth, demographic changes, and technology. These factors will continue to shape asset management. Asset management has long been in the shadow of its cousins in the
banking and insurance industries.

The underlying factors related to these changes are outlined by Walter (2013), UBS
(2015) and PwC (2015), or they follow from the last two chapters. First, the zoo of risk
factors and the several hundred existing global asset management strategies require that
asset managers focus their offerings. Furthermore, the cooperation between academics and
practitioners needs to concentrate more on the fundamental issues in investment theory
and less on applied research, which often fails to have a sound economic foundation.
Second, AM faces the chance of increasing wealth in developed and developing countries -
that is to say, a shift in the investor base. Third, the trend of managing household
assets in the form of professional, collective schemes such as mutual funds (US) or
SICAV vehicles (eurozone) will continue. Fourth, technology is changing the asset
management process both at the client interface and in the middle and back office. The
robo-advisor is here. The technology has the potential to radically change the way AM
services are produced and distributed. Fifth, untenable, government-sponsored pension
systems (pay-as-you-go schemes) need to be replaced by asset pool systems, which are in
line with demographic changes. Sixth, there is the search for alternative asset classes
due to the increasing efficiency, and hence decreasing alphas, of traditional asset
classes. Seventh, distribution is being redrawn both globally and locally: platforms
will dominate due to economies of scale, their mastering of regulatory complexity,
open-architecture offerings and cost transparency. Eighth, fee models are being
transformed. Finally, alternatives are becoming more mainstream and exchange traded
funds (ETFs) proliferate.

4.1 Asset Management Industry

4.1.1 The Demand Side

The clients of the AM industry are segmented into private and institutional clients.
Institutional clients include pension funds, insurance companies, family offices,
corporate treasuries, and government authorities.
There are several differences between the two categories. Retail clients pay more than
institutional investors. While institutional investors ask for pure asset management
services, private clients often combine their asset management demands with other
banking services, such as financial planning or mortgage lending. Private clients invest
more heavily in wrappers of investment solutions, such as mutual funds, ETFs or retail
structured products, while institutional clients invest in cash products (bonds or
stocks). Institutional clients often have better access to alternative investments such
as hedge funds, private equity, and structured finance products. See Section 2.3 for the
differences regarding regulation.

4.1.2 The Supply Side

Trading units of banks and asset management firms are the suppliers of assets for investment. Asset management solutions such as mutual funds or ETFs are often offered by non-banking firms such as investment management corporations. BlackRock, for example, is the world's largest asset manager. These firms issue products but also provide other services. BlackRock Solutions - the risk management division of BlackRock - was mandated by the US Treasury Department to manage the mortgage assets owned by Bear Stearns, Freddie Mac, Morgan Stanley, and other financial firms that were affected by the financial crisis in 2008.
The largest asset management organization in 2014 was BlackRock with USD 4.4 trillion AuM, followed by the Vanguard Group. The largest fund in 2014 was the SPDR ETF on the S&P 500 managed by State Street Global Advisors, with assets of USD 200 bn; see the Appendix for details.

4.1.3 Asset Management Industry in the Financial System - the Eurozone

We follow EFAMA (2015). Asset management companies are one channel between providers and users of funds in the case where the parties do not exchange the assets directly using organized market places. AM firms provide a pooling of funds for investment purposes. Banks, another channel, also offer non-asset-management functions (deposits, loans, etc.). Insurance companies and pension funds take savings from households or companies and invest them in money markets and capital markets.


The main services of the AM industry to clients are savings management (diversification, reduction of risk by screening out bad investment opportunities), liquidity provision (providing liquid assets to clients while investing in not necessarily liquid assets), and reduction of transaction costs (the size of a transaction reduces the costs).
But asset management firms also contribute to the real economy. Firms, banks, and governments use AM firms to meet their short-term funding needs and their long-term capital requirements. The AM contribution to debt financing is 23%, meaning that European asset managers held this share of all debt securities outstanding, which also represents 33% of the value of euro-bank lending. The equity financing figures are similar: the AM industry held 29% of the market value of euro area listed firms and 42% of the free float.
From a corporate finance perspective, the valuation and market capitalization of asset management firms compared to banks and insurance companies between 2002 and 2015 is as follows (McKinsey (2015)):
Feature                     Asset management firms   Banks   Insurance
Market Cap (100 in 2002)                       516     313         231
P/E ratio                                     16.1    11.3        14.8
P/B value                                      3.2     1.2         1.6

Table 4.1: Key figures 2015 for asset management firms, banks, and insurance companies (McKinsey [2015]).
The number of asset management companies in Europe in 2013 was approximately 3,300. France (approx. 600), Ireland (430), Luxembourg (360), Germany (300), the UK (200), the Netherlands (200), and Switzerland (120) are the leading places. The high numbers for Ireland and Luxembourg are due to the role they play in the cross-border distribution of UCITS funds (see below). The main asset management center, where the investment management functions are carried out, is London. The average AuM per asset manager ranges from EUR 9 billion in the UK to less than one billion in, for example, Portugal and Turkey. The industry is highly concentrated in each country. The top 5 asset managers in Germany control 94% of all assets; in the UK the corresponding figure is still 36%.
Asset management companies can operate as independent firms or as part of a larger financial intermediary such as a bank. In the UK and France, less than 20% of the firms are owned by banking groups. In Germany (60%) and Austria (71%), a majority of the asset management functions are part of a bank. Insurance companies play a significant role in Italy, the UK, France, and Germany (all 13%) and in Greece (21%).
The number of individuals directly employed (asset managers, analysts) in the industry is estimated at 90,000, with the dominant share of roughly one-third in the UK. Indirect employment such as IT, marketing, legal, compliance, and administration is estimated to boost the total number of employees in the whole industry to half a million individuals.

4.1.4 Global Figures 2007-2014

The following figures are from McKinsey (2015).

- Per annum, global AuM growth between 2007 and 2014 was 5%. The main driver was market performance. Typically, the net AuM flows are between 0% and 2% per annum.
- From 2007 to 2014, the growth of AuM was 13.1% in Europe, 13.5% in North America, and 226% in emerging markets. The growth in the emerging markets is largely due to the money market boom in China.
- The absolute value of profits increased in the same period by 5% in Europe, 29% in North America, and 79% in the emerging markets.
- Profit margins, defined as the difference between the net revenue margin and the operating cost margin, are 13.3 bps in Europe, 12.5 bps in North America, and 20.6 bps in emerging markets. The observed revenue decline in Europe is due to the shift from active to passive investments, the shift to institutional clients, and the decrease in management fees. The revenue margin in the emerging markets is only slightly lower in 2014 compared to 2007 (down to 68.1 bps from 70.6 bps), but the increase in the operating cost margin from 33.8 bps to 47.4 bps in 2014 is significant.
- The absolute revenues in some emerging markets such as China, South Korea, and Taiwan, with values between USD 3.7 bn and USD 10.1 bn, are almost on par with the revenues in Japan, Germany, France, and Canada (all around USD 10 bn). The revenue pools of the UK (USD 21.2 bn) and the US (USD 150.8 bn) still lead the global league table.
- The cost margins in Europe were stable between 21 bps and 23 bps between 2007 and 2014. The cost margin splits into sales and marketing (around 5 bps), fund management (around 8 bps), middle/back office (around 3.5 bps), and IT/support (around 6 bps). Costs are trending upward for IT/support and downward for sales and marketing and the middle/back office.

When considering the above facts, one should not underestimate the particular circumstances in the years after the GFC, such as the decreasing interest rate level and the stock market boom, which were the main factors in the success of the asset management industry in this period.
From a customer segment perspective, retirement/DC grew at a Compounded Annual Growth Rate (CAGR) of 7.5% between 2007 and 2014, almost twice as fast as the retail sector at 4%. The institutional customers' CAGR was 5%. These average global rates differ across geographic regions. In Europe, the retirement/DC CAGR dominates the retail one by a factor of 4, whereas in the emerging markets the CAGR for institutional customers is 13% compared to 11% for retirement/DC.
Figure 4.1 shows the distribution of global investable assets by region and by type of
investor.

Figure 4.1: Global investable assets by region in trillions of USD (Brown Brothers Harriman [2013]).

Comparing the growth of investment funds versus discretionary mandates in Europe, both categories increased in 2014 to a similar level: EUR 9.1 trillion in investment funds and EUR 9.9 trillion in discretionary mandates (EFAMA (2015)). The share of investment funds compared to mandates fell from 2007 until 2011, but it then started to increase in the last three years. While mandates represented more than 70% of AuM in the UK, the Netherlands, Italy, and Portugal, more than 70% of all AuM in Germany, Turkey, and Romania was invested in investment funds. The dominance of either type of investment can have different causes. In the UK and the Netherlands, pension funds play an important role in asset management and they prefer to delegate the investment decisions. The pool of professionally managed assets in Europe remains centered in the UK (37% market share), France (20%), Germany (10%), Italy, the Nordic countries, and Switzerland.

4.1.5 Asset Management vs Trading Characteristics

Asset managers are characterized by features which distinguish them from other financial intermediaries such as banks, pension funds, or insurance companies. Some key features are:

- Agency business model. Asset managers are not the asset owners; they act on a best-effort basis for their clients and the performance is attributed to their clients.
- Low balance sheet risk. Since asset managers do not provide loans, do not act as counterparties in derivatives, financing, or securities transactions, and seldom borrow money (leverage), their balance sheets do not face the risks of a bank's balance sheet.
- Protection of client assets. Asset managers are regulated, and in mandated asset management the client assets are held separately from the asset management firm's assets.
- Fee-based compensation. Asset managers generate revenue principally from an agreed-upon fee. There is no profit and loss as in trading.
From a risk perspective, asset management is a fee business with conduct, business, and operational risk as the main risk sources. Trading is a mixture of a fee business (agency trading) and a risk-taking business (principal and proprietary trading). Agency trading is a fee business based on client flow. Clients place their orders and the trading unit executes the orders on behalf of the client's account. For example, a stock order is routed by the trader to the stock exchange where the trade is matched. The bank receives a fee for this service. Principal trading already requires active market risk or counterparty risk taking by the bank, since the bank's balance sheet is affected by the profits and losses from trading. Principal trading is still based on clients' orders, but it requires the traders to take some trading positions in their market-making function or in order to meet future liabilities in issued structured products. This is a key difference to agency trading. Proprietary trading is not based on client flow at all. Proprietary traders implement trading ideas without any reference to client activity. This type of trading puts the bank's capital at risk. New regulations, such as the Volcker Rule in the US and ring-fencing in the UK, limit proprietary trading by investment banks.
AM firms wrap the underlying assets into collective investment schemes (funds), while the trading units of a bank offer issuance and market making for cash products, derivatives, and structured products. Despite their differences, trading and asset management are linked. Portfolio managers in the asset management function execute their trades via the trading unit or a broker. The market making of ETFs and listed fund trading takes place in the trading unit. Cash products are used by the asset management function in the construction of collective schemes, and asset managers use derivative overlays in their portfolios to manage risk and return characteristics.

4.1.6 Dynamics of the Asset Management Industry

The following forces - besides the already mentioned growth of assets, demographic changes, and technological progress - influence the dynamics of the asset management industry.
Regulation imposes a great deal of complexity on the whole business of asset management and banking. On the other side of the fence, there is a so-called shadow banking sector with much less regulatory oversight. Although the expression shadow bank makes no sense at all - either an institution has a banking license or it does not - there is an incentive for banks to consider outsourcing their asset management units to this shadow banking sector.
The roles of traditional and non-traditional asset managers (alternative asset class managers) are converging. Traditional asset managers have continuously lost market share to low-cost ETFs. They therefore consider liquid alternative products to stop the bleeding. This is one reason for the convergence. Non-traditional asset managers, on the other hand, want to expand into traditional segments since their non-traditional products are becoming more liquid and more transparent. This is the other reason for the coming together of the two previously distinct roles. The hedge fund AQR Capital Management, for example, opted for the regulatory regime of the Investment Company Act of 1940 (the 40-Act) governing the mutual fund industry. This act requires much more transparency in reporting than hedge funds usually provide. This allowed AQR access to a new customer base. This business had grown to USD 19 billion AuM by 2014.

4.1.7 Institutional Asset Management versus Wealth Management

Investors in institutional asset management (IAM) are legal entities such as pension funds; in wealth management (WM) they are private clients. The investment goal in IAM is often based on a non-maturing asset-liability analysis, while in WM the goal is linked to the life cycle of the client. Although this defines long-term investment horizons for both types of investors, we refer to Section 3.6 for the difficulties pension funds face in following a long-term strategy. If WM clients use short- or mid-term investment horizons, opportunistic behavior is motivated. The performance of the investment for IAM is benchmarked, while WM clients also prefer absolute returns. Therefore, for IAM beta is the first concern and alpha is added in a satellite form. The responsibility for the performance in IAM is attached to investment boards, CFOs, or boards of trustees. In WM, the mandate manager is responsible for the performance. IAM companies use several mandates, often one for each asset class, to manage investments, while WM clients either use fewer mandates or even decide on their own in the advisory channel.
The size of investment is very large for IAM and smaller for WM. The risk management for IAM is comprehensive and of the same quality as that used by, say, banks for their own purposes. In WM, risk management is often less sophisticated. Fees are


typically lower for IAM than for WM. While IAM is highly regulated, the regulation of WM was much weaker in the past. This changed after the GFC, when MiFID II, Know-Your-Client rules, product information sheets, etc. heavily increased the WM regulatory setup. Finally, the loyalty of IAM clients is decreasing while WM clients are more loyal. It will be interesting to observe how the loyalty of WM clients changes if technology makes investments not only more tailor-made but also more open-platform oriented and therefore less strongly linked to the home institution of the WM clients.

4.2 The Fund Industry - An Overview

In 1774 Abraham van Ketwich, an Amsterdam broker, offered a diversified pooled security specifically designed for citizens of modest means. The security was similar to a present-day closed-end fund. It invested in foreign government bonds, banks, and West Indian plantations. The word diversification is explicit in the prospectus of this fund.
The 1920s saw the creation in Boston of the first open-end mutual fund - the Massachusetts Investors Trust. By 1951 more than 100 mutual funds existed, and 150 more were added in the following twenty years. The challenging 1970s - oil crisis - were marked by a number of innovations. Wells Fargo offered a privately placed, equally weighted S&P 500 index fund in 1971. This fund was unsuccessful, and Wells created a successful value-weighted fund in 1973. It required huge efforts - tax and regulatory compliance, building up stable operations, and educating potential investors. Bruce Bent established the first money market fund in the US in 1971, giving investors access to high money market yields in a period when bank interest rates were regulated. In 1975, John Bogle created a mutual fund firm - Vanguard. In 1976 it launched the first retail index fund based on the S&P 500 Index. In 1993, Nathan Most developed an ETF based on the S&P 500 Index.
The fund industry is not free of scandals. In 2003, for example, illegal late trading and market timing practices were uncovered at hedge fund and mutual fund companies. Late trading means that trading is executed after the exchanges are closed. Traders could buy mutual funds when markets were up at the previous day's lower closing price, and sell at the purchase date's closing price for a guaranteed profit.

4.2.1 Types of Funds and Size

There are different types of funds: mutual funds, index funds, ETFs, hedge funds, and alternative investments. We note some characteristics:

- Index mutual funds and most ETFs are passively managed.
- Index funds seek to match the fund's performance to a specific market index, such as the S&P 500, before fees and expenses.


- Mutual funds are actively managed and try to outperform market indexes. They are bought and sold directly from the fund company at the current day's closing price - the NAV (net asset value).
- ETFs are traded throughout the day at the current market price, like a stock, and may cost more or less than their NAV.
Example
The NAV is a company's total assets minus its total liabilities. If an investment company has securities and other assets worth USD 100 and liabilities of USD 10, the company's NAV will be USD 90. Since assets and liabilities change daily, the NAV also changes daily. Mutual funds generally must calculate their NAV at least once every business day. An investment company calculates the NAV of a single share by dividing its NAV by the number of outstanding shares.
We assume that at the close of trading a mutual fund held USD 10.5 million worth of securities, USD 2 million of cash, and USD 0.5 million of liabilities. If the fund had 1 million shares outstanding, the NAV would be USD 12 per share.
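The per-share calculation is simple arithmetic; the following minimal Python sketch reproduces the numbers of the example (the function name is ours, chosen for illustration):

def nav_per_share(total_assets: float, total_liabilities: float,
                  shares_outstanding: float) -> float:
    """NAV per share = (total assets - total liabilities) / shares outstanding."""
    return (total_assets - total_liabilities) / shares_outstanding

# Example above: USD 10.5m securities + USD 2m cash, USD 0.5m liabilities,
# 1 million shares outstanding.
print(nav_per_share(10.5e6 + 2e6, 0.5e6, 1e6))  # 12.0 USD per share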
Funds can be open-end or closed-end. Open-end funds are obliged to buy back fund shares at the end of every business day at the NAV. Prices of shares traded during the day are expressed in terms of the NAV. Total investment varies based on share purchases, share redemptions, and fluctuations in market valuation. There is no limit on the number of shares that can be issued. Closed-end funds issue shares only once. The shares are listed on a stock exchange and trading occurs via the exchange: an investor cannot give back his or her shares to the fund but must sell them to another investor in the market. The prices of traded shares can differ from the NAV - either higher (premium case) or lower (discount case). The vast majority of funds are of the open-end style.
The legal environment is crucial for the development of the fund industry. About three-quarters of all cross-border funds in Europe are, for example, sold in Luxembourg. Luxembourg offers favorable framework conditions for holding companies, investment funds, and asset management companies. These companies are partially or completely tax-exempt; typically, profits can be distributed tax free. For private equity funds, two-thirds have the US state of Delaware as their domicile. For hedge funds, one-third are in the Caymans and one-quarter in Delaware. As of Q3 2013, 48 percent of mutual funds had their domicile in the US, 9 percent in Luxembourg, and around 6 percent each in Brazil, France, and Australia.
Table 4.3 illustrates the global distribution of AuM by product and its dynamics in the last decade.


Feature                        Open-end fund                  Closed-end fund
Number of outstanding shares   Flexible                       Fixed
Pricing                        Daily NAV                      Continuous demand and supply
Redemption                     At NAV                         Via exchange
Market share                   > 95%                          < 5%
US terminology                 Mutual fund                    Closed-end fund
UK terminology                 Unit trust                     Investment trust
EU terminology                 SICAV                          SICAF

Table 4.2: Features of open-end and closed-end funds. A SICAV (Société d'Investissement à Capital Variable) is an open-ended collective investment scheme. SICAVs are cross-border marketed in the EU under the UCITS directive (Undertakings for Collective Investments in Transferable Securities, see below). SICAFs are the closed-end fund equivalent of SICAVs.

Investment type     2003   2008   2012
Passive/ETF          2.0    3.3    7.9
LDIs                 0.6    1.6    2.5
Active Core         24.8   28.1   30.9
Active Solutions     8.2   10.8   15.1
Alternatives         1.9    3.9    6.0

Table 4.3: Global distribution of AuM by product and its dynamics in the last decade, in trillion USD. Alternatives includes hedge, private-equity, real-estate, infrastructure, and commodity funds. Active solutions includes equity specialties (foreign, global, emerging markets, small and mid caps, and sector) and fixed-income specialties (credit, emerging markets, global, high yield, and convertibles). LDIs (liability-driven investments) includes absolute-return, target-date, global-asset-allocation, flexible, income, and volatility funds. Active core includes active domestic large-cap equity, active government fixed income, money market, and traditional balanced and structured products (Valores Capital Partners [2014]).


The table indicates that the growth rate of passive investments is larger than that of active solutions. McKinsey (2015) states for the period 2008-2014 that the cumulated flows are 36% for passive fixed income and 22% for passive equity. Standard active management is decreasing for some asset classes and strategies: active equity strategies lost 20% on a cumulated-flow basis while active fixed income gained 52%. A further observation is that active management of less liquid asset classes, or with more complex strategies, is increasing: an increase of 49% in cumulated flows for active balanced multi-asset and of 23% for alternatives. The global figures vary strongly across regions and countries. Swiss and British customers adopted passive investing much faster than, for example, Spanish, French, or Italian investors.

4.3 Mutual Funds and SICAVs

The Securities and Exchange Commission (SEC) defines mutual funds as follows:
Definition 4.3.1. A mutual fund is a company that pools money from many investors and invests the money in stocks, bonds, short-term money-market instruments, other securities or assets, or some combination of these investments. The combined holdings the mutual fund owns are its portfolio. Each share represents an investor's proportionate ownership of the fund's holdings and the income those holdings generate.
In Europe, mutual funds are regulated under the UCITS regime and the funds themselves are called SICAVs. We consider first the US industry and then its European equivalent. When we refer below to mutual funds, we always have US mutual funds in mind. The characteristics of mutual funds are:
mind. The characteristics of mutual funds are:
Investors purchase mutual fund shares from the fund and not via stock exchange.
They can sell their share any time.
The investors pay for mutual fund shares the NAV plus any shareholder fees that
the fund imposes at the time of purchase.
If there is a new demand, mutual funds create and sell new shares.
The investment portfolios are managed by separate entities (investment advisers)
that are registered with the SEC.
Mutual funds are public companies but their shares do not trade at a stock exchange.
They neither pay taxes nor have any employees. The major benefits of mutual funds for
investors are:
Diversification.
Professional management.
Investor protection (regulation)
Affordability - the basic unit of a fund unit requires only little money from the
investors.


- Access to assets. Funds allow investors to invest in asset classes that would be inaccessible on a stand-alone basis.
- Partial transparency about the investment process, performance, the investment portfolio, and the fees.
- Default remoteness. Fund capital is treated as segregated capital.
- Liquidity. Mutual fund investors can redeem their shares at any time at the current NAV plus any fees and charges assessed on redemption.
- Investment strategy. The investor can choose between active and passive investment, can have access to rule-based strategies, etc. Contrary to structured products, the payoff of an actively managed fund at a future date is not a mathematical formula. That is to say, investors in funds believe that the fund managers will generate a positive return due to their skills and access to information.
Some disadvantages of mutual funds:

- Lack of control. Investors do not know at any given time the exact composition of the portfolio, and they have no influence on which securities the fund manager buys and sells or on the timing of those trades.
- Price uncertainty. Pricing follows the NAV methodology, and the fund might calculate the NAV hours after the placement of an order. This contrasts with other financial instruments such as stocks, options, or bonds.
- Performance. The average estimated alpha in the mutual fund industry is negative after costs.
PwC estimates that actively managed funds will grow at a CAGR of 5.4 percent and mandates at 5.7 percent (PwC [2014]). The growth driver for actively managed funds is the growing global middle-class client base. Growth factors for mandates are institutional investors (pension funds and SWFs) and HNWIs. Table 4.4 summarizes some key figures. Furthermore, the ratio of active to passive investments was 7:1 in 2012 and is estimated to fall to 3:1 by 2020.

Investment type            2014 - USD trillions    2020 - USD trillions
Actively managed funds               30                     41.2
Mandates                             32                     47.5
Alternative investments             6.9                     13.0

Table 4.4: Actively managed funds, mandates, and alternative investments (PwC [2014]).

By the end of 2014, the AuM in actively managed funds were distributed as follows: 60% in the Americas, 32% in Europe, and 12% in Asia. Compared to 2010, there is a relative stagnation or decrease in Europe and Asia, whereas the proportion in the Americas is increasing.


The Investment Company Institute and US Census Bureau [2015] state that 43.3 percent of US households own mutual funds. The majority of mutual fund-owning households are employed and earn moderate, although above-average, household incomes; the median income of mutual fund-owning households was USD 85,000 in 2013. The median mutual fund holdings are USD 103,000 and the median of household financial assets is USD 200,000. 86% own equity funds, 33% hybrids, 45% bond funds, and 55% money-market funds. Only 36% invested in global or international equity funds. Finally, the primary financial goal for mutual fund investment was retirement (74 percent).

4.3.1 US Mutual Funds versus European UCITS

Mutual funds and SICAVs are both collective investment schemes. But there are some major differences between the two types of wrapper and the respective industries. We follow Pozen and Hamacher (2015).
Cross-border distribution has been most successful within the European UCITS format. This is not only true for Europe: UCITS dominate global fund distribution in more than 50 local markets (Europe, Asia, the Middle East, and Latin America). This kind of global fund distribution is the preferred business model in terms of economies of scale and competitiveness. In 2016 around 80,000 registrations for cross-border UCITS funds exist. The average fund is registered in eight countries. Furthermore, UCITS are not required to distribute all income annually.
UCITS do not need to accept redemptions more than twice a month. Although the two previous points hold in general, many funds offer - for example - the option to distribute income annually or to make redemptions possible on a daily basis. UCITS sponsors must comply with the EU guidelines on compensation for key personnel: the remuneration directive.
Both UCITS funds and mutual funds were originally quite restrictive in their investment guidelines. Then UCITS (similar remarks apply to mutual funds) were allowed to use derivatives extensively. Using derivatives means, among other things, leveraging portfolios or creating synthetic short positions - UCITS are not allowed to sell physical assets short. The strategies of these funds - referred to as newCITS - are similar to hedge fund strategies, and they showed strong growth to USD 294 billion in 2013 according to Strategic Insight (2013).
But there are also differences between US mutual funds and European UCITS on a more fundamental level. US clients invest in existing funds while European investors are regularly offered new funds. That is, the number of US mutual funds has been decreasing in the last decade while the number of European funds has increased strongly; see Table 4.5. The stability of the US fund industry is due to the influence of US retirement plans (defined contribution), which do not change investment options


often. The tendency to innovate permanently in Europe leads to funds which are on average around six times smaller than their US counterparts.

                          2003     2013
US
Number of funds          8,125    7,707
Total Assets USD tr        7.4     15.0
Asset per fund USD mn      911    1,949
Europe
Number of funds         28,541   34,743
Total Assets USD tr        4.7      9.4
Asset per fund USD mn      164      270
Asia
Number of funds         11,641   18,375
Total Assets USD tr        1.4      3.4
Asset per fund USD mn      116      183

Table 4.5: Number of funds, average fund size, and assets by region (Investment Company Institute [2010, 2014] and Pozen and Hamacher [2015]).

4.3.2 Functions of Mutual Funds

4.3.2.1 How They Work

Buying and selling mutual funds is not done via a stock exchange - the shares are bought directly from the fund. Therefore, the share price is not fixed by traders but is equal to the net asset value (NAV), which is calculated daily; see Section 4.3.5 for details of the NAV. For the time being, the NAV is by definition equal to the difference between assets and liabilities divided by the number of shares. Investors then pay the NAV plus the sales load fee when they buy; if they sell, they get the NAV minus the redemption fee. Typically, fund shares can be bought or redeemed on a daily basis. While the calculation of the NAV is theoretically simple, implementing the calculation is not, since one has to accurately record all securities transactions, consider corporate actions, determine the liabilities, etc.
It is evident that the digitalization of asset management will offer an opportunity to overcome the present NAV calculation problems. If the NAV can be calculated in real time with powerful technologies, why should fund shares not be listed on a stock exchange? Mutual funds as companies pay out almost all of their income - dividends and realized capital gains - to shareholders every year and pass on all their tax duties to investors. Hence, mutual funds do not pay corporate taxes. Therefore, the income of mutual funds is taxed only once, while the income of ordinary companies is taxed twice. The Internal Revenue Service (IRS) defines rules that prevent ordinary firms from transforming themselves into mutual funds to save taxes: one rule demands, for example,

4.3. MUTUAL FUNDS AND SICAVS

297

that mutual funds have only a limited ownership of voting securities and another rule
requires that funds must distribute almost all of their earnings.

4.3.2.2 Organization of Mutual Funds

The fund's board of directors is elected by the fund's shareholders. It governs and oversees the fund, for example by approving policies and ensuring the fund's compliance with regulation; see Figure 4.2. The investment adviser and the chief compliance officer perform the daily management of the fund. Mutual funds are required to have independent directors on their boards.
An investment adviser, a professional money manager, often initially sponsors the fund (seed money). The adviser invests the fund's assets in accordance with the fund's investment policy as stated in the registration statement filed with the US Securities and Exchange Commission (SEC). He or she determines which securities to buy and sell and is subject to numerous standards and legal restrictions. The allocation of a fund's assets is permanently monitored and adjusted by the investment adviser.
A fund's administrator offers administrative services to the fund, such as paying personnel, providing accounting services, and taking responsibility for preparing and filing SEC and other reports.
Investors buy and redeem fund shares either directly or indirectly through the fund's distributor, also known as the principal underwriter. The distributor is an independent network that ensures marketing support.
Mutual funds are required to protect their portfolio securities by placing them with a custodian. The largest custodian in 2014 was Bank of New York Mellon with USD 28.3 trillion of assets under custody, followed by J.P. Morgan (see the Appendix for a list of assets under custody).
A transfer agent executes authorized transactions, keeps and updates the register of share units, and issues certification of the issued shares. Mutual funds and their shareholders also rely on the services of transfer agents to maintain records of shareholder accounts, to calculate and distribute dividends and capital gains, and to prepare and mail shareholder account statements, federal income tax information, and other shareholder notices (ICI factbook (2014)). A mutual fund generally distributes all of its earnings to shareholders each year and is taxed only on the amounts it retains (Revenue Act of 1936). To qualify for specialized tax treatment under the code, mutual funds must satisfy several conditions; see above.
Mutual funds make two types of taxable distributions to shareholders: ordinary dividends and capital gains.


Figure 4.2: The organization of a mutual fund (ICI Fact Book [2006]).

4.3.2.3 Taxonomy of Mutual Funds

Money Market (MM) Funds


There are tax-exempt and taxable funds. The former invest in securities backed by
municipal authorities and state governments. Both securities do not pay federal income
tax. Which fund to choose is only a question of the after-tax yield. Tax-exempt funds
make sense for investors who face a high tax bracket. In all other cases, taxable funds
show a better after-tax yield. Fund sponsors typically offer a retail and an institutional
investor series of MM funds.
Bond Funds
Contrary to MM funds, there are many types of bond funds, each defined by different characteristics. Bond funds can be tax-exempt or taxable. For taxable bonds, the next characteristic distinguishes between US and global bonds. In each possible category on


these two levels, several different factors apply: the creditworthiness of the bond issuers, ranging from high-yield to investment-grade bonds; the maturity of the bonds; the segmentation of global bonds into emerging market bonds and general global bonds; and the classification of bonds according to economic sectors or specific topics. Finally, alternative bond funds use techniques from hedge funds to shape the risk and return profile.
Morningstar adopted a new classification system in 2012 to overcome the excessive number of dimensions that a bond fund can have. The system classifies bonds along the two dimensions of creditworthiness (credit quality) and interest-rate sensitivity, where each dimension has three classes: high/medium/low credit quality and limited/moderate/extensive interest sensitivity. That is, each bond fund is classified in this 3 x 3 matrix. The credit dimension indicates the likelihood that investors will get their invested money back. The interest-rate sensitivity states the impact of changing interest rates on the value of the bonds.
Stock Funds

For stock funds the difference between tax-exempt and taxable does not exist, since most of their income comes from price appreciation and the income from dividends is very low. As for bond funds, the major categories are US versus global funds. Each fund is then further classified according to other labels such as sectors, regions, style, etc. As for bond funds, a 3 x 3 style box from Morningstar exists, with size as one dimension and style as the other. Size is clear. Style can mean value, core, or growth. In the same way as for bond funds, the classification of a stock fund is based on the fund's actual holdings and not on the holdings at issuance.

4.3.3 The European Fund Industry - UCITS

Luxembourg attracts different kinds of funds by providing different vehicles with which to pool their investments. It offers both regulated and non-regulated structures. For regulated funds in Luxembourg, two options are available. First, an undertaking for collective investment (UCI), a category which itself is divided into UCIs whose securities are distributed to the public and UCIs made up of securities that are reserved for institutional investors. The most common legal form of UCI is a SICAV (Société d'Investissement à Capital Variable) - that is, an open-ended collective investment scheme that is similar to open-ended mutual funds in the US. A SICAV takes the form of a public limited company. Its share capital is - as its name suggests - variable, and at any time its value matches the value of the net assets of all the sub-funds. Closed-end funds are referred to as SICAFs. Second, a Société d'Investissement en Capital à Risque (SICAR). These provide a complementary regime to that of UCIs. They are tailor-made for private equity and venture capital investment. There are no investment diversification rules imposed by law, and a SICAR may adopt an open-ended or closed-ended structure.


Both schemes are supervised by the Luxembourg financial sector regulator. A main reason for Luxembourg's attractiveness is taxation. Both SICAV and SICAF investment funds domiciled in Luxembourg are exempt from corporate income tax, capital gains tax, and withholding tax. They are only liable for a subscription tax at a rate of 0.05 percent on the fund's net assets. Also, favorable terms apply with regard to withholding tax.
The UCITS - undertakings for collective investment in transferable securities - directives were introduced in 1985. They comprise the main European framework regulating investment funds. Their principal aim is to allow open-ended collective investment schemes to operate freely throughout the EU on the basis of a single authorization from one member state (European Passport). Their second objective is the definition of levels of investor protection (investment limits, capital organization, disclosure requirements, asset safekeeping, and fund oversight).
In summary, UCITS funds are open-ended, diversified collective investments in liquid financial assets and are product-passported in 27 EU countries. Total UCITS funds AuM grew from EUR 3.4 trillion at the end of 2001 to EUR 5.8 trillion by 2010, reaching EUR 6.8 trillion at the end of 2014. Roughly 85 percent of the European investment fund sector's assets are managed within the UCITS framework. On average, 10 percent of European households invest directly in funds: Germany, 16%; Italy, 11%; Austria, 11%; France, 10%; Spain, 7%; and the UK, 6%.
There have been five framework initiatives - UCITS I (1985) to UCITS V (2016).
4.3.3.1 UCITS IV and V

Goals of UCITS IV:

- Reduce the administrative burden by the introduction of a notification procedure.
- Increase investor protection by the use of key investor information (KID). The KID replaces the simplified prospectus.
- Increase market efficiency by reducing the waiting period for fund distribution abroad to 10 days.
The Madoff fraud case and the default of Lehman Brothers highlighted weaknesses in, and a lack of harmonization of, depositary duties and liabilities across different EU countries, leading to UCITS V (effective March 2016). It considers the following issues. First, it defines which entities are eligible as depositaries and establishes that they are subject to capital adequacy requirements, ongoing supervision, prudential regulation, and some other requirements. Second, client money is segregated from the depositary's own funds. Third, the depositary is confronted with several criteria regarding the holding of assets. Fourth, remuneration is considered. A substantial proportion of remuneration,


for example, and at least 50 percent of variable remuneration, shall consist of units
in the UCITS funds and be deferred over a period that is appropriate in view of the
holding period. Fifth, sanctions shall generally be made public and pecuniary sanctions
for legal and natural persons are defined. Finally, measures are imposed to encourage
whistle-blowing.

4.3.4 Active vs Passive Investments: Methods and Empirical Facts

The simple arithmetic drawn from Bill Sharpe in Chapter 2 (see equation (2.50)) showed that, before costs, the return on the average actively managed dollar will equal the return on the average passively managed dollar. The analysis did not tell us whether an active manager who beats the average is skillful or just lucky. We note that the oldest funds were passive ones.
We now address this question. The extent of the outperformance of an actively managed fund over its benchmark depends on the fund manager's skill and also on various constraints. Scale, for example, often impacts performance negatively: a more skillfully managed large fund can underperform a less skillfully managed small fund.
This interaction between scale and skill is considered in many academic papers. Pastor et al. (2014) empirically analyze the returns-to-scale relationship in active mutual fund management. They find strong evidence of decreasing returns at the industry level. A further result is that active managers have become more skilled over time and that this upward trend in skill coincides with the industry's growth. Finally, they show that a fund's performance deteriorates over its lifetime.
Leaving the size-skill dependence aside, how can we define and measure skill in active management?
4.3.4.1 The Success of the Active Strategy

In this and the next section we take a certain degree of skill of the asset manager for granted; the measurement of the true degree of skill of asset managers is considered in the section after that.
come.
Starting with a toy model, we assume that returns are normally distributed with mean zero and variance \sigma^2. Profitable trades have by definition a positive return. The expected return E(R) of one profitable trade equals the expectation over the positive part of the return distribution:

E(R) = \frac{2}{\sqrt{2\pi\sigma^2}} \int_0^\infty x\, e^{-\frac{x^2}{2\sigma^2}}\, dx = \sqrt{\frac{2}{\pi}}\, \sigma \approx 0.8\, \sigma ,

that is, roughly the 80% percentile of the return distribution.


Since risk scales with the square root of the number of trades, the risk of n trades equals \sigma \sqrt{n}. Consider two managers. One manager is always successful; the other is successful in x% of all trades. Both trade n times. Therefore,

- x measures how well the investor trades;
- n measures how often the investor trades.

The information ratio (IR) - that is, the measure of a manager's generated value - measures the excess return of the active strategy over risk; so

IR = \frac{\text{Excess Return of Active Strategy over Benchmark}}{\text{Tracking Error (Active Risk)}} ,   (4.1)

where the tracking error is the standard deviation of the active return. For the investor with a 100% success rate, we get

IR = \frac{n \sigma \sqrt{2/\pi}}{\sigma \sqrt{n}} = \sqrt{\frac{2n}{\pi}} .

The trader with a success rate of x percent faces a loss in 1-x percent of the trades, leading to a net success rate of x - (1-x) = 2x - 1. Hence, the expected return of n trades is

E(R) = (2x-1)\, n\, \sigma \sqrt{\frac{2}{\pi}} .

This gives the information ratio

IR = (2x-1) \sqrt{\frac{2n}{\pi}} .   (4.2)

For a fixed success rate x, an increasing trading frequency n increases the information ratio. But raising the trading frequency brings diminishing returns due to the square-root function. Numerically, an IR of 50 percent needs a success rate x of two-thirds if the manager trades quarterly. Hence, a high success rate is necessary to obtain even a moderate IR. In this simple model volatility does not enter the IR, and the costs of trading or rebalancing are not considered. One could extend the analysis for more generality, with the obvious impact on the above result.

The skill versus frequency of trading (breadth) trade-off in (4.2), qualitatively

IR \sim x \sqrt{n} ,   (4.3)

is of different severity for different asset classes. Many investors in interest rate risk trade on a monthly or quarterly level since they are exposed to fundamental economic variables. They therefore cannot increase their trading frequency arbitrarily. To achieve a high IR they need to be very successful. But if markets are efficient, this is not possible;


see the efficient market hypothesis discussion. One therefore expects to observe more skill among (global) asset managers who can exploit inefficiencies between different markets.
It is easier to increase the IR by increasing the breadth. Besides the naive approach of trading more often, other methods are to enlarge the set of eligible assets for the asset manager or to expand the risk dimension by allowing investment strategies which generate separate risk premia. The sketch below illustrates the skill-frequency trade-off of (4.2).
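A minimal Python sketch (the function names are ours, chosen for illustration) that evaluates (4.2) and inverts it to give the success rate required for a target IR:

import math

def information_ratio(x: float, n: int) -> float:
    """Toy-model IR from (4.2): IR = (2x - 1) * sqrt(2n / pi),
    with x the success rate and n the number of trades per period."""
    return (2 * x - 1) * math.sqrt(2 * n / math.pi)

def required_success_rate(ir_target: float, n: int) -> float:
    """Invert (4.2): the success rate needed to reach a target IR."""
    return 0.5 * (1 + ir_target / math.sqrt(2 * n / math.pi))

# Quarterly trading (n = 4): an IR of 0.5 already needs a ~66% hit rate.
print(round(required_success_rate(0.5, 4), 3))    # ~0.657
print(round(information_ratio(2 / 3, 4), 2))      # ~0.53
# Daily trading (n = 250): the same IR needs barely more than a coin flip.
print(round(required_success_rate(0.5, 250), 3))  # ~0.52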
4.3.4.2 Fundamental Law of Active Management

Formula (4.1) is one of many formulas to be found in the literature related to skill in active portfolio management. The most famous, the so-called fundamental law of active management, expressed by Grinold (1989), states:

IR = IC \cdot \sqrt{BR} = \text{Skill} \times \sqrt{\text{Breadth}} ,   (4.4)

where IC is the information coefficient of the manager and BR - the strategy breadth - is the number of independent forecasts of exceptional returns made per year. The derivation of (4.4) depends on several assumptions; see the next section. The IC measures the correlation between actual realized returns and predicted returns and thus captures a manager's forecasting ability. Equation (4.4) states that investors have to play often (high BR) and play well (high IC) to win a high IR. The fundamental law (4.4) is additive in the squared information ratios. Formula (4.2) shows the same intuition: 2x - 1 represents the IC and the square root \sqrt{n} represents \sqrt{BR}.


Some consequences of (4.4), following Grinold (1989), are:

- Combine models, because breadth applies across models as well as assets.
- Don't market-time. Such strategies are unlikely to generate high information ratios. While such strategies can generate very large returns in a particular year, they're heavily dependent on luck. On a risk-adjusted basis, the value added will be small. This will not surprise most institutional managers, who avoid market timing for just this reason.
- Tactical asset allocation has a high skill hurdle. This strategy lies somewhere between market timing and stock picking - it provides some opportunity for breadth, but not nearly the level available to stock pickers. Therefore, to generate an equivalent information ratio, the tactical asset allocator must demonstrate a higher level of skill.
How can we map the IR to the quality of a manager? Assuming that active management is a zero-sum game centered at zero, Table 4.6 relates the IR to the percentiles. It follows that a top-quartile manager has an IR of one-half and that an IR of +1 is exceptional.

Percentile    IR
90           1.0
75           0.5
50           0.0
25          -0.5
10          -1.0

Table 4.6: Percentiles of an IR distribution.

To continue, we restate the definition of the IR of a portfolio given in (4.1) as

IR = \frac{\alpha_P}{\omega_P} = \frac{\text{Portfolio Alpha}}{\text{Portfolio Residual Risk}} .   (4.5)

For a portfolio P relative to a benchmark B we have

\omega_P^2 = \sigma_P^2 - \beta_P^2 \sigma_B^2 ,   (4.6)

which states in geometric terms that residual risk is the risk of the return orthogonal to the systematic return. The objective of an active asset manager is to maximize

E(u) = \alpha_P - \lambda\, \omega_P^2 .   (4.7)

This objective rewards expected residual return and punishes residual risk, with \lambda the residual risk aversion. Replacing the alpha by the IR using (4.5), the optimal level of residual risk follows from the first-order condition:

\omega_P^* = \frac{IR}{2\lambda} .   (4.8)

The optimal level of residual risk increases with the opportunities and decreases with the residual risk aversion. Inserting this optimal value gives the maximum expected utility as a function of the IR and the risk aversion, with some obvious implications. Using the fundamental law we get

\omega_P^* = \frac{IR}{2\lambda} = \frac{IC \sqrt{BR}}{2\lambda} .   (4.9)

The breadth allows for diversification among the active bets, and skill increases the possibility of success, so that the overall level of aggressiveness \omega_P can increase.
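For completeness, a one-line derivation of (4.8) under the objective (4.7), substituting \alpha_P = IR \cdot \omega_P from (4.5):

E(u) = IR\,\omega_P - \lambda\,\omega_P^2 , \qquad
\frac{\partial E(u)}{\partial \omega_P} = IR - 2\lambda\,\omega_P = 0
\;\Longrightarrow\;
\omega_P^* = \frac{IR}{2\lambda} , \qquad
E(u)^* = \frac{IR^2}{4\lambda} .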

Example (Grinold and Kahn (2000))

Suppose that the manager wants to forecast the direction of the market each quarter. The market direction takes two values only - up and down - that is to say, the random


variable x(t) takes the values +1 or -1, with mean zero and standard deviation 1. The forecast of the manager, y(t), takes the same values and has the same mean and standard deviation as x(t). The information coefficient IC is by definition given by the covariance of x and y. If the manager makes N bets and is correct N_1 times (x = y) and wrong N - N_1 times (x \neq y), then the IC reads

IC = \frac{1}{N} \left( N_1 - (N - N_1) \right) .   (4.10)

Assume that IC = 0.0577. Then, independent of how large N is, the success rate N_1/N is 52.885 percent. While an IC of 0.0577 can lead to an information ratio above 1.0 - that is, a top-decile investment manager - the correct forecasting percentage is low. This shows how little information one needs in order to be highly successful.
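Inverting (4.10), the hit rate is N_1/N = (1 + IC)/2; a minimal check in Python (the function name is ours):

def hit_rate_from_ic(ic: float) -> float:
    """Invert (4.10): IC = 2*(N1/N) - 1, so N1/N = (1 + IC) / 2."""
    return (1.0 + ic) / 2.0

print(hit_rate_from_ic(0.0577))  # 0.52885: barely better than a coin flip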
The derivation of (4.4) depends on several assumptions. Buckle (2005) reviews them. The first assumption is that forecasts are unbiased and residual returns have zero expected value. If the sample size is small, then this assumption is likely to be violated ex post. This can be the case for a manager with a short history, whose quality cannot be measured over a full cycle. The second assumption is that forecasts and their errors are independent. Next, the error covariance matrix is used to convert forecasts into actual positions, and forecasts of returns are normally distributed. Finally, information coefficients are equal across assets and the information coefficient is a small number.
The fundamental law of active management has been generalized in various academic papers. Ding (2010) generalizes the law by considering time-series dynamics and cross-sectional properties. He shows that Grinold's formula and several other extensions are special cases of his own formula. Among other things, Ding shows that cross-sectional ICs are different from time-series ICs. Also, the IC volatility over time is much more important for a portfolio IR than breadth: playing a little better has a stronger impact on the IR than playing a little more often.
Why is it practically relevant to extend the original law? The theoretically calculated IR number in (4.4) seems always to overestimate the IR a portfolio manager can reach. Assume a forecast signal with an average monthly IC of 0.03 and a stock universe of 1,000. Then, the expected annualized IR from (4.4) is 3.29. This is beyond what the best portfolio managers can realize.
Ding shows in the time-series case that (4.4) only holds under the assumption that the time-series ICs are the same across all securities and the common IC is small. He proves

IR = \frac{IC}{\sqrt{1 - IC^2}} \sqrt{BR} .   (4.11)


For a small IC, and if the time-series IC is the same across all securities, (4.11) is approximately the same as (4.4). The sketch below reproduces the 3.29 figure quoted above.
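A minimal Python sketch (the function names are ours; we treat every stock-month as an independent bet, as in the example above):

import math

def ir_grinold(ic: float, breadth: float) -> float:
    """Fundamental law (4.4): IR = IC * sqrt(BR)."""
    return ic * math.sqrt(breadth)

def ir_ding(ic: float, breadth: float) -> float:
    """Ding's time-series version (4.11): IR = IC / sqrt(1 - IC^2) * sqrt(BR)."""
    return ic / math.sqrt(1.0 - ic ** 2) * math.sqrt(breadth)

# Monthly IC of 0.03 on a universe of 1,000 stocks: 12,000 bets per year.
ic, breadth = 0.03, 1_000 * 12
print(round(ir_grinold(ic, breadth), 2))  # 3.29 - far above realized IRs
print(round(ir_ding(ic, breadth), 2))     # ~3.29 as well: for small IC the laws agree

That both numbers coincide illustrates Ding's point: the small-IC correction in (4.11) does not resolve the overestimation by itself; accounting for the IC volatility over time does.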
4.3.4.3 Skill and Luck in Mutual Fund Management

The approach so far has not addressed the problem of how one can distinguish between
skill and luck and we still do not know how skillful the mutual fund industry is.

Example
Peter Lynch, the manager of the Magellan fund, exhibited statistically significant abnormal performance. Lynch beat the S&P 500 in 11 of the 13 years from 1977 to 1989. But this by itself is not evidence of value enhancement. Consider 500 coin flippers. Each flips 13 coins and we count the number of heads for each flipper. The winner, on average, flips 11.63 heads. But Lynch also beat the S&P in magnitude: the Magellan fund returned 28 percent p.a. versus 17.5 percent for the S&P.
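The 11.63 figure can be checked by simulation; a minimal Monte Carlo sketch in Python, with the parameters of the example:

import random

random.seed(42)
TRIALS, FLIPPERS, YEARS = 2_000, 500, 13

# Average number of heads flipped by the best of 500 fair-coin flippers.
best = [
    max(sum(random.random() < 0.5 for _ in range(YEARS)) for _ in range(FLIPPERS))
    for _ in range(TRIALS)
]
print(sum(best) / TRIALS)  # close to the 11.63 heads quoted in the text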

How many fund managers possess true stock-picking skill, and how can we find them in the cross-sectional alpha distribution? Scaillet et al. (2013) develop a simple technique that controls for false discoveries, that is, mutual funds that exhibit significant alphas by luck alone. They estimate the proportions of unskilled, zero-alpha, and skilled funds in the population. A fund is unskilled if the return from stock picking is smaller than the costs (alpha is negative net of trading costs and expenses), a zero-alpha fund if the difference is zero, and a skilled fund otherwise (alpha is strictly positive).
The statistical test is based on the false discovery rate (FDR); see also Chapter 3. We consider the distribution function for each of the three groups: unskilled, zero-alpha, and skilled funds. Expressed as functions of the t-statistics, we have three density functions, with the zero-alpha group's density in the middle; see Figure 4.3. Two pairs of density functions then overlap - unskilled with zero-alpha, and zero-alpha with skilled. Pick the latter region of overlap. If a fund shows a t-value high enough to lie in this overlap, and the fund belongs to the group of zero-alpha funds, then this high t-value is driven by luck. Therefore, in the cross-sectional distribution of all funds, some funds with high t-values are genuinely skilled and others are merely lucky.
Of course, it is not possible to observe the true alpha of each fund. The inference for the three skill groups is carried out as follows. First, for each fund, the alpha and its standard deviation are estimated. The ratio of the two estimates defines the t-statistic. Then, choosing a significance level, the t-estimate lies within or outside the threshold implied by the significance level. Estimates outside are labeled significant.


Figure 4.3: Intuition about luck and skill for the three groups of mutual funds: unskilled, zero-alpha, and skilled (Scaillet et al. [2013]).

The FDR measures the proportion of lucky funds among the funds with significant estimated alphas. We recall that the FDR is very easy to compute from the estimated p-values of the fund alphas; a sketch of such a computation follows below. The data set consists of monthly returns of 2,076 actively managed US open-end, domestic equity mutual funds that existed at any time between 1975 and 2006 (inclusive).
Of the funds, 75.4 percent are zero-alpha, 24.0 percent are unskilled, and 0.6 percent
are skilled. Unskilled funds underperform for long time periods. Aggressive growth funds
have the highest proportion of skilled managers, while none of the growth and income
funds exhibit skills.
During the period 1990-2006, the proportion of skilled funds decreased from 14.4 to 0.6 percent, while the proportion of unskilled funds increased from 9.2 percent to 24.0 percent; see Figure 4.4. Although the number of actively managed funds increased over this period, skilled managers have become exceptionally rare. The figure illustrates the demanding task facing active asset management, since an investor could state that skill in active management is monotonically decreasing over time and that, after costs, an


average alpha of -1% follows in 2016. Hence, this chart seems to be a good motivation for passive investments. Such a view fails in several respects, or falls short of explaining the evolution of the different characteristics shown in the figure. First, the education level of the average asset manager is clearly higher than it was 20 years ago. But then, why is the alpha decreasing? After the peak in 1993, when the alpha started to decline, the internet was launched. The cost of information started to decrease over time, and markets therefore became more and more efficient. The increase in 1999-2000 is due to the events of that time, which produced a lot of uncertainty and investment opportunities. A second reason is the simple logic of the Bill Sharpe arithmetic: active investment is, on average, a zero-sum game before costs. Therefore, even if the education level increases, the winners' and losers' returns must still be distributed around the passive or market return. But skills not only increased on average over the last decades; it is plausible that they increased rather uniformly, due to the similarity of the many education programs in portfolio and asset management. Therefore, luck becomes more and more important in determining whether a manager's performance is above or below the average. But luck is not persistent, which then leads to an overall decreasing alpha in the industry.
To address the possibility that funds may lose their outperformance skills due to their
increasing size, the authors run further tests over five-year subintervals. They treat each
five-year fund record as a separate fund and find that the proportion of skilled funds
equals 2.4 percent, implying that a small number of managers have hot hands over short
time periods.
Other explanations of the paradox - increasing skills and decreasing costs, yet a
proportion of skilled funds of only 0.6 percent, which is statistically indistinguishable
from zero - are the movement of skilled managers to hedge funds and the possibility
that markets became more efficient over the period. First, hedge funds use
performance-based fees, which ensure that skilled managers are handsomely compensated;
by contrast, very few mutual funds use performance-based fees. This is a strong monetary
incentive for skilled mutual fund managers to move to the hedge fund industry. But
then, a similar FDR analysis applied to hedge funds should deliver the corresponding
results; see below for this analysis. Such an analysis of hedge funds or managed accounts
could also answer the question of whether markets are becoming more efficient: if a
similar decay is measured as for mutual funds, this would support the hypothesis that
the market has become more efficient.
Skilled funds are concentrated in the extreme right tail of the estimated alpha
distribution, which suggests a way to detect them. If, in a given year, the tests indicate
higher proportions of lucky zero-alpha funds in the right tail, the goal is to eliminate
these false discoveries by moving further into the extreme tail. Carrying out this control
each year, the authors find a significant annual four-factor alpha of 1.45 percent. They
also find that all outperforming funds waste, through operational inefficiencies, the entire
surplus created by their portfolio managers.
Figure 4.4: Proportion of unskilled and skilled funds (Panel A) and total number of
mutual funds in the US versus average alpha (Scaillet et al. [2013]).

The authors re-examine the relation between fund performance and turnover, expense
ratio, and size. For each characteristic, the proportion of zero-alpha funds is around 75%.
The proportion of unskilled funds is markedly larger for funds with high turnover - many
unskilled funds trade on noise to pretend that they are skilled. The size of the fund has
a bipolar effect: for large funds, both the proportion of unskilled and the proportion of
skilled funds are larger than for smaller funds.
What about European funds? Scaillet (2015) considers 939 open-end funds between
2001 and 2006. The main findings are, first, that the proportion of zero-alpha funds is
72.2 percent, the proportion of skilled funds 1.8 percent, and the proportion of unskilled
funds 26 percent. Second, skilled funds show low betas with respect to the MSCI Europe;
some skilled funds are known to hold bonds and to depart from their pure equity mandates.
Figure 4.5, finally, presents the hall of fame of successful investors who have
outperformed the S&P 500 for more than 10 years. The only persistent, quantitatively
managed investment, from Renaissance, is based on strict secrecy about the methods used
and on hiring top scientists from the natural and computer sciences who apply
non-traditional economic algorithms. It is interesting to note that, given these
characteristics, only one manager of the alternative investment group is listed in the hall
of fame.


Furthermore, it is notable that macro investors dominate among the fundamental
investors who cannot be grouped into the Buffett/Graham school. Finally, the appearance
of Lord Keynes shows that it was possible to outperform the US markets in days when
technology was in its infancy by relying instead on a deep understanding of the macro
economy.

Figure 4.5: Hall of fame of investors (gurufocus, Hens and FuW [2014]).

4.3.5 Fees for Mutual Funds

4.3.5.1 Definitions

The SEC (2008) document lists and defines the following components of mutual fund
fees: (i) fees paid by the fund out of fund assets to cover the costs of marketing and
selling fund shares, and sometimes the costs of providing shareholder services; (ii)
distribution fees, including fees that compensate brokers and others who sell fund shares
and that pay for advertising, the printing and mailing of prospectuses to new investors,
and the printing and mailing of sales literature; (iii) shareholder service fees, paid to
persons who respond to investor inquiries and provide investors with information about
their investments.


The expense ratio is the fund's total annual operating expenses, including management
fees, distribution (12b-1) fees, and other expenses. All fees are expressed as a percentage
of average net assets. Other fees relate to the selling and purchasing of funds: a back-end
sales load is a sales charge investors pay when they redeem mutual fund shares; a
front-end sales load is the analogous charge when fund shares are bought and is generally
used by the fund to compensate brokers. Purchase and redemption fees are not the same
as the back- and front-end sales loads; they are both paid to the fund. The SEC generally
limits redemption fees to 2 percent.

4.3.5.2 Share Classes

Different stock classes are used to express different voting rights; different mutual fund
classes are used for different customers and different fees. The most prominent classes in
the US are the A-, B- and C-classes. Class-A shares, for example, charge a front-end load
and have low 12b-1 fees; they are therefore beneficial for long-run investors.

In Europe, the type of share class can define the client segmentation, specify the
investment amount, and specify the investment strategy. For example:

AA-class: admissible for all investors; distribution of earnings.
AT-class: admissible for all investors; earnings are reinvested (thesaurieren, i.e. accumulating).
CA-class: admissible for qualified investors only; distribution of earnings.
D-class: same as CA but with reinvested earnings.
N-class: only for clients who have a mandate contract or an investment contract with the bank.
4.3.5.3 Net Asset Value (NAV)

We consider transaction costs and the fees included in the fund's expense ratio; see
Section 4.3.5 for the latter. An important figure is the total expense ratio (TER), a
percentage defined as the ratio between total business expenses and the average net fund
value. The TER expresses the total of costs and fees that are charged on an ongoing basis.
Business expenses are fees for the fund's board of directors, the asset manager, the
custodian bank, administration, distribution, marketing, the calculation agent, audit, and
legal and tax authorities.
The following approach is widely used for performance calculations. Consider a period
starting at 0 with length T. The performance P is defined by

P\% = \left( \frac{NAV_T \, f_1 \cdots f_T}{NAV_0} - 1 \right) \cdot 100 \qquad (4.12)


with f_k the adjustment factor for the payout, such as dividends, in period k:

f = \frac{NAV_{ex} + BA}{NAV_{ex}},

with BA the gross payout - that is to say, the gross amount of the earnings and
capital-gain payout per unit share to the investors - and NAV_{ex} the NAV after the
payout.
Example
Consider a NAV at year-end 2005 of CHF 500 million, 2006 earnings of CHF 10
million, and a capital-gain payout of CHF 14 million. The NAV after payments is CHF
490 million and the NAV at the end of 2006 is CHF 515 million. The adjustment factor is

f = \frac{490 + 10 + 14}{490} = 1.04898.

This gives the performance for 2006:

P = \left( \frac{515 \cdot 1.04898}{500} - 1 \right) \cdot 100 = 8.045\%.
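A few lines of Python reproduce the example's arithmetic (the variable names are ours):

```python
nav0, nav_ex, nav_end = 500.0, 490.0, 515.0   # CHF millions
payout = 10.0 + 14.0                          # earnings + capital-gain payout
f = (nav_ex + payout) / nav_ex                # adjustment factor
perf = (nav_end * f / nav0 - 1.0) * 100.0     # performance, as in (4.12)
print(f"f = {f:.5f}, P = {perf:.3f}%")        # f = 1.04898, P = 8.045%
```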

There are several reasons why it is important to measure the performance of a fund
correctly: first, one wants to select the best fund; second, one wants to check whether the
funds do what they promise; and finally, a correctly measured performance allows one to
check whether the fund manager added value.
The performance formula (4.12) can be rewritten in the effective return form

(1 + P) \, NAV_0 = NAV_T \, f_1 \cdots f_T = NAV_T \prod_{k=1}^{T} \left( 1 + \frac{BA_k}{NAV_{ex,k}} \right). \qquad (4.13)

If the gross payouts are zero in all periods, then the performance reads

(1 + P) \, NAV_0 = NAV_T

with P the simple effective return. Conversely, assume that in each period a constant
fraction g = BA/NAV_{ex} is paid out. Then

(1 + P) \, NAV_0 = NAV_T (1 + g)^T.

Since (1 + g)^T is larger than one, the interpretation is as follows: with the same effective
return P, the fund achieves a larger final effective value NAV_T (1 + g)^T than a fund
without any payouts and the same P.


Example
The return calculation for funds can be misleading. Consider the following reported
annual returns: 5%, 10%, -10%, 25%, 5%. The arithmetic mean is 7%; the geometric
mean is 6.41%. How much would an investor earn after 5 years when starting with
USD 100?

100 \cdot 1.05 \cdot 1.1 \cdot 0.9 \cdot 1.25 \cdot 1.05 = USD 136.4.

If the fund reports the arithmetic mean, the investor would expect

100 \cdot 1.07^5 = USD 140.3.

Using the geometric mean of 6.41%, the true value of USD 136.4 follows. Although it
is tempting to report the higher arithmetic mean, such a report would be misleading.
Some jurisdictions require funds to report returns in the correct, geometric way.
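This arithmetic can be checked in a few lines of Python (math.prod requires Python 3.8+):

```python
import math

returns = [0.05, 0.10, -0.10, 0.25, 0.05]
growth = math.prod(1 + r for r in returns)
print(round(100 * growth, 1))                   # 136.4 -> true end wealth
print(round(100 * (growth ** (1 / 5) - 1), 2))  # 6.41  -> geometric mean
print(round(100 * sum(returns) / 5, 2))         # 7.0   -> arithmetic mean
print(round(100 * 1.07 ** 5, 1))                # 140.3 -> misleading forecast
```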

4.4 Index Funds and ETFs

The work of Fama on market efficiency was one reason for the rise, in the 1970s, of
low-cost, passively managed investing through index funds. Another theoretical milestone
in the development of passive management was Jensen's (1968) work on the performance
of 115 equity mutual funds:

The evidence on mutual fund performance indicates not only that these 115 mutual
funds were on average not able to predict security prices well enough to outperform a
buy-the-market-and-hold policy, but also that there is very little evidence that any
individual fund was able to do significantly better than that which we expected from mere
random chance.

A growth analysis of the top ten global asset managers over the past five years
confirms this trend. Vanguard, with its emphasis on passive products, is the strongest-
growing AM, followed by BlackRock, whose passive products form the iShares family.

Both index funds and ETFs aim at replicating the performance of their benchmark
indices as closely as possible. Issuers and exchanges set forth the diversification
opportunities they provide - like mutual funds - to all types of investors at a lower cost
than mutual funds, but also highlight their tax efficiency, transparency, and low
management fees. Although actively managed ETFs were first launched around twenty
years ago, their importance remains negligible. One major reason is that actively managed
ETFs lose their cost advantage compared to mutual funds. As of June 2012, about 1,200
index ETFs existed in the US, including only about 50 that were actively managed.


Example: Core-satellite
Core-satellite approaches are common in many investment processes. They comprise
a core of long-term investments with a periphery of more specialist or shorter-term
investments. The core follows a passive investment style, where index funds or ETFs are
used to implement the passive strategy at low cost (see the following sections for index
funds and ETFs). Satellites, conversely, are often actively managed, and the hope is that
they are only weakly correlated with the core.

4.4.1 Capital Weighted Index Funds

Index funds are used to gain access to (globally) diversified equity market performance.
Traditionally, these indices are constructed using capitalization weights (CW). In recent
years, new types of weights have been considered; these alternative methods are often
called smart beta approaches. The rationale for CW is the CAPM: all investors hold
the CW market portfolio. The second theoretical input is the efficient market hypothesis
(EMH). These two theoretical streams were the foundation for cost-effective, passive
investment in CW instruments: McQuown developed the first index fund - at Wells Fargo
- in 1970.
One must distinguish between the theoretical index and a strategy that replicates
the theoretical index using securities. The theoretical index is not an investable asset or
security. If we write \omega_{i,t} for the weight of asset i in the index at time t and
R_{i,t} for the gross return of the asset over the period t-1 to t, the index value I_t
satisfies the dynamics

I_t = I_{t-1} \left( \sum_{k=1}^{N} \omega_{k,t} R_{k,t} \right). \qquad (4.14)

The value of the index tomorrow is equal to its present value times the gross return of
each stock until tomorrow, weighted by the asset weights. The index fund F_t aims to
replicate (4.14) by investing in the stocks: at each date t the fund holds a number n_{k,t}
of shares of stock k, and F_t equals the sum over all stocks of the holdings times their
prices P_{k,t}. Obviously, one can relate the relative weights and the absolute holdings in
a one-to-one fashion. The difference between the values F_t and I_t is the tracking error.
If the replication of the theoretical value by the index fund is perfect, the tracking error is
zero. But there are situations where full replication is either too expensive or not feasible.
The accuracy of the replication is often measured by the volatility of the tracking error.

Example


The tracking error (TE) can be calculated directly or indirectly. Consider the following
returns for a portfolio and its benchmark (market portfolio).

Period [month]                 Portfolio   Market    Return difference
1                              0.37%       0.53%     -0.16%
2                              -1.15%      -1.36%    0.21%
3                              -1.81%      -1.43%    -0.38%
4                              -0.04%      -0.34%    0.30%
5                              -1.22%      -1.59%    0.37%
6                              0.08%       -0.30%    0.37%
7                              1.18%       1.12%     0.07%
8                              -0.52%      -0.39%    -0.13%
9                              1.83%       1.94%     -0.11%
10                             -0.70%      -0.36%    -0.33%
11                             -0.66%      -0.60%    -0.06%
12                             -1.60%      -1.85%    0.25%
Volatility (monthly)           1.10%       1.14%     0.27%
Volatility (1y, x sqrt(12))    3.80%       3.93%     0.92%

Table 4.7: Direct tracking error calculation. The TE is 0.92% p.a.

The indirect method uses the following replication of the tracking error: the TE equals
the volatility of a position that buys the portfolio and sells the benchmark. We can
therefore use the general variance formula for two random variables, choosing the weights
\omega_1 = +1 and \omega_2 = -1, so that the variance becomes

\sigma^2 = \sigma_1^2 + \sigma_2^2 - 2\rho\sigma_1\sigma_2.

The TE is then equal to \sigma. The covariance of the two monthly time series is 0.012
percent; dividing by the product of the two volatilities, the correlation \rho \approx 0.97
follows. This gives the TE per month, and scaling with the square-root law yields the
annualized TE of 0.92%, the same as that calculated with the direct method.
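Both methods can be verified in a few lines of Python from the monthly returns of Table 4.7; the printed values match the table up to rounding.

```python
import numpy as np

port = np.array([0.37, -1.15, -1.81, -0.04, -1.22, 0.08,
                 1.18, -0.52, 1.83, -0.70, -0.66, -1.60])  # % per month
mkt  = np.array([0.53, -1.36, -1.43, -0.34, -1.59, -0.30,
                 1.12, -0.39, 1.94, -0.36, -0.60, -1.85])

# direct method: volatility of the return differences, annualized with sqrt(12)
te_direct = np.std(port - mkt, ddof=1) * np.sqrt(12)

# indirect method: sigma^2 = s1^2 + s2^2 - 2*rho*s1*s2 (weights +1 and -1)
s1, s2 = np.std(port, ddof=1), np.std(mkt, ddof=1)
rho = np.corrcoef(port, mkt)[0, 1]
te_indirect = np.sqrt(s1**2 + s2**2 - 2 * rho * s1 * s2) * np.sqrt(12)

print(round(te_direct, 2), round(te_indirect, 2), round(rho, 2))
# both TEs ~0.92-0.93% p.a. (Table 4.7, up to rounding), rho ~ 0.97
```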

Example
This example follows ZKB (2013). Examples of capital-weighted indices include
the S&P 500, FTSE, MSCI, and SMI indices. Other indices use equal weighting (EW):
the Dow Jones 30 and the Nikkei 225 are both equally weighted (price-weighted) indices.
Other types include share weighting and attribute weighting. In attribute weighting, the
weights are chosen according to the ranking scores in the selection process. If our ranking
is based on ethical and environmental criteria, and asset Y has a score of 75 and asset X
a score of 25, then the weight ratio between assets Y and X will be 3.
The divisor is a crucial part of the index calculation. At initiation it is used for
normalizing the index value: the initial SMI divisor in June 1998 was chosen as a value
that normalized the index to 1,500. However, the main role of the divisor is to remove the
unwanted effects of corporate actions and index member changes on the index value. It
ensures continuity in the index value in the sense that changes in the index should stem
from investor sentiment only and not originate from technical changes. The impact of a
corporate action depends on the weighting scheme used for the index. Consider a stock
split for an index with:

Market capitalization weighting - the price of the stock is reduced and the number of
free-floating shares increases. These two effects offset each other and no change has to be
made to the divisor.

Equal weighting (price weighting) - the stock price reduction has an effect, but the
number of free-floating shares has no impact on such a weighting. Therefore, the divisor
has to be lowered in order to avoid a discontinuity in the index value (see the sketch
after this list).

How dividends are handled in the index calculation determines the return type of the
index. There are three versions of how dividends can be incorporated into the index value
calculation:

Price return index - no consideration is given to the dividend amounts paid out by the
assets. The day-to-day change in the index value reflects the change in the asset prices.

Total return index - the full amount of the dividend payments is reflected in the index
value. This is done by adding the dividend amount on the ex-dividend date to the asset
price. Thus, the index value acts as if all dividend payments were reinvested in the index.

Total return index after tax - the dividend amount used in the index calculation is the
after-tax amount, that is to say, the net cash amount.
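The divisor adjustment for a price-weighted index can be sketched as follows; the prices and the 2:1 split are hypothetical numbers chosen for illustration.

```python
# Price-weighted index: I = sum(P_i) / D. After a 2:1 split of stock 0,
# the divisor is lowered so that the index value is unchanged.
prices = [100.0, 50.0, 30.0]
divisor = 3.0
index_before = sum(prices) / divisor      # 60.0

prices[0] /= 2                            # 2:1 split halves the price
divisor = sum(prices) / index_before      # new divisor keeps continuity
print(round(divisor, 4), sum(prices) / divisor)   # 2.1667 60.0
```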

The relative weights are, for a CW index, defined by

\omega_{k,t} = \frac{M_{k,t} P_{k,t}}{\sum_{j=1}^{N} M_{j,t} P_{j,t}} \qquad (4.15)

with M the number of outstanding shares. The numerator is the market capitalization
of stock k and the denominator is the market capitalization of the index. The weights

can change as follows, where we write MC for the index market capitalization:

\Delta\omega_{k,t} = \frac{P_{k,t} \, \Delta M_{k,t}}{MC} + \frac{M_{k,t} \, \Delta P_{k,t}}{MC} - \frac{M_{k,t} P_{k,t} \, \Delta MC}{(MC)^2}. \qquad (4.16)
The three possible changes of the weights reflect changes in the outstanding shares,
price changes, or changes in the index market capitalization. The second change is the
most important; the other two are more constant in nature. If the market shares are
constant over time, the same holds true for the number of shares n_{k,t} needed to
construct the fund. This is one of the main reasons why CW is often used: the constancy
of the holdings implies low trading costs. This reason and the simplicity of the CW
approach have made it the favorite index construction method.
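A minimal sketch of (4.14) and (4.15) with hypothetical prices, share counts, and returns:

```python
import numpy as np

prices = np.array([120.0, 80.0, 40.0])   # P_k at time t-1
shares = np.array([ 30.0, 50.0, 25.0])   # outstanding shares M_k (millions)
mcap = prices * shares
weights = mcap / mcap.sum()              # CW weights, eq. (4.15)

gross_ret = np.array([1.02, 0.99, 1.05]) # gross returns R_k from t-1 to t
index_next = 100.0 * weights @ gross_ret # eq. (4.14) with I_{t-1} = 100
print(weights.round(4), round(index_next, 2))   # [0.4186 0.4651 0.1163] 100.95
```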

4.4.2 Risk Weighted Index Funds

There are reasons to search for alternatives to the CW approach: the rejection of the
CAPM, and a critique resulting from the trend-following nature of a CW strategy.
Suppose that one single stock in the CW index formula (4.15) outperforms all others at
a very high rate. Then the weights will concentrate over time in this single stock.
Diversification is lost, and the index construction turns into a concentration of
idiosyncratic risk, with the correspondingly large drawdown risk of such a construction.

Alternative weighting schemes - smart beta approaches - weight the indices not by
their capital weights but either by other weights that should better measure the economic
size of companies (fundamental indexation), or by risk-based indexation. At first glance,
alternative weighting schemes should perform better than the CW scheme. But most
often, investors will use a mixture of CW and alternative schemes. A first requirement
for such a mix is that the two approaches show a low correlation. Fundamental indexation
serves the purpose of generating alpha to dominate the CW approach, while risk-based
constructions focus on diversification.

One example of risk-based indexation is the equally weighted (EW) approach. This
is a natural choice if predictions of risk are not possible at all or are flawed by large
uncertainty. The minimum variance (MV) portfolio is a second type of risk-based
indexation. Other approaches, which follow from risk parity modelling, include the most
diversified portfolio (MDP) and the equal risk contribution (ERC) portfolio; a small
sketch of such weighting schemes follows below. Roncalli (2014) compares the different
methods for the Euro Stoxx 50 index using data from December 31, 1992, to September
28, 2012. He computes the empirical covariance matrix using daily returns and a one-year
rolling window; rebalancing takes place on the first trading date of each month, and all
risk-based indices are computed daily as price indices.
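The following is a small sketch of such weighting schemes under a hypothetical covariance matrix. The unconstrained minimum variance solution and the inverse-volatility proxy for ERC (exact only when all pairwise correlations are equal) are illustrative simplifications, not the constructions used by Roncalli (2014).

```python
import numpy as np

# hypothetical volatilities and correlations of three index constituents
vols = np.array([0.20, 0.30, 0.40])
corr = np.array([[1.0, 0.2, 0.1],
                 [0.2, 1.0, 0.3],
                 [0.1, 0.3, 1.0]])
cov = np.outer(vols, vols) * corr

w_ew = np.ones(3) / 3                    # equal weighting (EW)

w_mv = np.linalg.solve(cov, np.ones(3))  # unconstrained minimum variance (MV);
w_mv /= w_mv.sum()                       # long-only versions need constraints

w_iv = (1 / vols) / (1 / vols).sum()     # inverse volatility: equals ERC only
                                         # when all pairwise correlations match
print(w_ew.round(3), w_mv.round(3), w_iv.round(3))
```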

4.4.3 ETFs

Exchange-traded funds (ETFs) are a mixture of open- and closed-end funds; the main
source for this section is Deville (2007). They are hybrid instruments which combine the
advantages of both fund types.

                       CW       EW       MV       MDP      ERC
Expected return p.a.   4.47     6.92     7.36     10.15    8.13
Volatility             22.86    23.05    17.57    20.12    21.13
Sharpe ratio           0.05     0.16     0.23     0.34     0.23
Information ratio      -        0.56     0.19     0.42     0.62
Max. drawdown          -66.88   -61.67   -56.04   -50.21   -56.85

Table 4.8: Statistics (in percent, except the ratios) for the different index constructions
of the Euro Stoxx 50. CW is capital weighting, EW equal weighting, MV minimum
variance, MDP most diversified portfolio, and ERC equal risk contribution (Roncalli
[2014]).

Mutual funds must buy back their units for cash, with the disadvantage that investors
can trade only once a day at the NAV computed after the close. Furthermore, the trustee
needs to keep a fraction of the portfolio invested in cash to meet possible redemption
outflows. Closed-end funds avoid this cash problem, but since it is not possible to create
or redeem fund shares, there is no way to react to changes in the demand for the shares.
Therefore, if there are strong shifts in demand, price reactions follow, such as significant
premiums or discounts with respect to the NAV. ETFs trade on the stock market on a
continuous basis, while shares can be created or redeemed directly from the fund. The
efficiency of the ETF trading system relies on the in-kind creation and redemption
process.

The in-kind process idea is due to Nathan Most. ETFs are organized like commodity
warehouse receipts: the physicals are delivered and stored while only the receipts are
traded, although holders of the receipt can take delivery. This in-kind - securities are
traded for securities - creation and redemption principle has been extended from
commodities to stock baskets; see Figure 4.6.
Figure 4.6 illustrates the dual structure of the ETF trading process, with a primary
market open to institutional investors (authorized participants, APs) for the creation and
redemption of ETF shares directly from the fund; the ETF shares are then traded on a
secondary market. The performance earned by an investor who creates new shares and
redeems them later equals the index return less fees, even if the composition of the index
has changed in the meantime. Only authorized participants can create new shares of
specified minimal amounts (creation units). They deposit the respective stock basket plus
an amount of cash with the fund and receive the corresponding number of shares in
return. ETF shares are not individually redeemable: investors who want to redeem are
offered the portfolio of stocks that make up the underlying index plus a cash amount in
return for creation units.
Since ETFs are negotiated on two markets - the primary and the secondary market -
they have two prices: the NAV of the shares in the primary market and their market
price in the secondary market. These two prices may deviate from each other if there is
pressure to sell or buy. The in-kind creation and redemption helps market makers to
absorb such liquidity shocks on the secondary market, either by redeeming outstanding
shares or by creating new ones.


Figure 4.6: Primary and secondary ETF market structure, where the in-kind process for
the creation and redemption of ETF shares is shown. Market makers and institutional
investors can deposit the stock basket underlying an index with the fund trustee and
receive fund shares in return. These created shares can be traded on an exchange like
simple stocks or later redeemed for the stock basket then making up the underlying
index. Market makers purchase the basket of securities that replicates the ETF index
and deliver it to the ETF sponsor. In exchange, each market maker receives ETF
creation units (50,000 shares or multiples thereof). The transactions between the market
makers and the ETF sponsor take place in the primary market. Investors who buy and
sell the ETF then trade in the secondary market through brokers on exchanges. (Adapted
from Deville [2007] and Ramaswamy [2011]).

The in-kind mechanism also ensures that departures between the two prices are not too
large, since authorized participants in the primary market can arbitrage any sizable
difference between the ETF and the underlying index component stocks. If the secondary
market price is below the NAV, APs can buy the cheap ETF shares in the secondary
market, take a short position in the underlying index stocks, and then ask the fund
manager to redeem the ETF shares for the stock basket before closing the short position
at a profit. Furthermore, since ETF fund managers do not need to sell any stocks on the
exchange to meet redemptions, they can fully invest their portfolio, and creations do not
cause any additional costly trading within the fund. Finally, in the US, in-kind operations
are a nontaxable event.
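A stylized numerical version of this arbitrage, with hypothetical prices and ignoring trading costs, borrow fees, and the cash component:

```python
# toy in-kind arbitrage when the ETF trades below NAV (hypothetical numbers)
nav, etf_price = 100.0, 99.0     # per share; 1% secondary-market discount
unit = 50_000                    # shares per creation unit

cost_of_etf = etf_price * unit       # AP buys cheap ETF shares on exchange
short_proceeds = nav * unit          # ... and shorts the index basket at NAV
# redeeming the ETF in kind delivers the basket, which closes the short
print(short_proceeds - cost_of_etf)  # 50000.0 profit per creation unit
```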
Most ETFs track an index and are passively managed. ETFs generally provide diversification,


low expense ratios, and the tax efficiency of index funds, while still maintaining all the
features of ordinary stock, such as limit orders, short selling, and options. ETFs can be
used as a long-term investment for asset allocation purposes and also to implement
market-timing investment strategies. All of these features rely on the specific in-kind
creation and redemption principle described above. Leveraged ETFs and inverse leveraged
ETFs use derivatives to seek a return that corresponds to a multiple of the daily
performance of the index (see below).

The costs of an ETF have two components: transaction costs and the total expense
ratio (TER). Transaction costs are divided into explicit and implicit costs. Explicit
transaction costs include fees, charges, and taxes for the settlement by the bank and the
exchange. Implicit costs are bid-ask spreads and costs incurred due to adverse market
movements.
Some facts about ETFs:

Originators. ETFs are constructed by index providers, exchanges, or index fund
managers.

Pricing. The market price of an ETF may be at a discount or premium to its NAV.
The difference is limited due to the in-kind process.

Clients. Clients are mutual funds, hedge funds, institutions, or private banking clients.

ETF construction techniques. ETFs can be constructed by direct replication or by a
swap-backed construction. In direct replication, one buys all index components (full
physical replication) or an optimized sample. This is a transparent approach with low
counterparty risk (which occurs due to securities lending). Physical replication can be
expensive for tracking broad emerging market equity or fixed-income indices. Commodity
ETFs and leveraged ETFs do not necessarily employ full replication because the physical
assets are either difficult to store or to leverage. Referring only to a subset of the
underlying index securities for physical replication leads to a significant tracking error in
returns between the ETF and the index. In a swap-backed construction, the performance
of a basket is exchanged between the ETF and the swap counterparty.

Trends in ETF investment arise from regulation and investor demand. From a
regulatory perspective, there have been barriers for active managers due to the Retail
Distribution Review (RDR) in the UK and MiFID II in the euro zone. Growth in passive
strategies will also be driven by cost transparency and the search for cheap investments.
New uses for ETFs will also emerge: institutions will use them to get access to specific
asset class or geographic exposures, and retail investors will invest in ETFs as a
lower-cost alternative to mutual funds and UCITS funds. Finally, recent trends are to
construct ETFs not on a CW basis but on a risk-weighted one using risk parity methods,
and to focus on risk factors instead of asset classes as underlying instruments.

4.4.3.1 Unfunded Swap-Based Approach

In the swap-based approach, one invests indirectly in a basket by achieving the index
performance via a total return swap (TRS); see Figure 4.7. The ETF sponsor pays cash
to the swap counterparty and indicates which index matters for the ETF. The swap
counterparty is often the parent bank of the ETF sponsor, more specifically the bank's
investment banking unit. The TRS swaps the index return against a basket return -
that is to say, the ETF sponsor receives the desired index return needed for the ETF
and delivers a basket return to the swap counterparty. The basket should be close to
the index; the closer it is, the lower the tracking error borne by the swap counterparty.
The swap counterparty delivers a basket of securities to the ETF sponsor as collateral
for the cash paid.

This approach minimizes the tracking error for the ETF investor and enables more
underlyings to be accessed. The basket of securities used as collateral is typically not
related to the basket delivered to the swap counterparty, which mimics the index. Why
should an investment bank, as swap counterparty, enter into such a contract? To answer
this we consider a stylized example.
Example
Assume that three securities - S_1, S_2, and S_3 - make up an index I. The weights of
S_1 and S_2 are each 48 percent, and S_3 contributes only 4 percent to the index. The
ETF sponsor delivers the basket consisting of assets S_1 and S_2 only to the swap
counterparty; the missing S_3 asset is the source of the tracking error. The swap
counterparty (say, an investment bank (IB)) delivers to the ETF sponsor seven securities,
C_1, ..., C_7, as collateral. These assets are in the inventory of the IB due either to its
market-making activities or to the issuance of derivatives: the IB has to keep the
securities because of business that is not related to ETFs. When these securities C_i are
less liquid, they have to be funded either in unsecured markets or in repo markets with
deep haircuts. The IB has, for example, to pay 120 percent for a security C_i that is
worth only 100 percent at a given date. By transferring these securities to the ETF
sponsor, the IB may benefit from reduced warehousing costs for these assets. Part of
these cost savings may then be passed on to the ETF investors through a lower total
expense ratio for the fund holdings. The cost savings accruing to the investment banking
activities can be directly linked to the quality of the collateral assets transferred to the
ETF sponsor. A second possible benefit for the IB is lower regulatory and internal
economic capital requirements: the regulatory charge for the less liquid securities C_i is
larger than for the more liquid securities S_1 and S_2 in the basket delivered by the ETF
sponsor. Summarizing, a synthetic swap has a positive impact on the IB's security
inventory costs due to non-ETF business and on its regulatory capital and internal
economic risk capital charges.
The drawbacks of synthetic swaps are counterparty risk and documentation
requirements (International Swaps and Derivatives Association [ISDA]), although
synthetic ETFs are fully collateralized by their counterparties.

Figure 4.7: Unfunded swap ETF structure (Ramaswamy [2011]).

4.4.3.2 ETFs for Different Asset Classes

The first and most popular ETFs track broad stock indices, sector indices, or specific
niche areas like green power. The evolution of ETFs by region between 2010 and 2013
(World Federation of Exchanges [2014]) shows the dominance of the Americas, with
around 90% of traded ETF volumes, followed by Asia and Europe with around 5% and
6%, respectively. The size in Europe declined over the period, whereas the size in Asia
doubled. Worldwide ETF assets were USD 9,670 bn in 2010 and USD 11,893 bn in 2013.

Bond ETFs typically face huge demand when stock markets are weak, for example
during recessions; an asset rotation from stocks to bonds is often observed in such cases.
Figure 4.8 shows bond inflows of USD 800 billion and redemptions in long-only equities
(LO equities) after the GFC. In recent years, an opposite rotation began due to
close-to-zero interest rates.

Commodity ETFs invest in oil, precious metals, agricultural products, etc. The idea
of a gold ETF was conceptualized in India in 2002; at the end of 2012, the SPDR Gold
Shares ETF was the second-largest ETF. Rydex Investments launched the first currency
ETF in 2005. These funds are total return products where the investor gets access to the
FX spot change, local institutional interest rates, and a collateral yield.


Figure 4.8: Bond inflows and equity redemptions (BoA Merrill Lynch Global Investment
Strategy, EPFR Global [2013]).

Actively managed ETFs have been offered in the United States since 2008. Initially,
they grew faster than index ETFs did in their first three years. But the growth was not
sustainable: the number of actively managed ETFs has not grown for several years.
Many academic studies question the value of active ETF management altogether, since it
faces the same skill-and-luck issue as mutual funds.

4.4.3.3 Leveraged ETFs (LETFs)

Leveraged exchange-traded funds (LETFs) require financial engineering techniques in
their construction and life-cycle management to achieve the desired return. Trading
futures contracts is a common way to construct leveraged ETFs. Rebalancing and
re-indexing of LETFs can be costly in turbulent markets. LETFs deliver multiples of
a benchmark's return on a daily basis; this can mean profits or losses for an investor.
Several empirical studies show that LETFs deviate significantly from their underlying
benchmark. This tracking error has two causes - a compounding effect and a rebalancing
effect. Other factors such as fees or taxes are negligible.

The compounding effect follows from the LETF mechanically keeping a fixed exposure
to the underlying index; this mechanism results in a computable compounding deviation.


Example
To understand these results, consider an LETF with positive leverage factor 2 (bullish
leverage); we follow Dobi and Avellaneda (2012). There are three time periods, 0, 1, and
2, in the example (see Table 4.9). The index value starts at 100, loses 10 percent, and
then gains 10 percent.

Time grid               t0      t1      t1+     t2      t2+
Index value             100     90      -       99      -
AuM                     1,000   800     -       960     -
TRS exposure needed     2,000   1,600   -       1,920   -
Notional TRS            2,000   1,800   1,600   1,760   1,920
Exposure adjustment     0       -       -200    -       +160

Table 4.9: Data for the leveraged ETF example. tk denotes the time tk before the
adjustment of the TRS and tk+ the time just after the adjustment.

The initial AuM is USD 1,000 at day 0, and the AuM is USD 800 at day 1. The 10
percent drop on day 1 implies

USD 800 = 1,000 \cdot (1 - 2 \cdot 0.1).

This implies a required exposure of 2 \cdot 800 = USD 1,600. The notional value of the
TRS from day 0 has become, at day 1,

USD 2,000 \cdot (1 - 0.1) = 1,800.

This is the exposure before adjustment. Since the exposure needed at day 1 is USD 1,600,
the swap counterparty must sell (short the synthetic stock) USD 200 = 1,800 - 1,600
of TRS. Doing the same calculation for day 2, the AuM is USD 960 and the exposure
needed is USD 1,920. Similarly, on day 2 the swap counterparty must buy a TRS amount
of USD 160 = 1,920 - 1,760, where USD 1,760 = 1,600 \cdot (1 + 0.1) is the exposure
before adjustment.

Example
We consider the compounding problem for an LETF. Fix an index and a two-times
LETF, both beginning at 100. Assume that the index first rises 10% to 110 and then
drops back to 100, a drop of 9.09%. The LETF will first rise 20% to 120 and then drop
18.18% = 2 \cdot 9.09%. But 0.1818 \cdot 120 = 21.82. Therefore, while the index is back
at 100, the LETF stands at 98.18, which implies a loss of 1.82%. Such losses always occur
for an LETF when the underlying index value changes direction. The more frequent such
directional changes are - hence it is a volatility effect - the more pronounced the losses.
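The rebalancing mechanics of Table 4.9 can be reproduced in a few lines of Python; the loop below follows the numbers of the first example.

```python
leverage = 2.0
aum, notional = 1000.0, 2000.0     # initial AuM and TRS notional (Table 4.9)
for r in (-0.10, 0.10):            # daily index returns: -10%, then +10%
    aum *= 1 + leverage * r        # fund NAV moves by the levered return
    drifted = notional * (1 + r)   # TRS notional before the adjustment
    notional = leverage * aum      # exposure needed for constant leverage
    print(f"r={r:+.0%}: AuM={aum:,.0f}, TRS adjustment {notional - drifted:+,.0f}")
# prints AuM=800 with adjustment -200, then AuM=960 with adjustment +160
```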

These examples illustrate that an LETF always rebalances in the same direction as
the underlying index, regardless of whether the LETF is bullish (positive leverage) or
bearish (negative leverage). The fund always buys high and sells low in order to maintain
a constant leverage factor. A similar analysis holds for inverse leveraged ETFs.

4.4.4 Evolution of Expense Ratios for Actively Managed Funds, Index Funds and ETFs

Figure 4.9 shows the evolution of expense ratios for actively managed funds and index
funds.
[Chart: expense ratios in bps p.a., 2000-2013, for actively managed bond funds, index
bond funds, actively managed equity funds, and index equity funds.]

Figure 4.9: Expense ratios of actively managed (upper lines) and index funds (lower
lines) - bps p.a. (Investment Company Institute and Lipper [2014]).
The trend of decreasing fees continues, but for index funds a bottom level seems to be
close. Table 4.10 also considers ETF fees.

                    Equity    Bonds
Mutual funds (*)    0.74%     0.61%
Index funds (*)     0.12%     0.11%
ETFs (**, ])        0.49%     0.25%
ETF core (**, +)    0.09%     0.09%

Table 4.10: Fees p.a. in percent in 2013 ((*) Investment Company Institute, Lipper;
(**) DB Tracker; (]) Barclays; (+) BlackRock).

4.5 Alternative Investments (AIs) - Insurance-Linked Investments

It is estimated that alternative investments will reach USD 13 trillion by 2020, up from
USD 6.9 trillion in 2014. One expects more and more investors to access AIs as regulators
begin to allow them access through specific regulated vehicles such as alternative UCITS
funds in Europe and alternative mutual funds in the US. AIs will therefore become more
prominent in both institutional and retail portfolios, and regulation will apply to
alternative investments in the same way as to traditional ones.

But what are alternative investments (AIs)? They are often defined as investments
in asset classes other than stocks, bonds, commodities, currencies, and cash. These
investments can be relatively illiquid, and it may also be difficult to determine the current
value of the assets. From a customer segmentation perspective, AIs are predominantly
used by professional clients and much less by retail clients. We do not consider hedge
funds as AIs, since they are mostly strategies defined on liquid assets.

4.5.1 Asset Class Transformation

But there is an ongoing transformation in the markets: what was deemed an AI
yesterday can become a traditional asset class tomorrow. Investors in AIs often hope that
these investments show low correlations to the classic investments in their portfolios.
Examples of AIs include:
Private equity;
Real estate;
Insurance-linked securities;
Weather;
Distressed debt;
Economic and societal risk classes such as inflation, education, climatic change,
and demography.



The transformation from AI classes to traditional asset classes is often described as a
transition from alpha to beta: the large alpha of earlier times is first reduced by the
CAPM beta; then new factors are added, each of which reduces the remaining alpha.

Investors prefer to invest in a mixture of classic asset classes (beta) and alpha from
AIs. Although such a combination may look good in back-tests or simulations, the risks
are difficult to understand and to manage. One reason for this is the illiquidity of AIs:
a blowup in illiquid assets can, in principle, spill over to other asset classes, triggering
heavy losses in - say - equities. Such events then counteract a main motivation for AI
investments, namely their independence from classic asset classes.

The chronology of the GFC provides an example. The risk factor in subprime
mortgages is illiquid counterparty risk. Problems in this sector infected the more liquid
GNMA and FNMA products, and structured finance products became illiquid - their
prices evaporated. These liquidity and creditworthiness problems channeled into the
equity and fixed-income markets, putting both under heavy stress.

4.5.2 Insurance-Linked Investments

This section is based on LGT (2014). Insurance-linked investments are based on
insured events of life insurers and of non-life insurers, such as insurers against natural
catastrophes. The main types are insurance-linked securities (ILS) - such as catastrophe
bonds - and collateralized reinsurance investments (CRI). The global size of this
relatively young market is USD 200 billion.
4.5.2.1 ILS

Insurance buyers such as primary insurers, reinsurers, governments, and corporates
enter into a contract with a special purpose vehicle (SPV). They pay a premium to the
SPV and receive insurance cover in return. The SPV finances the insurance cover with
the principal paid by the investors. The principal is returned at the end of the contract if
no event has occurred. The investor receives, in excess of the principal payback, the
premium and a collateral yield; this yield depends on the collateral structure.
An example is the catastrophe (CAT) bond Muteki. The Muteki SPV provided the
insurance buyer Munich Re with protection against Japanese earthquake losses. Central
to ILS investing is the description of the insured events, which has to be transparent,
unambiguous, measurable, verifiable, and comprehensive. The parametrization in Muteki
is carried out using parameters from the 1,000 seismographic observatories located in
Japan. The measured ground acceleration is used to calculate the value of the CAT
bond index, which then determines whether a payout from the investors to the insurance
protection buyers is due. The exposure of Munich Re in Japan is not uniformly spread
over the whole country; the insurer therefore weights the signals of the measuring
stations such that the payout of the CAT bond matches the potential losses of Munich
Re from claims incurred due to the event.

Figure 4.10 shows the peak ground velocities measured during the earthquake of 11
March 2011. The star indicates the epicenter; the regions with the highest ground
velocities also experienced the related tsunami.

Figure 4.10: Ground velocities measured by Japan's 1,000 seismological observatories
during the earthquake of 11 March 2011, which also caused a huge tsunami and almost
20,000 fatalities (Kyoshin [2011]).
The insurance industry lost an estimated USD 30-35 billion. The ground acceleration
data became available on 25 March 2011. Multiplying the ground velocity chart by
Munich Re's weight-per-station chart implied an index level for the CAT bond of 1,815
points. This index level led to a full payout from the investors to the insurance buyer:
not only was the trigger level - that is to say, the level of the index at which the payout
starts to be positive - of 984 points exceeded, but the exhaustion level of 1,420 points
was breached as well. Hence, investors in this CAT bond suffered a loss of their entire
position.
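A stylized payout function consistent with these trigger and exhaustion levels; the linear interpolation between the two levels is an assumption for illustration, not necessarily the exact Muteki terms.

```python
def cat_bond_loss(index_level, trigger=984.0, exhaustion=1420.0):
    """Fraction of principal paid out to the protection buyer.

    Zero below the trigger, full payout at or above the exhaustion
    level, and (assumed) linear in between.
    """
    frac = (index_level - trigger) / (exhaustion - trigger)
    return min(max(frac, 0.0), 1.0)

print(cat_bond_loss(900))    # 0.0 -> below trigger, principal intact
print(cat_bond_loss(1815))   # 1.0 -> full loss, as for Muteki in 2011
```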
4.5.2.2 CRI

In collateralized reinsurance investments (CRIs), the same insurance protection buyers
as for ILS buy insurance cover from an SPV in exchange for a premium. The SPV hands
over the premium and the collateral yield to the investor. The investor pays, upon
receiving proof of loss, the loss payment to the SPV. Between the investor and the
insurance buyer, a letter of credit is set up to guarantee the potential loss payment.
Table 4.11 summarizes the ILS and CRI product specifications: the ILS bond pays out if
an event is realized and the triggers are met; for the CRI, if an event is realized and the
triggers are met, the investor makes a loss payment.

Parameter                              ILS                          CRI
Wrapping                               Fixed-income security        Customized contract
Return                                 Collateral yield + premium   Collateral yield + premium
Term                                   12 to 60 months              6 to 18 months
Size                                   USD 2 to 500 mn              USD 2 to 50 mn
Liquidity                              Tradable asset; liquid       Non-tradable asset
Market size for non-life risk (2014)   USD 24 bn                    USD 35 bn

Table 4.11: Comparison between ILS and CRI investments (LGT [2014]).
ILS and CRI comprise 13 percent and 18 percent, respectively, of total reinsurance
investments; the remainder consists of traditional, uncollateralized reinsurance
investments. The cumulative issuance volume of CAT bonds and ILS, which started in
1995, reached USD 20 bn in 2007, USD 40 bn in 2010, and USD 70 bn in 2015. The main
intermediaries and service providers to the catastrophe bond and insurance-linked
securitization market in 2014 were Aon Benfield Securities, Swiss Re Capital Markets,
GC Securities, Goldman Sachs, and Deutsche Bank Securities. Figure 4.11 shows the
average catastrophe bond and ILS expected loss and coupon by year.

Figure 4.11: Average expected coupon and average expected loss of CAT bonds and ILS
issuance by year (artemis.com [2015]).
The correlation with traditional asset classes is expected to be low (see Table 4.12).

                 ILS    Govt bonds   Corporate bonds   Equities
ILS              100%
Govt bonds       8%     100%
Corporate bonds  25%    35%          100%
Equities         23%    -22%         63%               100%

Table 4.12: Correlation matrix for different asset classes. Monthly data in USD from 31
Dec 2003 until 30 Nov 2014 (LGT [2014], Barclays Capital, Citigroup Index, Bloomberg).

Table 4.12 shows that these correlations are smaller than the comparable correlations
between bonds and stocks. Nevertheless, the correlation with equities is weakly positive;
this is due to the fact that catastrophe events always have some impact on firm values.
The correlation with government bonds is much less affected and would become stronger
only if a catastrophe event had a significant impact on the entire wealth of a nation.
Regulation plays a significant role in the use of alternatives such as CAT bonds and
CRIs. The creditworthiness of the insurance and reinsurance company is reflected in the
calculated amount of regulatory capital, which amounts to large capital charges for the
catastrophe cases. To reduce the capital charge under Solvency II, the catastrophe part
of the risks is transferred to the capital markets using ILS and CRI. Fully collateralizing
these transactions further reduces the regulatory capital charges. These alternative
instruments only pay out after a significant erosion of the insurance buyer's own capital.

4.6 Hedge Funds

4.6.1 What is a hedge fund (HF)?

Like ordinary investment funds, HFs allow for collective investment. But many HFs
cannot be offered to the public - that is to say, private placement with qualified investors
often defines the client base and the distribution channel.

From a regulatory and tax perspective, HFs are often domiciled offshore on certain
islands or in countries that offer such funds tax advantages or relatively relaxed
regulatory standards.

But the regulation of hedge funds is changing. Large HFs in the US must register with
the Securities and Exchange Commission (SEC). Since 2012, HFs with assets exceeding
USD 150 million have to register and report information to the SEC, but not to other
parties. FATCA, the Foreign Account Tax Compliance Act, is a US extraterritorial
regime of hedge fund regulation: it requires all non-US hedge funds to report information
on their US clients. Europe's Alternative Investment Fund Managers Directive (AIFMD)
requires information from any fund manager, independent of where they are based, if
they sell to an EU-based investor. Summarizing:


HFs often have a limited number of wealthy investors; for some exceptions see the next
section. If an HF restricts the number of investors, it is not a registered investment
company, since it is - in the US - exempt from most parts of the Investment Company
Act of 1940 (the 40-Act). Most HFs in the US have a limited-partnership structure. The
limitation of the number of investors automatically increases the minimum investment
amount to USD 1 million or more.

HFs often use short positions, derivatives, and leverage in their strategies.

Many HFs do not allow investors to redeem their money immediately, which would be
required if the 40-Act applied. The reason HFs do not want immediate redemption is
the short positions of the funds: a short position means that someone is exposed to the
counterparty risk of the HF. To reduce this risk, an HF needs to post margin. If short
positions increase, HFs need to add more and more margin and would eventually face
problems if investors redeemed their money at the same time.

HFs have to satisfy less stringent disclosure rules than mutual funds.

Mutual funds are not allowed to earn non-linear fees, while most HFs do, charging
a flat management fee plus a performance fee (a small fee sketch follows after this list).
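A minimal sketch of such a non-linear fee schedule, assuming the common 2%/20% terms and a high-water mark; both the terms and the NAV path are assumptions, and fees are not netted from the NAV to keep the sketch short.

```python
def yearly_hf_fees(nav_path, mgmt=0.02, perf=0.20):
    """Flat management fee plus a performance fee charged only on
    gains above the high-water mark (assumed 2%/20% terms)."""
    hwm, fees = nav_path[0], []
    for nav in nav_path[1:]:
        fees.append(mgmt * nav + perf * max(nav - hwm, 0.0))
        hwm = max(hwm, nav)
    return fees

print(yearly_hf_fees([100.0, 110.0, 95.0, 120.0]))  # ~[4.2, 1.9, 4.4]
```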
From an economic point of view, HFs are an investment strategy and not an asset
class in their own right, since they often trade the same liquid asset classes with an
HF-specific strategy.
HFs can face losses due to their construction or the market structure even when there
are no specific market events. As Khandani and Lo (2007) state, quantitative (quant)
HFs - whose investment rules are based on algorithms that try to identify market
signals - faced a perfect financial storm in August 2007. Although there were no market
disruptions at the time, some quant HFs faced heavy losses. The Global Alpha Fund,
managed by Goldman Sachs Asset Management, lost 30 percent in a few days. This was
shocking for the industry since these funds claimed to be designed for low volatility, and
the different strategies in HFs were supposed to have low correlations with each other.
Suddenly, after the losses, the returns bounced back. But the gains did not make up for
the losses, due to the reduced leverage in the loss period. The Goldman Global Equity
Opportunities Fund, for example, received an injection of USD 3 billion to stabilize it.

Despite low volatility and their low-correlation construction, how could it be that
during calm markets many quantitative HFs faced heavy, sudden losses? Several reasons
are discussed in the literature, including high correlations of strategies on the downside,
too many players in the HF sector doing the same thing, common factors underlying the
seemingly different strategies (claims of low correlation were incorrect), and the use of the
same risk models. We use several references in this section, but the two main sources are
the hedge fund review of Getmansky, Lee, and Lo (2015) and Ang (2013).

4.6.2 Hedge Fund Industry

The first HF was set up by Jones in 1949. The fund was based on three principles:
first, it was not transparent how Jones was managing the fund; second, there was an
incentive fee of 20 percent but no management fee; third, the fund was set up as a
non-public fund. This framework is still applied by many HFs today.

The largest HF in 2014 was Bridgewater Associates, with USD 87 billion in assets
under management, followed by J.P. Morgan Asset Management (see the Appendix for
further details). The industry's size in 2014 was USD 2.85 trillion, versus USD 2.6
trillion in 2013. Figure 4.12 shows the evolution of AuM in the hedge fund industry. The
average growth in HF assets from 1990 to 2012 was roughly 14 percent per year. The
decrease in AuM after the GFC was fully recovered six years later. The losses incurred
during the GFC were around 19 percent, only around half the losses of some major stock
market indices. But investors left the HF sector in this period, coming back to invest in
HFs in 2009 and the following years. Unfortunately, in the years 2009 to 2012, HF
performance was lower than that of the S&P 500, ranging between 4.8 percent and 9.8
percent on an annual basis.

Figure 4.12: Hedge fund assets under management in USD billions (Barclays Hedge Fund
Database).
The decreases in AuM during the GFC and the European debt crisis show that
investors allocate money to HFs pro-cyclically, similar to the actions of investors in
mutual funds or ETFs. We note certain facts regarding the largest HFs, following Milnes
(2014) (the number after the hedge fund's name is its rank in the list of the world's
largest HFs as of 2014).


Bridgewater Associates (1). The three flagship funds showed relatively poor
performance in 2012 and 2013 of 3.5%, 5.25%, and 4.62%; the performance over ten years
is 8.6%, 11.8%, and 7.7%.

J.P. Morgan Asset Management (2). J.P. Morgan bought the global multi-strategy
firm Highbridge Capital Management in 2004 for USD 1.3 billion. Highbridge's assets
have since multiplied by nearly 400 percent, to USD 29 billion.

Brevan Howard Capital Management (3). This HF maintains both solid returns and
asset growth - the exception for an HF. The flagship is a global macro-focused HF
(USD 27 bn AuM), which - since its launch in 2003 - has never lost money on an annual
basis.

Och-Ziff Capital Management (4) offers publicly traded hedge funds in the US with
far greater disclosure than other HFs. Its popularity is mainly due to Daniel Och's
conservative investing style.

BlueCrest Capital (5) was a spin-off from a derivatives trading desk at J.P. Morgan
in 2000. It has grown rapidly and is one of the biggest algorithmic hedge fund firms. Its
reputation was boosted in 2008, when it made large profits while most other HFs faced
losses. Its trend of explosive asset growth continues.

AQR Capital Management (7), co-founded by Cliff Asness, gives retail investors
access to hedge fund strategies. Asness is also well known for his critique of the
unnecessarily high fees charged by most HFs and for his scientific contributions.

Man Group (9) was founded in 1783 by James Man as a barrel-making firm. It has
225 years of trading experience and 25 years in the HF industry. In recent years, its
flagship fund AHL has struggled with its performance.

Baupost Group (11) is an unconventional, successful HF. Baupost avoids leverage, is
biased toward long trades, holds on average a third of its portfolio in cash, and charges
only a 1 percent fee.

Winton Capital Management (13) has its roots in the quant fund AHL (founded in
1987 and bought by Man Group in 1989). David Harding - like many in the quantitative
trading field, with a math or physics education - was also a pioneer in the commodity
trading adviser (CTA) field. Winton is the biggest managed futures firm in the world.

Renaissance Technologies (15). The mathematician Jim Simons (co-author of the
Chern-Simons theory in differential topology) is one of the most trusted hedge fund
managers in the world today, with USD 22 billion in assets under management. After
an outstanding academic career as a mathematician, Simons became the pioneer of
quantitative analysis in the hedge fund industry. Renaissance mainly relies on scientists
and mathematicians to write its moneymaking algorithms. It has been consistently
successful over the years.


The largest loss an HF has suffered was the USD 6 billion loss of Amaranth in 2006.
This loss, of around 65 percent of the fund's assets, was possible due to extensive leverage
and a wrongheaded bet on natural gas futures. Investors who wanted to pull their money
out were not allowed to do so, since the fund imposed gates (see the section on biases
below).

The business of running a hedge fund has become more expensive due to the increased
regulatory burden. In the results of a recent survey, KPMG (2013) outlines figures for
the average set-up costs: USD 700,000 for a small fund manager, USD 6 million for a
medium-sized one, and USD 14 million for the largest. In all, KPMG estimated that
hedge funds had spent USD 3 billion meeting compliance costs associated with new
regulation since 2008 - equating to roughly a 10 percent increase in their annual
operating costs (KPMG [2013]).
4.6.2.1 HF Strategies

An important selling argument for HFs is that their investments only weakly correlate with traditional markets. Since HFs do invest in traditional markets, it is not clear that this marketing argument holds true at all times. In fact, the argument is true in some periods while it fails to hold in others. Starting in 2000, the correlation between the MSCI World and the broad DJ CS Hedge Fund Index (HF Index) changed on a two-year rolling basis: the correlation was 0.16 in the years 2000-2007 and jumped to 0.8 in 2007-2009, since a significant number of HF managers started, in 2007, to invest traditionally in stocks and commodities.
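The rolling correlation just described is straightforward to reproduce. The following Python sketch computes a two-year (24-month) rolling correlation; the two return series are simulated placeholders, since the MSCI World and DJ CS Hedge Fund Index data are not reproduced here.

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
dates = pd.date_range("2000-01-01", periods=120, freq="MS")  # monthly data
msci = pd.Series(rng.normal(0.005, 0.04, 120), index=dates)  # placeholder returns
hf = pd.Series(0.3 * msci + rng.normal(0.003, 0.02, 120), index=dates)

# 24-month rolling correlation (two-year window on monthly data)
rolling_corr = msci.rolling(window=24).corr(hf)
print(rolling_corr.dropna().tail())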
Contrary to mutual funds, HFs make extensive use of short selling and leverage. We roughly describe some HF strategies:
Long-short strategies combine long positions in securities considered undervalued with short positions in securities considered overvalued.
Relative value or arbitrage strategies use mis-pricings between securities.
Event strategies focus on particular events that can affect specific firms, sectors, or whole markets.
Global macro strategies try to identify global economic trends and to replicate them using financial products. An example is the HF Quantum of George Soros. This HF noted, in 1992, the overvaluation of the British pound. Using huge amounts of capital, the fund forced the Bank of England to stop maintaining the value of the pound - the currency strongly depreciated against other leading currencies, the fund made large gains, and the UK was forced to leave the European Exchange Rate Mechanism (ERM), the exchange-rate system preceding the European Monetary Union (EMU).
Many HF strategies are similar to those used in factor investing. The main differences are the transparency of the latter, the implementation of the factors as indices, and the construction of a cross-asset offering of factors. These advantages make it attractive for investors to switch their investments from the more opaque and often more expensive HFs to a factor portfolio.
We discuss features of the investment strategies of HFs in the next sections. As a concrete example, we illustrate the findings for the so-called CTA strategies, which we introduce next.

4.6.3 CTA Strategy
CTA strategies, or managed futures strategies, are HF strategies where the HF invests in highly liquid, transparent, exchange-traded futures markets and in foreign exchange markets. The abbreviation CTA stands for Commodity Trading Advisors, which are heavily regulated in the US by the NFA/CFTC. Typically traded instruments are futures (and options) on equities, equity indices, commodities, and fixed income, as well as FX spot, forwards, futures, and FX options. Investments are made in different markets following a rule-based investment strategy. The predominant investment strategy is trend following. The strategy has no directional bias, and hence investors can gain and lose both in rising and in falling markets. The strategies are typically fully price-driven and rule-based: there is no need for any fundamental input nor for a forward-looking market opinion. The portfolio construction is usually risk-weighted, as sketched below. Figure 4.13 shows the size evolution of the managed futures industry.
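To make the trend-following logic concrete, the following Python sketch implements a generic rule-based variant: a 250-day moving-average signal combined with inverse-volatility (risk-weighted) position sizing. The price data are simulated and the rule parameters are illustrative assumptions, not those of any actual CTA.

import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
dates = pd.date_range("2005-01-03", periods=500, freq="B")
markets = ["equity_future", "bond_future", "fx_forward"]   # placeholder markets
prices = pd.DataFrame(
    100 * np.exp(np.cumsum(rng.normal(0.0002, 0.01, (500, 3)), axis=0)),
    index=dates, columns=markets)

returns = prices.pct_change()
signal = np.sign(prices - prices.rolling(250).mean())      # +1 long, -1 short
vol = returns.rolling(60).std() * np.sqrt(260)             # annualized volatility
weights = (1.0 / vol).div((1.0 / vol).sum(axis=1), axis=0) # inverse-vol weights
pnl = (signal.shift(1) * weights.shift(1) * returns).sum(axis=1)
pnl = pnl.iloc[251:]                                       # drop the warm-up window
print(f"ann. return {pnl.mean() * 260:.2%}, ann. vol {pnl.std() * np.sqrt(260):.2%}")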

Figure 4.13: Development of the managed futures industry. Data are from the Barclay CTA index (Gmr [2015]).
The figure shows the strong inflows in 2009 after the GFC, when managed futures were particularly successful while other HF investments faced heavy losses.
The last four years show stagnation in the growth of AuM. This is due to the many events of the recent past which make trend following difficult: the euro sovereign debt crisis, Greece, the China crisis of 2015, etc. Many of these crises led to sharp corrections in the markets with a strong rebound following the downturn closely - such zig-zag behaviour is the natural enemy of trend models, since the risk is that trend-reversing signals are always too late.
The largest player as of September 2014, with around USD 30 bn, is Winton Capital, followed by Man AHL and Two Sigma Investments. Geographically, the largest amount of assets is managed in the London area, followed by the US and Switzerland. In the last two decades there has been a significant shift from the US to London and some other European countries.

4.6.4 Fees
Most hedge funds charge annual fees consisting of two components: a fixed percentage of assets under management (typically 1-2 percent of the NAV per year) and an incentive fee that is a percentage (typically 20 percent) of the fund's annual net profits, often defined as the fund's total earnings above and beyond some minimum threshold such as the LIBOR return and net of previous cumulative losses (the high-water mark). The incentive fee should align the interests of the manager with those of the investor in every year of the fund's existence. We note that portfolio managers of mutual funds, exchange traded funds (ETFs), and pension funds typically do not earn incentive fees.
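The fee mechanics can be illustrated with a small Python sketch of a "2 and 20" structure with a hurdle and a high-water mark; all rates and returns below are illustrative assumptions.

def annual_fees(nav_start, gross_return, hurdle, hwm,
                mgmt_rate=0.02, perf_rate=0.20):
    """Return (management fee, incentive fee, NAV after fees, new HWM)."""
    mgmt_fee = mgmt_rate * nav_start
    nav = nav_start * (1 + gross_return) - mgmt_fee
    # Incentive fee is earned only on profits above both the hurdle and the HWM
    threshold = max(hwm, nav_start * (1 + hurdle))
    perf_fee = perf_rate * max(nav - threshold, 0.0)
    nav -= perf_fee
    return mgmt_fee, perf_fee, nav, max(hwm, nav)

nav, hwm = 100.0, 100.0
for r in [0.15, -0.10, 0.12]:              # three years of gross returns
    m, p, nav, hwm = annual_fees(nav, r, hurdle=0.01, hwm=hwm)
    print(f"mgmt {m:.2f}  incentive {p:.2f}  NAV {nav:.2f}  HWM {hwm:.2f}")

In the third simulated year the fund earns 12 percent gross but pays no incentive fee, because the NAV is still below the high-water mark set after the first year.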
HF managers defend their performance fee by stating that they can generate alpha in a unique way and that, if they succeed, they are willing to share the benefits with the investors. Is this justified for all HF managers? Titman and Tiu (2011) document that, on average, HFs in the lowest R2 quartile charge 12 basis points more in management fees and 385 basis points more in incentive fees compared to hedge funds in the highest quartile. Feng, Getmansky, and Kapadia (2013) find that incentive fees act similarly to a call option at maturity, and that HF managers can therefore increase the value of this option by increasing the volatility of their investments.
CTAs also charge these two fee components. But one observes that very professional investors in CTAs prefer to set the fixed management fee to zero and instead to share even more than 20% of the performance. Fees in the CTA industry are under pressure - the old 2/20 (2% management fee and 20% participation rate) is, for most CTA managers, a thing of the past. One reason is the unbroken influx of new CTA managers; a second is that the general pressure on fees in the HF sector also affects CTAs.
Fees are particularly opaque for funds of funds; see Brown et al. (2004). They find that individual funds - single-layer fees - dominate funds of funds - double-layer fees - in terms of net-of-fee returns and Sharpe ratios. The possible impact of non-linear fees on the compensation of HF managers or owners is shown in Figure 4.14, which compares the compensation of top-earning hedge fund managers with that of top-earning CEOs. Broadly, the respective compensations differ by a factor of between 10 and 30.

Figure 4.14: Data from Alpha Magazine (2011) for the HFs and from Forbes (2011) for
the CEOs.
The fee discussion continues to damage the reputation of HFs. The California Public Employees' Retirement System (CalPERS) decided in 2014 to divest itself of its entire USD 4 billion portfolio of HFs. Reasons were the high costs and the complexity associated with its holdings in 24 hedge funds and six so-called funds of funds.

4.6.5 Leverage
Hedge funds often use leverage to boost returns. Since leverage increases both returns and risks, leverage is most relevant for low-volatility strategies; otherwise unacceptable levels of risk follow. Besides return volatility, illiquidity risk is another risk source for leveraged investments.
Since leverage financing means using credit, margin calls apply. This can force HFs to shut down in a crisis when the HF is unable to cover large margin calls. Ang et al. (2011) conclude that hedge fund leverage decreased prior to the start of the financial crisis in 2007 and was at its lowest in early 2009, when the leverage of investment banks was at its highest. Hence, leverage is not constant over time. Cao et al. (2013) find that HFs are able to adjust their portfolios' market exposure as a function of market liquidity conditions.

A common pitfall arises when one considers the use of futures in investment strategies such as CTAs. Suppose that an investor invests USD 100 in the S&P 500 but desires an exposure of USD 200 to the index. Using futures, the risk management is done via margins. Suppose that USD 10 of margin is needed per USD 100 of futures exposure, where we do not distinguish between different types of margin. Then the leveraged position requires a margin of USD 20. How much can the investor lose? In the worst case USD 100, namely when a margin call exceeds the remaining USD 80. If the investor cannot comply with the margin call, the positions are simply closed and the loss of the investor is the full USD 100. Summarizing, the leverage acts on the margining process, which itself involves only a fraction of total cash.
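The margin arithmetic of this example can be written out explicitly; the following Python sketch uses exactly the numbers from the text.

capital = 100.0
exposure = 200.0          # twofold leverage via futures
margin_rate = 0.10        # USD 10 margin per USD 100 of notional
initial_margin = margin_rate * exposure        # USD 20 posted as margin
free_cash = capital - initial_margin           # USD 80 left to meet calls

index_drop = 0.50                              # an adverse market move
loss = index_drop * exposure                   # USD 100 loss on the notional
call_beyond_margin = loss - initial_margin     # amount called beyond the margin
print(initial_margin, free_cash, loss, call_beyond_margin)
# If the call exceeds the free cash of USD 80, the position is closed and
# the investor's maximum loss is the full USD 100 of capital.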

4.6.6 Share Restrictions
Following Getmansky et al. [2015], there are various restrictions on investors withdrawing money from a hedge fund:
a subscription process for investors,
the capacity constraints of a given strategy,
a lockup period - new investors are often forced into a one-year lockup during which they cannot withdraw their funds,
withdrawals that are subject to advance notice, and
temporary restrictions (gates) on how much of an investor's capital can be redeemed in a crisis.
Such restrictions protect against fire-sale liquidations causing extreme losses for the fund's remaining investors. The discretionary right to impose withdrawal gates can, however, be very costly for investors if losses accumulate during the period in which withdrawing is not possible; see Ang and Bollen (2010).

4.6.7 Fund Flows and Capital Formation
Several studies document a positive empirical relationship between fund flows and recent performance. This suggests that HF investors seek positive returns and flee from negative returns (Goetzmann et al. [2003], Baquero and Verbeek [2009], and Getmansky et al. [2015]). But the relationship between fund flows and investment performance is often nonlinear; hence, empirical studies that do not account for these effects can be ill-specified. Aragon, Liang, and Park (2013), Goetzmann et al. (2003), Baquero and Verbeek (2009), Teo (2011), and Aragon and Qian (2010) report such non-linear relations.

4.6.8 Biases
Hedge fund managers report their returns voluntarily to any given database, and they are free to stop reporting at any time. Therefore, a number of biases are possible in HF return databases.
Survivor bias: Funds that close are not in the database, and funds are more likely to close if they have bad returns. That is, funds delist from a database when they have to close or because of poor performance. This bias increases the average fund return by between 0.16% and 3%; see Ackermann et al. [1999], Liang [2000], and Amin and Kat [2003] for the studies.
Selection bias: There is a stronger incentive to report if returns are positive.
Backfill bias: The primary motivation for disclosing return data is marketing. Therefore, HFs start to report after they have been successful; they can then fill in their positive past returns - the backfill bias. Note that funds which lose money during the backfill period do not get included in the database. Fung and Hsieh (2000) estimate a backfill bias of 1.4 percent p.a. for the Lipper TASS database (1994-1998). Malkiel and Saha (2005) estimate that the return of HFs that backfill is twice the return figure of those not backfilling; the size of the backfill bias is 7 percent in their study. This shows that different studies, applied to different time periods, with different definitions of the variables or a different basis of HFs, are likely to produce different results.
Incubator bias: Fund families start incubator funds and then only open the ones that do well. They then report the entire history. It is amazing that the SEC lets them do this. This bias remains in the CRSP database.
Backfilling and extinction bias mean that part of the left tail of the return distribution - the losses - is missing in HF databases. Large, well-known HFs do not need to engage in marketing by reporting to commercial databases; hence, part of the right-hand return tail is missing as well. Edelman et al. (2013) compare non-reporting well-known hedge funds to large funds reporting to databases. They find that an index of large, reporting funds is a reasonable proxy for the performance of non-reporting ones. We recall the findings of Patton et al. (2013) in Section 2.5.4 about the revision of previously reported returns.
Given these biases, two questions are immediately relevant. First, why do databases not correct for these biases in a transparent and standardized form when publishing their data? Figure 4.15 from Getmansky et al. (2015) shows the impact of correcting for survivorship and backfill biases: applying both corrections halves the annual mean return from 12.6 percent to 6.3 percent. Second, given the many biases and the high fee structure, why is regulation for HF financial intermediaries much less severe than for banks, asset management firms, or insurance companies?
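Conceptually, the two corrections shown in Figure 4.15 amount to (i) keeping dead funds in the sample and (ii) dropping returns reported before a fund entered the database. The following Python sketch illustrates this on two hypothetical toy fund records; the field names and numbers are assumptions for illustration only.

import pandas as pd

funds = [  # hypothetical toy records
    {"name": "A", "alive": True,  "entered": "2005-01",
     "returns": {"2003-06": 0.04, "2005-02": 0.01}},
    {"name": "B", "alive": False, "entered": "2004-01",
     "returns": {"2004-02": -0.03, "2004-03": -0.05}},
]

rows = []
for f in funds:
    for month, r in f["returns"].items():
        rows.append({"name": f["name"], "alive": f["alive"], "month": month,
                     "backfilled": month < f["entered"],  # reported before entry
                     "ret": r})
df = pd.DataFrame(rows)

raw = df[df["alive"]]["ret"].mean()             # survivors only, backfill included
adjusted = df[~df["backfilled"]]["ret"].mean()  # dead funds in, backfill dropped
print(f"biased mean: {raw:.3%}, adjusted mean: {adjusted:.3%}")

As in the figure, the adjusted mean is markedly lower than the naively computed one.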

Figure 4.15: Summary statistics for cross-sectionally averaged returns from the Lipper TASS database with no bias adjustments, adjusted for survivorship bias, adjusted for backfill bias, and adjusted for both biases during the sample period from January 1996 through December 2014. The last value - the box p-value - represents the p-value of the Ljung-Box Q-statistic with three reported lags (Getmansky et al. [2015]).

4.6.9 Entries and Exits
From January 1996 to December 2006, more than twice as many new funds entered the Lipper TASS database each year as exited, despite the high attrition rates. This process reversed in recent years, when the number of exits exceeded the number of entries.
After the peak number of HFs in 2007-2008, the attrition rate jumped to 21 percent, the average return was the lowest of any year (-18.4 percent), and 71 percent of all hedge funds experienced negative performance. The number of hedge funds reporting to the TASS database declined after the GFC. This industry-wide view does not, however, hold for all segments or styles of the HF industry.
The survival rates of hedge funds have been estimated by several authors; see Horst and Verbeek (2007) for example. Summarizing, 30-50 percent of all HFs disappear within 30 months of entry and only 5 percent of all HFs last more than 10 years. These rates differ significantly across styles, with attrition rates ranging from 5.2% to 14.4% (Getmansky et al. (2004)).
CTAs do not differ qualitatively from these facts: a significant number of funds do not survive the first 5 years.

4.6.10 Investment Performance
To discuss the investment performance, we use the popular categorization of the Lipper TASS database, which contains 11 main groupings: Convertible Arbitrage, Dedicated Short Bias, Emerging Markets, Equity Market Neutral, Event Driven, Fixed Income Arbitrage, Global Macro, Long/Short Equity Hedge, Managed Futures, Multi-Strategy, and Fund of Funds.

4.6.10.1 Basic Performance Studies
Several facts limit the alpha of the HF industry. First, the number of HF managers has increased from hundreds to more than 10,000 in the last two decades. Although the average fund manager today has higher technical skills than, say, 20 years ago, it is becoming increasingly difficult for the individual manager to beat the HF market. As one HF manager states: "Take out the superstars, and you are left with an expensive, below-benchmark industry." A second limitation is the increased efficiency of some markets: the greater the extent to which markets satisfy the EMH, the less possible it is to predict future returns. A third factor is the relationship between fund size and performance: an increasing fund size typically leads to a weaker performance.
Asness (2014) plots the realized alpha of hedge funds over rolling 36-month periods. He takes the monthly returns over cash, subtracts 37 percent of the S&P 500 excess return - 0.37 being the full-period, long-term beta - and looks at the annualized average of this realized alpha (see Figure 4.16).
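The construction is easy to replicate. The following Python sketch computes the realized alpha with the fixed beta of 0.37 on simulated placeholder series instead of the index data used by Asness.

import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
n = 240                                                   # 20 years of months
sp500_excess = pd.Series(rng.normal(0.005, 0.045, n))     # placeholder data
hf_excess = 0.37 * sp500_excess + rng.normal(0.002, 0.01, n)

# realized alpha: HF excess return minus the full-period beta times the market
realized_alpha = hf_excess - 0.37 * sp500_excess
rolling_alpha = realized_alpha.rolling(36).mean() * 12    # annualized 36m average
print(rolling_alpha.dropna().tail())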
We observe a decreasing alpha over time, which ends up negative in the recent past. Recent years thus seem to have been special. Unlike for mutual funds, a number of studies document positive risk-adjusted returns in the HF industry before the GFC; Ibbotson et al. (2011) report positive alphas in every year from 1995 to 2009.
While the alphas of the HF industry have decreased steadily over the last two decades, the correlation with broad stock market indices shows the opposite evolution.
The performance of HFs is often linked to characteristics such as experience or incentives. Gao and Huang (2014), for example, report that hedge fund managers gain an informational advantage in securities trading through their connections with political lobbyists. They find that politically connected hedge funds outperform non-connected funds by between 1.6 percent and 2.5 percent per month on their holdings of politically sensitive stocks, as compared to their less politically sensitive holdings.
4.6.10.2 Performance Persistence
There is mixed evidence regarding performance persistence.
Agarwal and Naik (2000a), Chen (2007), and Bares et al. (2003) find performance persistence over short periods.

Figure 4.16: Average monthly returns (realized alpha) of the overall Credit Suisse Hedge
Fund Index and the HFRI Fund Weighted Composite Index for a rolling 36 months
(Asness [2014]).
Brown et al. (1999) and Edwards and Caglayan (2001) find no evidence of performance persistence.
Fung et al. (2008) find a positive alpha-path dependency: given that a fund has a positive alpha, the probability that the fund will again show a positive alpha in the next period is 28 percent. The probability for a non-alpha fund is only half of this value. The year-by-year alpha-transition probability of a positive-alpha fund is always higher than that of a non-alpha fund.
Persistence in hedge fund performance challenges the no-persistence equilibrium result of the Berk and Green (2004) model for mutual funds. While performance persistence is sought out by investors, excessive persistence is a signal that something is wrong. Figure 4.17 shows the extremely smooth return profile of Fairfield Sentry compared to the S&P 500; Fairfield Sentry was the feeder fund to Madoff Investment Securities.
For CTAs, consider the performance and performance persistence of Winton Capital Management (upper panel) and Chesapeake Capital (lower panel) shown in Figure 24. Starting with an investment of USD 1,000 in October 1997 (Quantica [2015]), the first CTA pays out around USD 9,000 at the end of 2013 and the second one around USD 18,000.

4.6. HEDGE FUNDS

343

Figure 4.17: Monthly return distribution for Fairfield Sentry (line) and S&P 500 (dots)
returns (Ang [2013]).

Both CTAs had positive returns until the GFC. Then Chesapeake's volatility started to increase and its positive past trend became essentially flat. This behaviour is typical of other CTAs too. Winton's returns, in contrast, hardly suffered during and after the GFC. The reason is risk: Winton takes much less risk than Chesapeake.
Why can a CTA strategy work? There is empirical evidence for the coexistence of skew and variance risk premiums (persistent expected excess returns) in the equity index market. Skewness and the Sharpe ratio are highly positively related in equity markets: investors are compensated with excess returns for assuming excess skewness rather than excess volatility. Hence, there is a positive relation between risk premia and skewness. The exception are trend-following strategies, which offer positive risk premia with positively skewed returns. Market participants often believe that hedge funds make excessive use of short strategies. For CTAs, for example, this is not the case: around 80% of the investments are long strategies and 20% are short strategies. As an example, consider the Quantica CTA.
Figure 4.18 shows the attribution of the profit and loss to the different asset classes over the last decade. During the GFC, this CTA produced a positive return not through huge short positions in equity markets but through long positions in the trend model for fixed income: the decreasing rates in this period were a constant source of positive returns.

Figure 4.18: Annual sector attribution of the profit and loss for the Quantica CTA
(Quantica [2015]).
4.6.10.3 Timing Ability
Hedge funds are much less restricted than mutual funds in engaging in several forms of timing, including market timing, volatility timing, and liquidity timing. Chen (2007), applying the market-timing tests of Henriksson and Merton (1981), finds significant market-timing ability for different HF styles.
The study of Aragon and Martin (2012) gives evidence that HFs successfully use derivatives to profit from private information about stock fundamentals. Chen (2011) finds that 71 percent of hedge funds trade derivatives. Cao et al. (2013) find strong evidence for the liquidity-timing ability of HFs. They conclude that HF managers increase (decrease) their portfolios' market exposure when equity market liquidity is high (low), and that liquidity timing is most pronounced when market liquidity is very low.
4.6.10.4 Luck and Skill
Criton and Scaillet (2014) apply the false discovery methodology to hedge funds; that is, the FDR study for mutual funds of Section 4.3.4.3 is repeated for HFs. We recall that two questions remained open from the mutual fund discussion: did the skills of mutual fund managers decline to almost zero over time because skillful managers moved to the HF industry, and/or have markets become more efficient over time? Criton and Scaillet use a multi-factor model with time-varying alphas and betas; that is, they consider different risk factors for the different asset classes. For equity, one risk factor is the S&P 500 minus the risk-free rate; for bonds, one factor is the monthly change in the 10-year treasury constant maturity yield.
The authors consider equity long/short, emerging markets, equity market neutral, event driven, and global macro strategies. The main results are as follows. First, the majority of funds are zero-alpha funds (ranging from 41 percent to 97 percent across strategies), as for mutual funds. Second, there is a higher proportion of positive-alpha funds than among mutual funds (0-45 percent, depending on the strategy). Third, the proportion of negative-alpha funds ranges between 2.5 percent and 18.6 percent. Fourth, the highest-skilled funds follow emerging market strategies, followed by global macro and equity long/short. Finally, the proportion of skilled or unskilled funds differs across market stress periods such as the LTCM crisis, the bursting of the dot-com bubble, and the GFC; but no uniform decline of skilled funds is observed over the period from 1992 to 2006, unlike for mutual funds. Therefore, there is some evidence that successful mutual fund asset managers moved to the HF industry, but this evidence is not supported by a strict empirical test.
4.6.10.5 Hedge Fund Styles
Hedge fund styles are highly dynamic and behave very differently from those used by mutual funds. Getmansky et al. (2015), see Figure 4.19, report correlations of monthly average returns of hedge funds in each Lipper TASS style category:
High correlation: the correlation between the Event Driven and Convertible Arbitrage categories is 0.77.
Negative correlation: the correlation between Long/Short Equity Hedge and Dedicated Short Bias is -0.74.
Virtually no correlation: Managed Futures has no correlation with the other categories except Global Macro.
Getmansky et al. (2015) use a factor model based on principal component analysis (PCA) to gain more insight into these correlations. The size of the eigenvalues indicates that 79 percent of the strategies' volatility-equalized variance is explained by only three factors. This suggests that a large fraction of hedge fund returns is generated by a very small universe of uncorrelated strategies. The largest estimated eigenvalue takes the value 52.3 percent.
The authors simulate one million correlation matrices using IID Gaussian returns and compute the matrices' largest eigenvalues. The mean of this distribution is 13.51 percent, while the minimum and maximum are 11.59 percent and 17.18 percent, respectively. These values are much smaller than 52.3 percent. This is strong evidence that the different HFs, although claimed to be different in their styles and even unique, in fact have returns driven by a few common factors.
Since 80 percent of HF category returns are driven by three factors, the benefits of diversification are limited for HFs.

Figure 4.19: Monthly correlations of the average returns of funds for the 10 main Lipper
TASS hedge fund categories in the Lipper TASS database from January 1996 through
December 2014. Correlations are color-coded with the highest correlations in blue, intermediate correlations in yellow, and the lowest correlations in red (Getmansky et al.
[2015]).

The above statements remain qualitatively unchanged if the Gaussian distribution is replaced by a more realistic fat-tailed one; see Getmansky et al. [2015].
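The Monte Carlo benchmark for the largest eigenvalue is simple to reproduce in Python. The sketch below uses 10 style series, 228 monthly observations (matching January 1996 to December 2014), and far fewer simulations than the one million in the text; these dimensions are assumptions chosen to mirror the setting.

import numpy as np

rng = np.random.default_rng(3)
n_styles, n_obs, n_sim = 10, 228, 10_000
largest = np.empty(n_sim)
for i in range(n_sim):
    x = rng.standard_normal((n_obs, n_styles))       # IID Gaussian "style" returns
    corr = np.corrcoef(x, rowvar=False)              # 10 x 10 correlation matrix
    largest[i] = np.linalg.eigvalsh(corr).max() / n_styles  # share of variance

print(f"mean {largest.mean():.2%}, min {largest.min():.2%}, max {largest.max():.2%}")
# The simulated shares come out around 12-17 percent, far below the 52.3
# percent observed in the data, reproducing the logic of the argument above.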
The heterogeneity and commonality among HF styles is presented in Figure 4.20. It follows that Dedicated Short Bias underperformed all other categories, which is no surprise given the good performance of equity in that period. Furthermore, Multi-Strategy hedge funds outperformed Funds of Funds; Managed Futures fund returns appear roughly IID and Gaussian, while the returns of the average Convertible Arbitrage fund are auto-correlated and have fat tails. The styles Long/Short Equity, Event Driven, and Emerging Markets have high correlations with the S&P 500 total return index, between 0.64 and 0.74.
The return volatility of the average Emerging Markets fund is three times greater than that of the average Fixed Income Arbitrage fund. But low volatility is not synonymous with low risk. Risk measured by maximum drawdown is, for example, low for Managed Futures although its volatility is high. Conversely, Fixed Income Arbitrage has only a low volatility but large drawdowns.

Figure 4.20: Summary statistics for the returns of the average fund in each Lipper TASS
style category and summary statistics for the corresponding CS-DJ Hedge Fund Index
from January 1996 through December 2014. Sharpe and Sortino ratios are adjusted for
the three-month US treasury bill rate. The All Single Manager Funds category includes
the funds in all 10 main Lipper TASS categories and any other single-manager funds
present in the database (relatively few) while excluding funds of funds (Getmansky et al.
[2015]).

When investing in auto-correlated returns, investors should consider the increased likelihood that an analysis based on return volatility will understate the actual downside risk. Ang (2013) confirms that many HF styles show a strong correlation (0.4 or higher) to equity and to volatility. This exposure to volatility means that HFs are selling volatility - they are short put options with a strike (deep) out of the money. In normal times they collect a premium - the put price - and in times of stress they face huge losses.
The Quantica CTA under consideration shows a low correlation with the traditional asset classes, including the global hedge fund index: between 10% and 15% correlation to the S&P 500, USD government bonds 3-5 years, and the GSCI commodity index. The correlation to the HFRX Global Hedge Fund Index is 24%, and to the Newedge CTA index 68%. The large correlation with the CTA index indicates that many CTAs use similar models - broadly diversified trend-following models. More generally, managed futures have low correlations with traditional asset classes: equity correlation is 0.1 and bond correlation is 0.2 for monthly returns.

We consider drawdown risk for CTAs. Although CTAs show a persistent upward drift in the long run (see Figure 4.21), they may well suffer temporary heavy losses. The impact of such losses on the CTA manager and on the CTA investor is completely different. While for the manager such losses are normal and will be compensated by positive future returns - this is due to the manager's belief in CTAs - for the investor such heavy losses can lead to an exit from the investment if they appear at a bad moment. Figure 4.21 shows the drawdown periods for different investments.

Figure 4.21: Drawdown periods for S&P500 total return, GS commodity total return
index and Barclays US Managed Futures index BTOP 50. Data are from Dec 1986 to
Mar 2013 (Bloomberg).
It follows that the CTA index shows much less severe drawdowns than the equity and the commodity index. The main reason for this is discipline in investment, which has two components. First, CTAs are mostly fully rule-based: if a stop-loss trigger is breached, the loss is realized. Second, CTA allocations are risk-based, where again the risk attribution is carried out mechanically. CTAs therefore follow the investment advice attributed to David Ricardo in The Great Metropolis (1838): "Cut short your losses, and let your profits run on."
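The maximum drawdown measure used in Figure 4.21 is the largest peak-to-trough loss of a cumulative return series. A minimal Python sketch, applied to a simulated price path as a placeholder:

import numpy as np
import pandas as pd

def max_drawdown(prices: pd.Series) -> float:
    """Largest peak-to-trough loss of a price or total-return series."""
    running_peak = prices.cummax()
    drawdowns = prices / running_peak - 1.0
    return drawdowns.min()

rng = np.random.default_rng(4)
prices = pd.Series(100 * np.exp(np.cumsum(rng.normal(0.0003, 0.01, 2000))))
print(f"maximum drawdown: {max_drawdown(prices):.1%}")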

4.7 Event-Driven Investment Opportunities
The models so far have assumed that one has time to elicit an investor's preferences and to search for an appropriate and suitable investment strategy.

This section is an (almost) verbatim transcription of Mahringer et al. (2015).


This section considers a different setup: markets are disrupted unpredictably by certain events. These events have different causes - macroeconomic developments, policy interventions, breakdowns of investment strategies, or firm-specific events (such as, for example, Lehman Brothers). While some events are isolated and affect only a single corporate, events at the political or market level often lead to more interesting investment opportunities for structured products. Policy interventions can trigger market reactions that in turn can lead to new policy interventions. The Swiss National Bank's announcement, in January 2015, that it would remove the euro cap and introduce negative interest rates had an effect on Swiss stock markets, EUR/CHF rates, and fixed-income markets.
Such events can impact financial markets for a short period of time (a flash crash), a medium period (the GFC), or a long time (the Japanese real-estate shock of the 1990s). For investors, forming an investment view and evaluating such events personally is easier once an event has happened and markets are under stress than it is in normal times. Once an event has occurred, an investor no longer needs to guess whether any event could happen in the future that would affect the investment. However, an investor does have to consider the possibilities that markets will return to the pre-event state, that they will settle in a new state that becomes the new normal, or that the changes in market values are just a beginning. Analyzing these possibilities is not a simple task, but it is simpler than in normal markets, where the likelihood of the occurrence of events has to be considered. It should be stressed that a general requirement for investments based on events is the fitness of all parties involved - investors, advisors, and the issuer. In order to benefit from such investments, the active involvement of all parties is necessary.
If an event occurs, the time-to-market to generate investment solutions and the time-to-market for investors to make an investment decision are central. If either of these is too long, one misses the investment opportunity.

4.7.1 Structured Products
The wrappers of such solutions are no longer funds or ETFs - it takes too long to construct them. The wrappers used are derivatives and structured products. Both are manufactured by trading units or derivative firms - that is to say, not by traditional asset management firms. Table 4.13 compares mutual funds with structured products.

4.7.2 Political Events: Swiss National Bank (SNB) and ECB
The SNB announced, on 15 January 2015, the removal of the euro cap and the introduction of negative CHF short-term interest rates. This decision caused the SMI to lose about 15 percent of its value within 1-2 days, and the EUR/CHF FX rate dropped from 1.2 to near parity. Similar changes occurred for USD/CHF.

Mutual funds | Structured products
Mass products | Tailor-made, starting from CHF 20,000
No issuer risk | Issuer risk (but COSI, TCM)
Long time-to-market | Short time-to-market
Performance promise | Payment promise
Large setup costs | Low setup costs
Liquid and illiquid assets | Liquid assets
Strong legal setup, standards, market access | No legally binding definition of structured products
| High-quality secondary markets
| On balance sheet

Table 4.13: Mutual funds vs structured products.

Swiss stocks of export-oriented companies or companies with a high cost base in Swiss francs were most affected. The drop in stock prices led to a sudden and large increase in Swiss stock market volatility. Swiss interest rates became negative for maturities of up to thirteen years.
It was also known at the time that the ECB would make public its stance on quantitative easing (QE) one week later. The market participants' consensus was that Mario Draghi - president of the ECB - would announce a QE program. The surprising events in Switzerland and the subsequently announced ECB QE measures paved the way for the following investment opportunities:
1. A Swiss investor could invest in high quality or high dividend paying EUR shares at
a discount of 15 percent. EUR shares were expected to rise due to the forthcoming
ECB announcement.
2. All Swiss stocks faced heavy losses, independent of their market capitalization and of their exposure to the Swiss franc.
3. The increase in volatility made BRCs with very low barriers feasible.
4. The strengthening of the Swiss franc versus the US dollar, and the negative CHF
interest rates, led to a USD/CHF FX swap opportunity that only qualified investors
could benefit from.
5. The negative interest rates in CHF and rates of almost zero in the eurozone made investments in newly issued bonds very unattractive. Conversely, the low credit risk of corporates brought about by the ECB's decision offered opportunities to invest in the credit risk premia of large European corporates via structured products.
Before certain investment opportunities are discussed in more detail, it should be noted that by the time this paper was written (about five months after the events described above took place), all investments were profitable and some even had two-digit returns. This certainly does not mean that the investments were risk free - they were not. But it shows that many investment opportunities are created by policy interventions. This contrasts with the often-voiced complaints about negative interest rates and the absence of investment opportunities for firms, pension funds, and even private investors. Some investment ideas will now be considered in more detail.

4.7.3 Opportunities to Invest in High Dividend Paying EU Stocks
The idea was to buy such stocks at a discount resulting from the gain in value of the Swiss franc against the euro. The first issuer offered such tracker products on Monday, 19 January 2015 - that is to say, two business days after the SNB's decision was announced. With all products, investors participated in the performance of a basket of European shares with a high dividend forecast. The basket constituents were selected following suggestions from the issuing banks' research units. Investors could choose between a structured product denominated in Swiss francs or in euros, depending on their willingness to face - besides the market risk of the stock basket - the EUR/CHF FX risk.
This investment had two main risk sources: if it was denominated in euros, the EUR/CHF risk, and in any case the market risk of the large European companies whose shares comprised the basket. Most investors classified the FX risk as acceptable since a significant further strengthening of the Swiss franc against the euro would meet with counter-measures from the SNB. More specifically, a tracker on a basket of fourteen European stocks was issued. The issuance price was fixed at EUR 98.75. As of 1 April 2015 the product was trading at EUR 111.10 (mid-price) - equivalent to a performance of 12.51 percent pro rata. Similar products were launched by all the large issuers.
Other issuers launched a tracker on Swiss stocks, putting into a basket all large Swiss stocks that had only little exposure to the Swiss franc but nevertheless faced a heavy price correction after the SNB announcement in January. Again, the input of each issuing bank's research unit in identifying these firms was key. The underlying investment idea can be seen as a typical application of behavioral finance: an overreaction of market participants to events is expected to vanish over time.
The risk in this investment was twofold. First, one could not know with certainty whether the SNB would take further measures, such as lowering interest rates further, which would have led to a second drop in the value of Swiss equity shares. Second, international investors with euros or US dollars as their reference currency could realize profits, since the drop in Swiss share values - around 15 percent - was more than offset by the roughly 20 percent currency gain; roughly, an institutional investor could earn 5 percent by selling Swiss stocks. Since large investors exploit such opportunities rapidly, it became clear three days after the SNB's decision was announced that the avalanche of selling orders from international investors was over.

4.7.4 Low-Barrier BRCs
Investors and private bankers searched for cash alternatives with a 100 percent capital guarantee. The negative CHF interest rates made this impossible: since 1 Swiss franc invested today returns less than 1 Swiss franc tomorrow, one has to invest more than 100 percent today to get a 100 percent capital guarantee in the future.
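A one-line computation makes this concrete; the rate and maturity below are illustrative assumptions.

rate, years = -0.0075, 2          # e.g. a negative CHF rate after January 2015
pv_of_guarantee = 100 / (1 + rate) ** years
print(f"cost today of guaranteeing 100 in {years} years: {pv_of_guarantee:.2f}")
# about 101.52, i.e. more than the 100 the investor hands over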
Low-barrier BRCs (barrier reverse convertibles) - say, with a barrier at 39 percent - could be issued with a coupon of 1 to 2 percent, depending on the issuer's creditworthiness and risk appetite, for a maturity of one to two years. The S&P 500, Euro Stoxx 50, SMI, Nikkei 225, and other broadly diversified stock indices were used in combination as underlying values for the BRCs. The low fixed coupon of 1-2 percent takes into account that the product is considered a cash alternative, competing with a zero percent or even negative return.
Investors therefore received the coupon payment at maturity in any case, and also 100 percent of the investment back if no equity index lost more than 61 percent during the life-span of the product. If at least one index lost more than 61 percent, the investor received the worst-performing index at maturity, together with the coupon.
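The redemption logic can be summarized in a short Python sketch; the 1.5 percent coupon used in the default arguments is an illustrative value within the 1-2 percent range mentioned above.

def brc_redemption(perf_worst_index: float, barrier_breached: bool,
                   coupon: float = 0.015, nominal: float = 100.0) -> float:
    """Redemption per 100 nominal at maturity of a worst-of low-barrier BRC."""
    if not barrier_breached:
        return nominal + coupon * nominal        # full nominal plus the coupon
    # barrier hit: investor receives the worst index performance plus coupon
    return nominal * (1 + perf_worst_index) + coupon * nominal

print(brc_redemption(-0.20, barrier_breached=False))  # 101.5
print(brc_redemption(-0.70, barrier_breached=True))   # 31.5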
The risks of such an investment differ clearly from those of a deposit. For a deposit in Switzerland, there is a deposit guarantee of up to CHF 100,000. Furthermore, almost all banks in Switzerland did not pass the negative interest rate costs on to their clients. Hence, in this period a deposit was seen by many customers as less risky, albeit with zero performance before costs.
A low-barrier BRC, apart from issuer risk, has market risk. Can one estimate the probability that one of the indices in a basket will lose more than 61 percent in one year? One could simulate the basket and simply count the frequency of events leading to a breach. Such a simulation has the drawback that one needs to assume parameters for the indices. Another method is to consider the historical lowest level of such a basket - that is to say, the maximum loss in the past for an investment in a low-barrier BRC. Using data going back to the initiation of the indices, no index lost more than 60 percent in one year. This was the rationale for setting the barrier at 39 percent. This is obviously no guarantee that the statement will also apply in the future, but it helps investors to decide whether they accept the risk or not. Although this discussion has concerned a BRC on equity, a similar discussion applies to such convertibles with currencies and commodities as underlyings. Relevant political and market events in the recent past - to which the above discussion also applies - occurred in October 2014 and, due to the European debt crisis, in August 2011. With regard to the former set of events, the pressure on equity markets was due to uncertainty regarding Russia and what would happen next in Ukraine; and on 15 October 2014 liquidity evaporated in treasury futures and prices skyrocketed - an event known as the flash crash in the treasury market.

4.7.5 Japan: Abenomics
As expected, the Liberal Democratic Party of Japan gained a substantial parliamentary majority in the 2012 elections. The economic program introduced by the newly elected PM Shinzo Abe was built on three pillars: 1) fiscal stimulus, 2) monetary easing, and 3) structural reforms ("Abenomics"). Subsequently, the yen (JPY) plunged against its main trading currencies, providing a hefty stimulus to the Japanese export industry. One issuer offered an outperformance structured product on the Nikkei 225 in quanto Australian dollars, meaning that the structured product in question is denominated in AUD and not in JPY, which would be the natural currency given the underlying Nikkei 225. Investors therefore did not face JPY/AUD currency risk, but Swiss investors, who think in Swiss francs, still faced AUD/CHF risk. The term quanto stands for "quantity adjusting option".
Outperformance certificates enable investors to participate disproportionately in price advances of the underlying instrument if it trades above a specified threshold value. Below the threshold value, the performance of the structured product is the same as that of the underlying. How can investors invest in an index in such a way as to gain more when markets rise than with a single index investment, but still not lose more if the index drops? The issuer uses the anticipated dividends of the stocks in the index to buy call options. These options create the leveraged position on the upside (see Figure 4.22).

Figure 4.22: Payoff of an outperformance structured product.
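The payoff in Figure 4.22 can be written as a short Python function. The 130 percent participation matches the quanto AUD product discussed in the text; the strike of 100 is an illustrative normalization to the initial index level.

def outperformance_payoff(index_level: float, strike: float = 100.0,
                          participation: float = 1.30) -> float:
    """Payoff of an outperformance certificate per 100 of initial index level."""
    if index_level <= strike:
        return index_level                       # behaves like the index itself
    return strike + participation * (index_level - strike)

for level in (70, 100, 150):
    print(level, outperformance_payoff(level))   # 70, 100, 165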


The reason for using quanto AUD is the higher AUD interest rates compared to JPY interest rates. Higher interest rates lead to higher participation, and the participation in the quanto product was 130 percent. The risk of the investment lay in whether Abenomics would work as expected, and possibly in the FX rate AUD/CHF. The economic program in Japan worked out well and the redemption rate lay at 198 percent after two years. This redemption includes a loss of 16.35 percent due to the weakness of the Australian dollar against the Swiss franc.

4.7.6 Market Events
The focus here will be on the credit risk of structured products. Although the examples are presented under the heading of market events, the status of the market in the most recent GFC and in 2014/2015 was the result of a complicated chain of business activities, policy interventions, and market participants' reactions. The discussion below shows that structured products with underlying credit risk offer, under specific circumstances, valuable investment opportunities to some investors. But the number of such products issued is much smaller than the number of equity products. One reason for this is that not all issuers are equally experienced or satisfy the requirements for issuing credit-risky structured products (the necessary FI trading desk, balance sheet, and risk capital constraints). Another reason is the lack of acceptance of such products among investors, regulators, portfolio managers, and relationship managers, who often do not have the same level of experience and know-how as they have regarding equity products.

4.7.7 Negative Credit Basis after the Most Recent GFC
The credit basis measures the price difference of the same credit risk in different markets: once in the derivatives market and once in the bond market. Theoretically, one would expect that the credit risk of, say, ABB has the same value independent of whether an ABB bond or a credit derivative written on ABB's credit risk is being considered. This is indeed true if markets are not under stress - the credit basis is then close to zero. But if liquidity is an issue, the basis becomes either negative or positive. In the most recent GFC, liquidity was a scarce resource. The basis became negative since investing in bonds requires funding the notional, while for credit derivatives only the option premium needs to be financed. For large corporates, the basis became strongly negative, by up to 7 percent. Table 4.14 shows how the (mostly positive) basis in May 2003 changed to a strongly negative one in November 2008.
To invest in a negative basis product, the issuer of a structured product locks in the negative basis for the investor by forming a portfolio of bonds and credit derivatives of those firms with a negative basis. For each day on which the negative basis exists, a cash flow follows, which defines the participation of the investor. When the negative basis vanishes, the product is terminated.

Corporate | Credit basis in May 2003 (bps) | Credit basis in November 2008 (bps)
Merrill Lynch | 47 | -217
General Motors | -32 | -504
IBM | 22 | -64
J.P. Morgan Chase | 22 | -150

Table 4.14: Credit basis for a sample of corporates in 2003 and their negative basis in the most recent GFC.

Example
Investing in the negative credit basis of General Motors (see Table 4.14) leads to a
return, on an annual basis, of 5.04 percent if the basis remains constant for one year.
If the product has a leverage of 3, the gross return is 15.12 percent. To obtain the net
return, one has to deduct the financing costs of the leverage.
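The example's arithmetic in a few lines of Python; the 2 percent funding rate on the borrowed part is an illustrative assumption, since the text leaves the financing costs unspecified.

basis_bps = -504                               # General Motors, November 2008
leverage = 3
gross_unlevered = abs(basis_bps) / 10_000      # 5.04% p.a. if the basis is constant
gross_levered = leverage * gross_unlevered     # 15.12% p.a. with threefold leverage
funding_cost = 0.02 * (leverage - 1)           # assumed 2% on the borrowed part
net_return = gross_levered - funding_cost
print(f"{gross_unlevered:.2%}  {gross_levered:.2%}  {net_return:.2%}")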

Structured products with this idea in mind were offered in spring 2009 to qualified investors. The products offered an annual fixed coupon of around 12 percent plus participation in the negative basis. The high coupons were possible because some issuers leveraged the investors' capital. This could only be offered by the few issuers that were cash rich in the most recent GFC - typically AAA-rated banks. The products paid one coupon and were then terminated after 14 months, since the negative basis had approached its normal value. The product value led to a performance of around 70 percent for the 14-month investment period. Was this formidable performance, ex ante, a free lunch - that is to say, a risk-less investment? No. If the financial system had fallen apart, investors would have lost all the invested capital. But the investors basically only needed to answer the following question: will the financial system and the real economy return to normality? If yes, the investment reduced to the AAA issuer risk of the structured product.
Many lessons can be drawn from these products. A very turbulent time for markets can offer extraordinary investment opportunities. The valuation of these opportunities by investors must follow different patterns than in times of normal markets: there is, for example, no history and no extensive back-testing, and hence no possibility of calculating any risk and return figures. But there is a lot of uncertainty. Making an investment decision when uncertainty is the main market characteristic is an entirely different proposition from doing so when markets are normal and the usual risk machinery can be used to support decision-making with a range of forward-looking risk and return figures. If uncertainty matters, investors who are cold-blooded, courageous, or gamblers, and analytically strong, will invest, while others will prefer to keep their money in a safe haven.

4.7.8 Positive Credit Basis 2014
The monetary interventions of the ECB and other central banks led to excess liquidity, which was mirrored in a positive basis for several large firms. Monetary policy also implied low or even negative interest rates, which made investment in newly issued bonds unattractive. To summarize, investors were searching for an alternative to their bond investments - but an alternative similar to a bond.
A credit linked note (CLN) is a structured product whose payoff profile corresponds to a bond's payoff in many respects. A CLN pays - similarly to a bond - a regular coupon. The size of the coupon and the amount of the nominal value repaid at maturity both depend on the creditworthiness of a third party, the so-called reference entity (the issuer of the comparable bond). This, too, is similar to the situation for bonds. But the size of the CLN coupon derives from the credit derivative markets. Hence, if the credit basis is positive, a larger CLN coupon follows compared to the bond coupon of the same reference entity. CLNs are typically more liquid than their corresponding bonds, since credit derivative markets are liquid while many bonds, even from large corporates, often suffer from illiquidity. CLNs are flexible in the design of interest payments, maturities, and currencies. CLNs also possess tax advantages compared to bonds; in fact, the after-tax return of bonds bought at a price above 100 percent is, in this negative interest rate environment, often negative. The investor in a CLN faces two sources of credit risk: the reference entity risk, as for bonds, and the issuer risk of the structured product.
As an example, Glencore issued a new 1.25 percent coupon bond in Swiss francs; due to the positive basis, the coupon of the corresponding CLN was 1.70 percent. Another product, with Arcelor Mittal as the reference entity, implied a CLN effective yield in EUR 1.02 percent higher than that of the bond.
Let us consider a more detailed example with the reference entity Citigroup Inc. The bond in CHF matures in April 2021; its price is 102.5 with a coupon of 2.75 percent. The bond spread is 57 bps, which leads to a yield to maturity of -0.18 percent - an investor should sell the bond. The CLN has a spread of 75 bps, which reflects the positive basis, and an issuance price of 100. The coupon of the CLN is then 0.71 percent, which leads to a yield to maturity of 0.57 percent after funding is subtracted. Therefore, selling the bond and buying the CLN generates an additional return of 75 bps.
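The yield pickup follows directly from the two yields; a minimal Python check using the numbers from the example:

bond_ytm = -0.0018   # bond trades at 102.5, coupon 2.75%, spread 57 bps
cln_ytm = 0.0057     # CLN issued at 100, spread 75 bps, net of funding
pickup_bps = (cln_ytm - bond_ytm) * 10_000
print(f"yield pickup from switching bond -> CLN: {pickup_bps:.0f} bps")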

4.8 The Investment Process and Technology
Asset management is more than just investment theory. Roughly, knowing an investment strategy does not tell us which investors the strategy matches today and at future dates; it does not set up the machinery that shows how the strategy can be implemented and managed efficiently for many investors; and it does not tell us how to export our AM capacity to other cultures and jurisdictions. These questions - regarding the appropriateness and suitability of the investment process, the linking of asset management strategies to investment product solution providers, and the definition of an infrastructure that links investors, the investment process, and investment products in a scalable, compliant, and investor-friendly form - define the global value chain of AM.
The investment process is part of the AM value chain. The chain has two layers: the business layer and the infrastructure layer.
Business layer. The business layer has the following main functions (see Figure 4.23):
The front office.
The middle office.
The back office.
Product management.
Solution providers.
The front office consists of the distribution channel and the investment process. In this part of the chain, the investor's preferences, risk capacity, and the type of investment delegation (execution-only, mandate, or advisory) are defined. All communication with end clients is made via this channel - new solutions, performance, risk reporting, etc. The investment process, headed by the CIO, starts with the investment view applied to the admissible investment universe. The view is then implemented by portfolio or asset managers in portfolios, where different procedures can be followed. More precisely, the investment process has the following sub-processes for mandate clients:
Investment view by the CIO.
Tactical asset allocation (TAA) construction.
Implementation of the TAA by asset managers.
Matching of the eligible client portfolio to the implemented portfolios.
The middle office is responsible for reporting and for controlling the client portfolio with respect to suitability, appropriateness, performance, and risk. The middle office also constructs the eligible client portfolio. The back office is responsible for the execution and settlement of the assets, which follows from the matching of eligible client portfolios to the implemented portfolios.
Product management defines, for the investor, a suitable and appropriate offering. Product management is also responsible for overall governance, such as market access and regulatory requirements. The product management strategy tries to understand where the market is headed, how this compares with current products, the client segments served, and the firm's capabilities, and how competitors price their services in different channels. Product managers anticipate the people, process, and technology requirements for the product, assess gaps versus current capabilities, and guide the remediation of these gaps, all in a time frame that does not negatively impact the planned timing of the product launch. A main function is the new-product-approval (NPA) process office. The NPA component guarantees both an optimal time-to-market and an effective implementation of new products. Product management also oversees out- or insourcing opportunities in the business value chain. Facebook could, for example, provide distribution services.
Solution providers deliver the building blocks for implementing the portfolios. Such building blocks include funds, cash products (equities or bonds), and derivatives.

4.8.1 Infrastructure Layer
The infrastructure layer develops, maintains, and optimizes the IT infrastructure for the several functions of the business layer. The technology officer oversees developments in technology and data management and considers out- or insourcing opportunities along the infrastructure value chain.

Figure 4.23: Structure of the AM value chain.


To deal with the significant changes facing the industry, many leading companies are looking at their businesses and operations anew, taking something of a blank-sheet-of-paper view of the world. Many have outsourced important parts of their back offices (NAV calculations, onboarding, investor statements, etc.), largely as a reaction to investor pressure following the scandals; see Section 2.5.4. According to PwC's recently released Alternative Administration Survey, 75 percent of alternative fund managers currently outsource some portion of their back office to administrators, and 90 percent of hedge funds do so.
While the initial experience has been mixed in many respects, it has helped managers identify those things that they do well and those that others could do better for them. The recent regulatory demands have brought a significant shift in how managers think about their ability to continue doing things in the same old way - that is, by throwing people at problems. Instead they ask: do we invest in internal capabilities and technology to create a more agile organization that creates investment or service value, or do we outsource parts of the value chain?

4.8.2 Asset Management Challenges
We summarize the different challenges for AM. Regulators are turning their attention to asset managers; therefore, the cost of compliance will not fall. Regulation places greater demands on asset managers and on the different asset management functions, such as information disclosure, distribution channels, risk management, and the asset management products. Some main facts and responses are:
The costs incurred when building up distribution networks and product manufacturing capabilities in the new world will continue to increase.
Fees will remain under pressure due to greater transparency and comparability. Profits are 15-20 percent below their pre-crisis (GFC) highs.
The global battle for economies of scale continues.
Technology is key for disclosure issues - say, making the risk and return of a portfolio transparent pre-trade and post-trade, or managing tax compliance for customers on platforms spanning many different tax jurisdictions.
Intermediaries in the AM value chain living on commissions will be replaced by new distributors - technological ones, for example.
Anti-tax-evasion and anti-money-laundering measures are driven by the OECD.
After the Base Erosion and Profit Shifting (BEPS) report of 2013, asset managers operate in a world with country specific reporting of profits and tax paid. Therefore, offshore
financial centers try to have access to double tax treaties (DTT) which motivates asset
managers to use cross-border passports and reciprocities. But it also forces asset managers to decide in which location they want to be active and where they want to step back.
Finally, taxation needs to comply with the local jurisdiction and the jurisdiction
where the investors reside. The formation of four regional blocs in AM - South Asia,
North Asia, South Asia, Latin America, and Europe - creates opportunities, costs, and
risk. These blocks develop regulatory and trade linkages with each other based on reciprocity - AM firms can distribute their products in other blocs. The US, given the actual

360

CHAPTER 4.

GLOBAL ASSET MANAGEMENT

trends, will stay apart since it prefers to adhere to its regulatory model. But integration
will not only increase between these blocs but also within blocs. There will be, for example, a strong regulatory integration inside the South Asia bloc. The ASEAN platform
between Singapore, Thailand, and Malaysia will be extended to include Indonesia, the
Philippines, and Vietnam. All these countries possess a large wealthy, middle-class of
potential AM service investors. The global structure UCITS continues to gain attraction
worldwide and reciprocity between emerging markets and Europe will be based on the
European AIFMD model for alternative funds. By 2013, more than 70 memoranda of
understanding for AIFMD had been signed.
The traditional AM hubs London, New York and Frankfurt will continue to dominate
the AM industry. But new center will emerge due to the global shift in asset holdings.
There will be a balance between global and local platforms. Whether or not a global or
local platform is pushed depends on many factors: Time-to-market, regulatory and tax
complexity, behavior and social norms in jurisdiction and the eduction level matter.
AM firms recruit local teams in the key emerging markets - the people factor. The education of these local individuals started originally in the global centers but will diffuse
more and more to the new centers in the emerging markets.
Demography, this means the presence of the baby-boomer generation will lead to a
phase of fully tailor-made asset decumulation since retirement lifestyles are fully individual as discussed in Section 2.2.2.
Due to the positive brand identities that tech firms have, they can integrate part
of the business layer into their infrastructure layer and offer AM services under tech
firm brands instead of more traditional banking or AM company brands (Branding
reversal). Finally, alternatives asset managers on one hand side offer new products
- asset managers move in the space banks left vacated - and on the other hand side try
that their alternative funds become mainstream. New products include primary lending,
secondary debt market trading, primary securitizations, and off-balance-sheet financing.

4.9

Trends - FinTech

FinTech defines the possible technological instruments to meet the challenges of the
financial industry. We mentioned in Section 2.5.2 the raise of FinTech investments in the
period 2010-2014 up from USD 1.4 billion in to USD 9.1 billion. The survey of McKinsey
(2015) for the sample of more than 120 000 FinTech startups states:
Target clients: 62% of the startups target private customers, 28% SMEs and the
rest large enterprises.
Function: Most startups work in the area of payment services (43%) followed by
loans (24%), investments (18%) an deposits (15%).
The Future of Financial Services (2015) (FFS) paper, written under the leadership of
the World Economic Forum (WEF) a identified 11 clusters of innovation in six functions
of financial services, see Figure 4.24.

4.9. TRENDS - FINTECH

361

Figure 4.24: The six functions (payments, market provisioning, investment management,
capital raising, deposits and lending, and insurance) and the 11 innovation clusters (new
market platforms, smarter & faster machines, cashless world, emerging payments rails, insurance disaggregation, connected insurance, alternative lending, shifting customer preferences, crowd funding, process externalization, empowered investors) (The Future of
Financial Services [2015]).
The approach of considering six independent intermediary functions and identifying
within these functions the eleven clusters is a silo business view. Leaving aside the six
functions the clusters can be grouped into six themes that cut across traditional functions:
Streamlined Infrastructure: Emerging platforms and decentralised technologies provide new ways to aggregate and analyse information, improving connectivity and
reducing the marginal costs of accessing information and participating in financial
activities.
Automation of High-Value Activities: Many emerging innovations leverage ad-

362

CHAPTER 4.

GLOBAL ASSET MANAGEMENT

vanced algorithms and computing power to automate activities that were once highly
manual, allowing them to offer cheaper, faster, and more scalable alternative products and services.
Reduced Intermediation: Emerging innovations are streamlining or eliminating traditional institutions role as intermediaries, and offering lower prices and / or
higher returns to customers.
The Strategic Role of Data: Emerging innovations allow financial institutions to
access new data sets, such as social data, that enable new ways of understanding
customers and markets.
Niche, Specialised Products: New entrants with deep specialisations are creating
highly targeted products and services, increasing competition in these areas and
creating pressure for the traditional end-to-end financial services model to unbundle.
Customer Empowerment: Emerging innovations give customers access to previously
restricted assets and services, more visibility into products, and control over choices,
as well as the tools to become prosumers.

4.9.1

Generic Basis of FinTech

The discussion reveals that FinTech affects most financial intermediary value chains in
many different forms. Are there some generic elements which constitute a basis for this
variety of FinTech results?
The first one, the topology basis (see Figure 4.25), allows us to represent the dependency structure between different interacting agents. Consider a number N and a
number M of agents which interact. The agents can be traders, portfolio managers,
banks, regulators or end customers of the financial industry. Traditionally, the N and M
agents interact directly. FinTech then defines an interface between the N and M agents.
Such an interface or platform node essentially introduces a star-shaped topology in
the business connection between the agents. It can improve some shortcomings of the
direct traditional N M link situation.
Complexity reduction. Each agent N has to enter into M business links leading
to a total of N M links. With a platform, each agent has only one link whereas
the platform manages N M links. This reduces the complexity of all agents and
transfers this to a professional platform.
Business access. Suppose that some agents N are not able to have a direct link to a
partner M. This can be due to the size of M or N, one being too small for example,
or the costs to build a link are too expensive. Then a platform can offer such links
if it allows reducing unit link costs for example by exploiting the economics of scale.
Crowd funding is an extreme example. Without a platform no customer of the type
M (searching funding) can meet a type N (searching investing).

4.9. TRENDS - FINTECH

363

Quality improvement and efficiency. The information flow between the agents becomes more complex for many financial intermediary activities due to regulatory
requirements, increasing customer expectations or the integration of different data
sources. The need for example to integrate more refined customer suitability and
appropriateness data, to integrate market access compliance in the different jurisdictions boost costs for cross border asset management. A specialized platform can
reduce both costs and improve the quality of the services if it specializes on the
integration of market, customer, product and regulatory data sets.

Figure 4.25: Upper left panel: Topology if there are many bilateral links. Lower left
panel: Star-shaped topology if there is platform between the M and N clients. Upper
right panel: Traditional data processing and investment decisions if only a fraction of
the customer and market data are used (denoted by the red slices). Lower right panel:
Integration of all available data into a big data set (data model function) and subsequent
analysis and predictions for the customers.
A second building block of FinTEch acts on the data integration. Instead of considering data X and data Y separately, FinTech considers the product of X and Y . This
data integration defines a key step in big data, see Section 4.11.
Figure 4.25 illustrates the use of this basis for the process of investment advice.
The upper panel shows the traditional information flow for investment. In this business
model, first, only a part of all customers available information is used to define his or
her investment preferences. The data are collected by using questionnaires or online
applications. The data are updated with a low frequency. Second, only a part of market
data is used to produce investment products and solution.

364

CHAPTER 4.

GLOBAL ASSET MANAGEMENT

The above client profile is then matched by an analytics engine to the possible investment solutions leading to the portfolio. This approach is so far mis-specified in many
respects. First, the customers preference data do not reflect customers emotions if say
market and possibly the portfolios are under stress. Second, the missing data might be
the most important ones to value a customers attitude towards profits, risks and losses.
Third, the method to collect customers preference data in a stylized environment is an
uncontrollable source of model risk. Fourth, the use of financial market data to construct
the different solutions and products is based on a few key figures such as expected or past
return, volatilities and correlations but there is no integration of the full information set
of the markets.
An approach which is half-way towards a FinTech solution is to integrate scenarios
for the market data. Then a customer can observe the reaction of the proposed portfolio
if some market parameters are shocked. Such scenarios introduce an important game
changer in the investment advice process: Customers start to explore the portfolios
and they face emotions due to the impact of the different scenarios on the portfolios
performance.
A proper FinTech solution integrates all meaningful data of different data sources.
This is not the same as the integration of all possible data, see Section Big Data. On
this comprehensive basis, analytics, market scenario generation, customer preferences
variations and new methods such as learning algorithms for forecasting can apply. These
two steps - data integration model and forecasting - can be used as a definition of big data.
The third basis element is information flow capacity. FinTech does not always define
a new business topology and therefore a new business model but can act on an existing
model by increasing the capacity of information flow such as an increase of calculation
power. This can lead to a next generation of algorithmic trading or investment strategies, it can allow defining a quantitative talent management system or it makes the robo
advisor possible.
Summarizing, FinTech can be seen as a composition of three operations:
Topology, this means the definition of the link-architecture for the information flow.
Data integrations, this means the construction of a comprehensive data base and
forecasting functionalities.
Capacity, this defines the power of the links and nodes to process information.
Regarding data, two pain points are the need to collect data from multiple sources
for certain assets and the requirement to process the disparate formats required. One
benefit of external providers is the existence of services that automatically aggregate the
data from multiple sources. This improves efficiency, allows for better analytics, and
reduces operational risks. The platforms that provide these services remove differences

4.9. TRENDS - FINTECH

365

in competitive strength between large and smaller institutions, allowing the latter access
to more comprehensive market data and information.
An example of such a platform provider is Novus. Today, almost 100 of the worlds top
investment managers and investors - managing a combined total of approximately USD
2 trillion - are using Novus platform. At its essence, Novus is a platform via which the
industrys top investors can collectively innovate. Novus aggregates funds performance
and position data. This defines a single point of access for asset managers. Using Novuss
automated platform, almost all worldwide funds and their performance are catalogued
and analyzed based on an automated collection of regulatory reporting data.

Figure 4.26: Platforms that enter the market as new intermediaries between small and
large investors (The Future of Financial Services [2015]).

4.9.2

Investment Management

FFS identified two investment management function: Empowered investors and process
externalization. Both, the empowered investors cluster and externalization can occur
disruptive or voluntary for the owner of the processes.
4.9.2.1

Empowered Investors

This includes automated advice, automated management, social trading, big data analysis, and retail algorithmic trading. The various insights follow from the question: How
will automated systems and social networks the business of investment management?

366

CHAPTER 4.

GLOBAL ASSET MANAGEMENT

The cluster empowered investor issue is likely to threat traditional financial intermediaries: New entrants place pressure on margins, try to take-over parts of the value chain
and intensify competition among all players. This is possible since the digitalization of
functions reduces the effectiveness of proprietary distribution of asset management products. The demand for automated wealth management and asset management tools has
several roots.
The customer trust lost since the financial crisis is only recovering slowly.
The performance of many traditional asset management solutions in recent years
(mutual funds) and the performance of advisory activities in wealth management
vary significantly for different end clients in the wealth and asset management field.
Mass affluent clients face, in the best case, a standstill regarding the services they
receive - the traditional advisory channel is becoming less and less profitable for
wealth managers. This leads to an erosion of the mass affluent sector in favor of
automated, lower-cost solutions provided by disruptors - from automated wealth
management services to social trading platforms. These disruptors have emerged to
provide low-cost, sophisticated alternatives to those services provided by traditional
wealth managers.
Empowered investors means for the AM industry:
Intuitive and affordable tools.
Exploring products and solutions via scenario analysis and what-if functionalities.
Some sophisticated investors can act as investment experts. They can sell or share
their investment expertise on social trading platforms. This diversification of minds
will attack the quasi-monopolistic position of CIO in financial institutions.
4.9.2.2

Process Externalization

The process externalization issue refers to the opportunity to gain access to new levels
of efficiency and sophistication. Mid-sized institutions can for example secure access to
sophisticated capabilities without making large infrastructure investments, which are out
of scope both from a financing and a know-how perspective. Another driver is organizational agility which will become critical to sustaining competitiveness as high-value
capabilities continue to be commoditized.
Process externalization providers are using highly flexible platforms that increase the
efficiency of an institutions business. Financial institutions have, therefore, to reconsider
both the core part of their value chain and the non-core parts that they can externalize.
Overall, externalization means that traditional financial intermediaries keep those parts
of the value chain that are more exposed to the human factor - such as analytics or
decision-making - and outsource those parts of the chain that can be automatized.

4.9. TRENDS - FINTECH

367

Many processes within financial institutions are core to their value chain. However,
process externalization providers use flexible platforms (based in the cloud) to provide
financial institutions with increased efficiency, excellence and sophistication. FFS classifies the different innovations in process externalization as follows:
Innovations enabling process externalisation are:
Advanced analytics. Using advanced computing power, algorithms and analytical
models to provide a level of sophistication for the solutions.
Cloud computing to improve connectivity with and within institutions. This allows
for simpler data sharing, lowers implementation costs. streamlines the maintenance
of processes, and enables real-time processing.
Natural language leads to more intuitive processes for end users.
Kensho models investment scenarios for decision-making fully automatized. The cost per
generated scenario are much lower than those few manually generated scenarios. Using
Kensho, institutions can shift their resources away from the management of processes
to functions with higher value for the asset managers and where the asset management
firm has comparative advantages; see Figure 4.27. Kensho threatens the ability to model
market projections and hypotheses by quants of large financial institutions by offering
next-generation tools, application, technology and data bases. Common models of the
process externalisation providers are:
Platform, real-time databases or expert systems, leverage automation for the users
and the solution providers.
As-a-service reduces infrastructure investments to a minimum level by externalization.
Capability sharing between institutions frees them to build up all possible capabilities and allows integration of different legal and technical standards.
Process externalization means for the AM industry:
AM firms use advanced technologies to externalize, consolidate, and commoditize
processes in a more efficient and sophisticated manner.
Core competencies that differentiate winning financial institutions shift from process
execution to more human factors.
External service providers give small and medium-size asset managers access to
sophisticated capabilities that were not previously attainable due to lack of scale.
By enabling small and medium-size asset managers to access top-tier processes,
barriers to entry are lowered for new players, and smaller existing players are able
to compete with large incumbents on a more level playing field.

368

CHAPTER 4.

GLOBAL ASSET MANAGEMENT

Figure 4.27: Externalization of non-core processes leads to a more homogeneous quality


level with regards to the processes of the asset management firm (The Future of Financial
Services[2015]).
Cross-border offering become profitable with well controlled conduct and regulatory
risk due to the new platforms. But it could also amplify the risks of non-compliant
activities and unclear liabilities when centralized externalization providers fail. The
automatization also increases the speed at which financial institutions implement
regulatory changes. Therefore, regulators will receive faster consistent inputs from
financial institutions.
Since more capabilities, technologies, and processes are externalized, asset management firm becomes more dependent on third parties, lose negotiating power and
continuity.
The constantly evolving regulation across geographies means an increase of compliance
resources require solutions about regulation and its changes which is consistent within
and across different jurisdictions. New entrants are able to interpret regulatory changes
and translate them into rules. Such a rules based approach is scalable and allows asset
managers responding fast to regulatory changes, see Figure 4.28.
FundApps is such a FinTech firm. It organizes regulatory information from various
sources, and delivers a cloud-based service that automates shareholding disclosure and
monitors investment restrictions across over one-hundred regulatory regimes. FundApps
partners with a global legal service provider to monitor and translate changes in relevant regulations into rules on a daily basis. If regulatory agencies partner firms such as

4.9. TRENDS - FINTECH

369

FundApps in the future, they could ensure consistent compliance across financial institutions, make dissemination of regulatory changes in disclosure regimes faster, and reduce
the compliance burden faced by the industry. FFS.

Figure 4.28: Risk and compliance platforms allow for an essential reduction of the network topology between asset managers and regulators (The Future of Financial Services
[2015]).

4.9.3

Market Provisioning

The development of smarter, faster machines and new platform types will change how
buyers and sellers trade and how they have access to information. FFS identifies the
three following areas of new market provisioning innovation.
4.9.3.1

Machine Accessible Data - Event Driven

The goal is to discover major events faster than news channels can cover them, and
to do so by using social media/sentiment analysis. This race for low latency will also
shift to access to real-life events, leveraging faster connection to and interpretation of
traditional and emerging news sources. FFS. This will lead to faster and better forecast
for investment managers and the integration of real-life events into investment strategies.
4.9.3.2

Big Data - Comprehensive

See Section 4.12.

370

CHAPTER 4.

4.9.3.3

GLOBAL ASSET MANAGEMENT

Artificial Intelligence/Machine Learning - Automated

The idea is automatize decisions based on advanced analytics using extensive data sets
and machine learning to self-correct and continuously improve investment and trading
strategies.
4.9.3.4

What Impact Will Better Connect Buyers and Sellers?

The qualitatively improved information flow between market participants across new
information/connection platforms allows the industry to optimize their decisions for their
clients. But new platforms allow also for more and better connections between buyers and
sellers by simpler access to information flow or lower search costs for potential counterparties: Smaller intermediaries can become partner of larger institutions, see Figure 4.29.

Figure 4.29: New platforms connecting individual buyers and sellers (The Future of
Financial Services [2015]).

4.9.4

Trade Execution - Algo Trading

Asset managers need to rebalance their portfolios regularly. This means buying and
selling cash products, derivatives, ETFs or other financial products. The order to trade
received by the traders can be executed in three different ways:
Using algorithmic trading via a broker.

4.9. TRENDS - FINTECH

371

Direct market access via a broker.


Sponsored market access with no broker in between.
The number of assets which have to be traded often are large. Trading large sizes without a meaningful trading strategy results in unwanted price feedbacks - high frequency
trader will fast detect that a large amount of say stocks should be sold and they will
then attack such an order which drives prices up. The asset manager wants to avoid such
scenarios. Algorithmic trading or algos are designed to lead to execution prices for large
orders which should be as close as possible to small trading size prices - a large asset
manager gets execution prices which are the same as those of a small price taker.
The design of algos faces the following challenges today. First, markets are fragmented. Some stocks can be traded a one or even two dozens of trading venues. Which
one should one consider in the algorithmic rules? Only the large ones which a deep
liquidity but the higher risk of being attacked by high frequency traders? Second, speed
is key. High frequency trader post and withdraw several thousand quotes for one stock
within one second! Speed is that important such that an arctic fibre high performance
project of length 15600 km which connects Tokyo and New York is realized to eliminate
maximum trading size restrictions. Third, flash crashes due to algorithmic and/or high
frequency trading arise regularly in different markets. This means that the flash crash
May 6, 2010, is a particular event in the depth of the crash but similar less pronounced
events happens regularly. Furthermore, to understand the causes of such crashes is difficult and possibly even impossible. The SEC aims to provide a final report about the
2010 flash crash in 2017. Fourth, algo trading and high-freuency trading are competing
against each other. We make this last point more explicit.
Theoretical and empirical suggest that a square-root function characterizes the market impact as a function of the order size. More precisely, the market impact defined
by
Market Impact = Average Price Paid Midprice at Order Arrival
(4.17)
is equal to

Market Impact = a1 Spread + a2 S

(4.18)

with the price volatility and S the order size. Therefore, if volatility is high, market
impacts from executing the algos is large too. But large market impact is likely to be detected by high frequency traders which then intervene as follows. Large market impacts
represent opportunities for the high frequency traders since they take an advantage on
very small timescales from price discrepancies which occur when market impact is high.
Algo trading consists of three steps: What is the strategy doing, when are order
placed and where are they routed in case of several available trading venues. The different strategies can be grouped as follows. The schedule based algos follow a strict
rule to execute the orders. Volume weighted average price (VWAP) algorithm and the

372

CHAPTER 4.

GLOBAL ASSET MANAGEMENT

time weighted average price (TWAP) are the most well-known examples. VWAP which
is preferred over TWAP is a benchmark used in particular for pension funds or large
mutual funds. The VWAP price equals the average share prices times the volume over
one period (typically a day) divided by total volume. Since volume is not known ex
ante, a predictor for the volume function is needed which typically is U-shaped: Higher
volumes in the opening and closing and lower volumes at lunch time. TWAP trades in
fixed time intervals an amount of the shares such that at the end of the day the total
order is executed. Since regularly trading each 5 minutes a large amount of stocks defines
an easy to detect pattern, high frequency traders are likely to attack a TWAP. A second
group of strategies are the liquidity based ones. The most important is the participation
strategy - orders are placed in fixed chosen proportion to the actual market activity.
If say 1 million shares are traded in one day a 10% participation strategy will place 1
million shares for trading. This strategy can be used to sort out temporal imbalances
between supply and demand.
The two other groups are optimization algos and auctions or customs.

4.10

Trends - Big Data

Big data: What is it and what might it mean to investment managers? Although it is a
hot topic there seems to be little agreement on what it is. At the Davos World Economic
Forum 2014 some participants claimed that big data is a new asset class. This is an unnecessary emotional exaggeration that also fails to explain what big data is. The sources
in this section are Lin (2015), Roncalli (2014), McKinsey Global Institute (2011, 2013),
Varian (2013), Hastie et al. (2009), Harvey et al. (2014), Novy-Marx (2014), Bruder
et al. (2011), Freire (2015), Fastrich et al. (2015), Zou (2006), DeMiguel et al. (2009),
BARC (2015), and Belloni et al. (2012).

4.10.1

Definitions

McKinsey Global Institute (2011) gives the following definition.


Definition 4.10.1. Big data refers to data sets whose size is beyond the ability of typical
database software tools to capture, store, manage, and analyze.
Big data is about combining data which can be from many sources and unstructured.
The techniques in IT such as high-performance computing, data mining and analytics,
machine learning exist since many years and one might ask why big data has become
a hot topic. First, the amount of collected data is increasing in the last years due, for
example, to the use of social networks and increasing online shopping activities. Second,
while in the past code was often kept secret, the open source concepts becomes more and
more accepted which increases the development rate of new. Finally, the cost decline for
hardware turned it into a commodity. Following Lin (2015), the main big data use cases
are:

4.10. TRENDS - BIG DATA

373

Big data exploration which means to find and treat all necessary data for a better
decision making using for example visualizaton.
Extension of the customer views by integrating all possible internal and external
data sources with customer data.
Extension of IT security and intelligence service.
Analytics of operations data to reduce risks and costs.
Increase operational efficiency by integrating data warehouse capabilities and big
data.
Visualization of data in particular network-type data is a key input for business decision
making. Consider for example a table which displays a linear regression of asset returns
- the value of a one-dimensional chart visualization is obvious. Consider for example a
firm with 200 000 employees and many different connections between the employees such
as connections from projects, email traffic, blogs, hierarchy for example. To visualize the
network with 200 000 nodes and typically a multiple of different-type connections requires
powerful IT tools. Using software tools the well-known visualization of networks follow.
But what are the business relevant questions or why can it be useful to invest into
a software which can visualize the different type business connections? Questions of
relevance to the firm and the employees are:
Show me the shortest path to an employee with specific skills which I need now
in my job. Provide me with aggregated information such that I can easily check
whether the employee formally satisfies the skill requirements.
Rank the employees with the largest network betweenness, this the centrality of the
employee in the network. The more central an individual is, the better is this for
projects since the employee serves as a bridge, the higher are such employees often
valued and compensated but centrality has also drawbacks since such employees
represent bottlenecks (everybody wants my resources) and if they leave the firm
and failed to develop successors, the centrality node becomes a central hole.
Show me the strength of my links. If some links are very strong this may be normal
due to our functions but it can also be a signal that communication is not efficient
between myself and the other employee.
Illustrate the dynamics of the firm network in the last years.
There are estimates about the dollar value of the links in a firm - see BusinessWeek
(2014) for example. Figure 4.30 show the reference model for visualization and visual
analysis. The start - raw data - are first transformed into abstract data which is called
data mining. Filtering the data in possibly many dimensions, the visual form follows.

374

CHAPTER 4.

GLOBAL ASSET MANAGEMENT

Figure 4.30: Visualization and visual analysis reference model (Lin [2015]).
This form is typically not well-suited or easy enough for decision making. Therefore, in
the final step using rendering the final client need adapted view is generated.
The market for big data increased from USD 7.3 to 38.4 billions in 2015 (wikibon.org).
The revenues for the vendors of big data are split into big data hardware revenues,
- software revenues and - service revenues. Large IT firms such as IBM, HP or Dell
dominate in absolute revenue terms while the contribution to these firms total revenues
is at a low one-digit level. New firms which one-hundred percent big data revenues in
the league table are for example Palantir and Pivotal. The failure to handle data using
typical software can have different sources:
The data set is extremely large;
The data are not structured;
The analysis of the data requires specific research and decision-making;
The analysis of the data needs a high level of IT or statistical skills.
These different sources suggest that big data issues are different for different industry sectors - that is to say, there will not be a single scientific answer regarding the
management of big data.
A popular model for big data is the so-called 3V Gartner model of big data:
Volume (amount of data). These can be records, tables, files or transactions.

4.10. TRENDS - BIG DATA

375

Velocity (speed of data). There are different velocity scales such as near time, batch
or streams.
Variety (data types). Data can be structured, unstructured or in a mixture format.
An operational definition of big data defines different transformations how one reaches
from the starting raw data sets to the decision information. The challenge in big data is
to transform raw data into structured and informative data. The logic of this transformation has two steps.
First, raw data represented by X are transformed into model variables Y = f (X) using a function, f . This function can be averaging, aggregating, conditioning, or creating
new classes in the original data set X. The second step is to make a prediction, defined
by a function - g, that takes the model variables Y into the predicted variables Z = g(Y ).
In practice, the first step is often more challenging and more important than the
second. One problem in the first step is the construction of the raw data X. This means
that given structures of databases should be changed to generate X. But they are often
not flexible and data are located in different databases or are not complete. Hence, the
first step in big data is a traditional data issue: big data is equal to the sum of different
databases where the quality, flexibility, and integrity of the data define the characteristics
of the big data set. The construction of the data set X faces the following challenges:
Integration of multiple sources of data that exist in different formats;
As yet non-digital data;
Unstructured data;
Incomplete data sets.
Examples
There are many databases for hedge funds, including HFF, Morningstar, Lipper,
and BarclayHedge. A big data issue would be to construct a comprehensive data set X
by merging all these different databases. A second big data issue would be analytics on
X, i.e. f (X), where the goal is to detect alpha sources.
How much performance can be gained with such an approach? Many studies at
investment management firms have applied big data methodology search for the parts
in the investment process which could most benefit from a big data approach. These
studies found performance improvements in the range 20 to 250 basis point (bps). The
main challenges are the identification of weak parts of the processes and then, the
construction of X. This data challenge requires flexible platforms which are able to

376

CHAPTER 4.

GLOBAL ASSET MANAGEMENT

collect, aggregate, analyse and report data across many regulatory regimes, and formats
in real time.
Another example of machine learning leveraging big data methods is driven by
venture capital (VC). A Hong Kong based VC firm appointed a machine intelligence to
the BoD. The new board member continuously analyzes anything related to investment
to identify and value potential investments.

A pitfall in big data is to assume that all possible data should be collected. It is
preferable to not possess data if it is not clear how those data should be used. Simply
collecting without a clear purpose raises the risk of losses due to violations of confidentiality or to loss of strategic data.
Examples Total Information Awareness (TIA)
Using data-mining, the risk is that meaningless patterns are observed which then can
lead to decisions which do not have any true data basis support. An example of such an
objection against TIA, which Professor Ullman attempted to explain to a reporter from
the Los Angeles Times (but he failed to succeed), was the proof that it is impossible
to track single terrorists if one undertake a large data exercise where the data set is
generated at random. We follow Freire (2015).
Ullman assumes that groups of terrorists meet occasionally at random in hotels to
plot doing evil. The big data task is to find people who stayed in the same hotel at the
same day for at least two times in the last three years - this is the search task. The
following back-of-the-envelope calculations gives an impression how impossible it will be
to find the individuals which fulfill the search requirement. The first step is calculate
the number of possible suspects. There are 9 billion individuals and there are 10 000
days, each person stays for 1% of the time in the hotel and each hotel has the capacity
to host 100 individuals. As usual, such back-on-the-envelope calculations are robust
against variations the precision of the information used compared to the true values.
The probability that two individuals will meet at the same day in the same hotel is
1 1 1
1
= 9 .
5
100 100 10
10
The probability that two persons will meet twice in the same hotel at the same two days
is then 10118 . How many pairs of days are there in 10 000 possible days? The answer is the

same as how many possibilities are there to put 2 balls into 10 000 boxes: 1000
which is
2
approximatively 500 000 = 5 105 (using the factorial definition of binomial coefficients).
The possible number of pairs of people in the 9 billion individuals is then similarly equal

4.10. TRENDS - BIG DATA

377

to 1017 . Hence, the probability that two individuals will be at the same hotel on some
two days is
1
1
5 105 18 = 5 13 .
10
10
0
This implies that there 250 000 pairs of people which are expected to be suspicious:
1017

1
= 2500 000 .
1013

Given this number of possible, expected suspects and assuming that say 5 pairs of terrorists which met in the above described way, then the exercise is to find out the 5 from
the 2500 000 possible one which is impossible. The moral of the story is that one should
not consider properties where random data will then produce for sure a large number of
facts of interest. The property here is that one is interested in two individuals which
stayed at the same hotel twice. This event generates the large number of suspects if we
assume that the individuals behave in a random way.
When it comes to leading a project for the construction of the database, X, it is
important to select individuals with the right skills. In the past, statisticians would lead
data integration projects but in recent decades that role has been taken over by IT people
due to the increasing complexity and number of data concerned. But for big data, data
scientists often successfully run projects. Since they are trained both in IT or computer
sciences and in modelling as well in analytics they are best equipped to solve any possible chicken-and-egg big data problems: one needs data to test a model and one needs
a model to design the database if one does not have experience regarding the prediction
model, g, and the data transformation.
Hal Varian, an emeritus professor of the University of California at Berkeley, now
serving as Chief Economist at Google, stated, in 2009:
I keep saying the sexy job in the next ten years will be statisticians. People think Im
joking, but who wouldve guessed that computer engineers wouldve been the sexy job of
the 1990s?
The disruptive nature of big data led him to conclude, in 2013:
I believe that these methods have a lot to offer and should be more widely known and
used by economists. In fact, my standard advice to graduate students these days is go to
the computer science department and take a class in machine learning.
Example

378

CHAPTER 4.

GLOBAL ASSET MANAGEMENT

The start of Faster Payments Scheme in the UK in 2008 increased online banking
fraud losses. The defense systems not only fail to detect all attacker correctly but they
also identify false positives. There is therefore an increasing cost burden in order to
minimize this alpha and beta losses. In payment systems big data analytics used to fight
the criminals by the very nature of the services has to be real-time.

Example Penalty approaches in portfolio optimization


We discussed in Section 2.7.2.1 how the unknown parameters and C can be estimated and how estimation risk or uncertainty is considered. The literature largely
documents that sample estimates do not provide for real life implementation reliable
out-of-sample asset allocations. Michaud (1989) for example points out what he calls
the error-maximizing characteristic since the optimization overweights assets with large
estimated returns, negative correlations and small variances. But this are also the assets
where the estimation errors are most likely large. Since often estimation errors in the
expected return estimates are larger than those in the covariance matrix estimates, we
consider the minimum-variance portfolio (MMV) where only the covariance but not the
returns matters. solves
min 0 C , s.t. : e0 = 1 .

RN

(4.19)

The lower estimation risk of the covariance matrix can even result in a better
out-of-sample performance than for portfolios where the expected mean is considered.
In summary, MMV portfolio can badly perform out-of-sample and furthermore, the allocation can be unstable or extreme such that asset managers do not consider it meaningful.
Asset managers prefer to select the assets from an as large as possible asset universe
with N the number of assets. Unfortunately, this requires to estimate N (N + 1)/2
parameters for the covariance matrix with the respective estimation risk, see Section
2.8.4.11.
One particular method to reduce estimation risk specific is the so-called Least Absolute Shrinkage and Selection Operator (LASSO) approach of Tibshirani (1996). There is
empirical evidence that this approach provides higher out-of-sample performance, higher
Sharpe ratios more stable and spares portfolios. This approach is often preferred since the
optimization problem is solvable in one step and the optimization problem is still convex
- therefore any local numerically found minimum is a global minimum. The optimization
problem reads when we consider the MV approach with a return constraint:
min 0 C +

RN

N
X

|j | , s.t. : e0 = 1 , 0 r .

(4.20)

j=1

Intuitively, deviating from the zero vector is linearly punished for negative or positive

4.10. TRENDS - BIG DATA

379

deviations. In low dimensions, one superimpose to the squared function a V-type


function. Therefore, small values of which result without the additional constraint
eventually are reduced to zero. This extra gained investment part is then distributed to
the other investment components which finally results in a sparser investment vector.
There are many different variants of the LASSO approach which we do not comment
about, see Fastrich et al. (2013) for a discussion, except for the so-called adaptive
LASSO approach, see Zou (2006). To counteract some biases inherent in (4.20), Zou
proposed to vary the absolute value penalty individually where the weights follow from
an OLS estimate.
In order to compare the Lasso-type models we introduce to the interpretation of the
information matrix C 1 due to Stevens (1998). The author derives an expression for
the information matrix in the Markowitz model by applying a method from hedging in
futures markets. He consider the OLS regression of asset i return Rt,i on the return of
all other assets Rt,i except asset i plus a noise term which is normally distributed with
mean zero and variance i2 :
Rt,i = 0 + i0 Rt,i + i,t .
These assumptions allow to express the information matrix elements as ratios between
the estimated betas and the unhedgeable risk of the regression. Stevens states: (1)
the set of coefficients obtained by regressing the excess return for a given asset i on the
excess returns of all other risky assets; and (2) the residual variance for that regression,
which is equal to the nondiversifiable or unhedgeable part of each assets i variance of
return i2 (1 Ri2 ). It is of some interest to note that everything in C 1 relates to the
characteristics of the N regressions that minimize each assets residual variance, which,
for good reason, may be termed the optimal hedge regressions. Then,
i

i i0 i

i2

where we note Ri2 = 1 i2 . The better the hedge, i.e. the larger the R2 of the
i
regression, the smaller is the denominator in the optimal policy and therefore the more
weight the asset receives. But a high R2 means that the asset i is strongly correlated to
the other assets. Since this property enters into the denominator of the optimal policy
yet small variations of the dependence create strong variation in the optimal policy.
This show why strongly correlated assets are a source of instability of mean-variance
optimal portfolios. The difference in expected return in the nominator can be positive
or negative. Therefore, an investor is long in asset i if the expected return of this asset
is larger than the return of all other assets and similarly for a short position.
Bruder et. al (2013) compare the OLS-mean variance approach with the LASSOmean variance one for the S&P 100, with monthly rebalancing and data starting Jan
2000 to Dec 2011. Table 4.15 shows the results.

380

CHAPTER 4.
Method
OLS-MV
LASSO MV

Return
3.60%
5.00%

Volatility
14.39%
13.82%

GLOBAL ASSET MANAGEMENT

Sharpe Ratio
0.25
0.36

Max. Drawdown
-39.71%
-35.42%

Turnover
19.4
5.9

Table 4.15: OLS-mean variance versus LASSO-mean variance (Bruder [2011])

The LASSO approach shows a better risk adjusted performance than the traditional
one. The extreme losses are comparable in both approaches - this means that the
LASSO approach does not provides and form of a tail hedge. But the turnover is
much smaller for the LASSO approach than for the traditional one. This is a direct
consequence of the fact that the LASSO approach leads to a spare optimal investment
vector and also to a sparse information matrix. The stock Google is for example hedged
in the OLS model by 99 stocks compared to 13 stocks only in the LASSO model.
Having described the economics, what has this to do with big data? If we consider the
asset universe S&P 100, the problem is not that data need to be integrated from different
data sources but it is the dimension of the problem - analytics - which requires powerful
software tools. Take MSCI world with around 10 500 stocks. The LASSO approach
requires a numerical optimization and if the above LASSO approach is altered, where
the convexity of the problem can be lost which guarantees that a local minimum is indeed
a global one, then one has to use advanced algorithms to find the global minimum where
a hugh dimensional covariance matrix has to be numerically inverted.

4.10.2

Risk

The explosion in banking access channels and real-time payment methods in the last
years put security issues center state in FinTech and big data. This huge increase in
volume is abused by criminals using different types of malware (worms, viruses or Trojan
horses for example), phising or other methods of attack. The fight against payment fraud
is for example one of the greatest challenges for financial institutions worldwide.

Example
The start of Faster Payments Scheme in the UK in 2008 increased online banking
fraud losses. The defense systems not only fail to detect all attacker correctly but they
also identify false positives. Hence an increasing cost burden follows to minimize this
alpha and beta losses. In payment systems big data analytics used to fight the criminals
by the very nature of the services has to be real-time.

4.10. TRENDS - BIG DATA

4.10.3

381

Survey

The BARC Research Study (2015), with 431 participants, is a survey of different issues
concerning big data. The surveys questions consider the following topics:
What are the benefits for companies from their big data analyses?
Which business and technical problems and challenges do companies encounter?
How do firms finance their big data projects?
Which technologies are actually used and which will be used in the future?
The key findings of the study are as follows. Currently, only 17 percent of companies
surveyed believe that big data is not useful and will not be important and more than 40
percent of the companies surveyed have experience with big data. The main motivations
for using big data are the mastering of large amounts of data (57%), integrating different
data structures (50%), obtaining faster and better analytics (55%), and the desire to
obtain more refined estimation techniques (51%). In 2016, management is the driver for
big data projects since they expect:
Better strategic decision-making (69%) ;
Better process control (54%) ;
Better understanding of the customers needs (52%) ;
Cost savings (47%) .
Corporates that are able to quantify the benefits of big data estimate an average
increase in output of 8 percent and cost reductions of 10 percent. Big data initiatives
help to provide a comprehensive picture of customers by making their entire interaction
with the company transparent. Marketing and sales are therefore, today, the pioneers in
big data - in 15 or 23 percent of the companies in which big data is at least conceivable.
Hence, the customer is at the center.
The main problems with big data are (i) lack of knowledge of how to apply big data
analytics, (ii) data security (49%) and (iii) a lack of big data know-how (53%)- around
one-third of the companies surveyed intend to create new jobs in the big data field.
Another finding is that Europe is lagging behind the US. Around 50 percent of the US
firms in the survey are using big data already or are engaged in projects. In Europe, the
number is 16 percent. The following tools are used in the industry: Standard BI tools
(62%) and standard relational databases (53%); analytical databases (36%), predictive
analytics (31%), the Hadoop Ecosystem (17%), big data appliances (14%), and NoSQL
databases (13%).2
2

SQL [Structured Query Language] is a special-purpose programming language designed for managing
data held in a relational database management system, or for stream processing in a relational data

382

CHAPTER 4.

GLOBAL ASSET MANAGEMENT

In the financial industry, big data is part of the business processes of 20 percent of the
firms surveyed; another 22 percent have started pilot projects and the rest either have
not made any efforts with big data (45%) or think that the topic is not worth looking
into (13%).

4.11

Trends - Blockchain and Bitcoin

Blockchain are a technology and Bitcoin is a so-called crypto-currency which uses blockchain
technology.

4.11.1

Blockchain

A blockchain is a digital record keeping system - a digital ledger which is a database that
digitally tracks, records and stores information. A blockchain consists of time-ordered
chain of blocks and each blocks is defined by a set of verified transactions of ownership
rights. New transactions are grouped into a new block and after its validation - the consensus work to install unambiguous asset ownership - the block is added to the existing
blockchain. Each block is further marked with a timestamp and a digital fingerprint of
the previous block. This digital fingerprint - called a hash - identifies a block uniquely
and the verification of the fingerprint can be easily done by any node in the network.
Technological, a blockchain is a network of computers and the central ledger act as the
custodian of the transaction information.
Summarizing:
Fact 4.11.1. A blockchain or mutual distributed ledger has three characteristics. First,
ownership which originally is assumed to be a public property. Second, the technology
(distributed) which consists of a system of distributed servers. Third, the object which is
the ledger.
A blockchain mechanism ensures that ledger contains only valid transactions, that
every network user can trust that her copy of the ledger is the same as for all other users
and that the ownership rights are assigned correctly. In the case of Bitcoin, the system
should avoid the possibility that users spend 1 Bitcoin twice. We refer to Duivestein
et al. (2016), Tasca (2016), Aste (2016), Rifkin (2014), Swan (2015), Peter and Panayi
(2015), Davidson et al. (2016), UBS (2015), Nakamoto (2008), Franco (2014), Bliss and
Steigerwald (2006), Peters et al. (2014), Zyskind et al. (2015).
The blockchain consensus mechanism is different than the usual consensus mechanism
in the banking industry where trusted third parties matter such as central banks which
stream management system. Source: Wikipedia. NoSQL means non SQL or non-relational or tabular
database. The Hadoop Ecosystem provides a software framework for processing vast amounts of data in
parallel on large clusters of commodity hardware (potentially scaling to thousands of nodes) in a reliable,
fault-tolerant manner.

4.11. TRENDS - BLOCKCHAIN AND BITCOIN

383

validate contractions, central counter parties in OTC business or credit card companies.
The idea in the blockchain is often (but not always) to replace the centralized consensus
institutions by a decentralized one using the internet.
Why should one move trust in business transactions away from centralized third parties to decentralized, distributed ones? One reason is cost efficiency. A second one is
more security - a blockchain can be more secure than a trusted third-party system. A
third strategic one is to move to the so-called crypto-economy. Like money which can
be digitalized and encrypted (Bitcoins), a blockchain can in principle do the same for
all types of intangible assets such as contracts defining shares or mortgages. To which
extend blockchains will be applied in the financial industry depends on the details of
the objects under consideration. Legal restrictions could for example make a specific
application impossible.
Trust is a key concept in banking. There are different types of trust. Trust between
counter parties to fulfill the contractual obligations or the trust level in the medium of exchange - money - are two examples. For payments, blockchain technology allows to switch
from trust in central banks to trust in a network such as for the crypto-currency Bitcoins.
In OTC contracts, trust is defined between the two trading parties in any contract.
If OTCs are centrally cleared, trust is between the central counter party and one party
of the trade (star shaped topology). In a public blockchain, trust is in the network: No
bilateral trust between two acting parties is needed. It is not clear in which scenarios
such as zero-bilateral-trust scheme is per se less risky, less fragile or more efficient than
the traditional networks.
A blockchain network is crowded by strangers. How can an agent trust them? Traditionally, there are trusted intermediaries such as eBay or the strangers possess a peer
reputation such as airBnB. While the latter one satisfies the needs if we consider accommodation, the peer reputation approach is to fragile for financial contracts. Information
technology allows the use of peer-to-peer systems with blockchains one example. Aste
(2016) states:
How can we trust strangers (P2P parties) without intermediation of an authority?
With peer validation in a transparent system that keeps record of all relevant information.
The Coin of Yap problem, a problem which the population in the Yap islands faced
in Western Pacific Ocean, shows some similarities to the blockchain trust issue for the
crypto-currency Bitcoin. The Yaps produced stone money. There were five different
sizes of stones where the largest one needed around 20 men to be transported. It was
not possible to carry the stones from one island to the next one for exchange reasons

384

CHAPTER 4.

GLOBAL ASSET MANAGEMENT

using the canoes. How could one use the stones for payment if they could not be physically exchanged against the goods? The solution was to store the ownership information
in the consciousness of the Yap people (the blockchain): The Yap knew who owes the
different stone pieces. They did not need to move them when ownership changes since
the public memory redords the changes in ownership. Hence, there is a group consensus
over ownership. If there is a conflict, the stronger strain wins. Due to the limited size of
islands and population the system costs never became too high to become ineffective.
Although most of the discussion focus the blockchain technology use for virtual currencies (Bitcoin), blockchain can affect many other areas and possible even more meaningful ones than the currencies. Besides logistics and transportation, healthcare or the
energy industry as examples, the technology can have an impact in several areas of the
financial industry. For example:
Clearing and settlement.
Brokerage and financial research activities.
Correspond banking, trade finance, remittance and payments.
Trust and custody functions in asset management.
Smart contracts for automated, self-controlled management of financial contracts.
Distributed storage, authentication, anonymization of private information.
There are different blockchain types or architectures. From a distribution perspective of the ledger among the users centralized, decentralized or distributed topologies are
possible, see Figure 4.31. From an authorization perspective there are permissionless
and permissioned blockchains. In the former, anyone can participate in the verification
process whereas in the latter one, verification nodes are preselected by a central authority
or consortium. These technologies are similar to the traditional wealth and asset management setting which requires to Know Your Client (KYC). Contrary to permissionsless
networks, the actors on the network are named, the intention is that they are also legally
accountable for their activity. The transactions in such networks will be predominantly
so-called off-chain assets - fiat currencies, titles of ownership, digital representation of
securities - whereas in the permisssionless world on-chain assets such as virtual currency
are transacted. Since the number of actors is smaller in permissioned blockchains only a
small number of participants need to operate which makes such networks more scalable
than the permissionless ones but they are also less secure since collusion between the
members can lead to altering or reverting transactions. An example of a permissioned
blockchain is Ripple. Finally, from a consensus perspective, full network consensus
or a restricted form are possible. We summarize:
Different network topologies - centralized, distributed, decentralized;

4.11. TRENDS - BLOCKCHAIN AND BITCOIN

385

Different consensus mechanism - centralized, network consensus (crypto-economics),


restricted;
Different authorization mechanism - permissionless and permissioned ones;
Read access - unrestricted and restricted;
Transactional access - unrestricted and restricted.

Figure 4.31: Emergence of different network topologies (Celent [2015], UBS [2015]).
Summarizing: :
Switching from single third party trust to distributed ledger trust for transactions.
Unambiguous ownership rights in at any moment in time due to the consensus
mechanism.
Approved data in the distributed ledger cannot be changed - immutable history of
transactions exist.
Since the blockchain is independent of service providers, device manufacturers or
any type of applications it shows persistence.
The blockchain is public.

386

4.11.2

CHAPTER 4.

GLOBAL ASSET MANAGEMENT

Cryptography

The blockchain records all past information and is growing over time due to new transactions. We need consensus because anyone can create a block which is added to the
existing blocks. But we want an unique chain where we can be assured that it is free of
fraud, errors or other types of unwanted information. Consensus means a way to decide
which block we should trust.
A main part of the consenus work is called the proof-of-work puzzle or simply, the
proof-of-work, which means that each transaction is checked using algorithm: Users
are constantly asked to run cryptographic algorithms (hashing) to validate transactions.
This is one costly way to reach consensus which became popular in the Bitcoin application. Consensus is proportional to the amount of work done by the so-called miners
which do the proof-of-work. Therefore, the chain with the highest amount of work is
the correct chain. This approach is very inefficient since it requires a lot of energy and
installs the incentives for miners to centralize the hashing power which violates what
we understand under decentralized verification mechanism. But many new innovations
based on blockchain technology do not need this kind of consensus mechanism. Proofof-stake is a different form to reach consensus which is also based on algorithms. But no
mining is needed contrary to the proof-of-work but only verification. Users of the technology are asked to prove ownership over a stake which can be a currency or any other asset.
Cryptography today is the digital science which develops methods for secure communication. Since blockchain is based on communication and storage in a distributed,
public way, the security of ownership is key in this digital world which makes cryptography a central discipline. A main objective in cryptography is to take a message M ,
to encrypt it, to transfer the encrypted message to the receiver of the message which
then decrypts it. If both the sender and receiver use the same private key for encryption
and decryption, one speaks about symmetric-key cryptography. Such a system is
difficult to manage in a secure way if at the beginning no secure communication channel
exists. It also becomes complex if a large number of individuals wants to communicate
where each of them shares a different key with all possible communication partners. Both
difficulties are prominent in blockchains.
The concept of public-key or asymmetric key cryptography is therefore used.
Figure 4.32 shows the different types keys. We use the standard names of Alice and
Bob used in cryptography. The symmetric case shows that the message M = Hello is
encrypted in a sequence of letters and numbers and that both Alice and Bob use the
same private key (here secret key) for encryption and decryption. In the asymmetric
case, Alice has two keys: the public key and the private key. The public key can be
used for encryption by anyone in the network but not for decryption which requires the
private key. Therefore, security in the system depends on how secure are the private keys
stored. Diffie - Hellman propose a mechanism where each party generates a public and
a private key pair. The public key is distributed and then Alice and Bob, which both

4.11. TRENDS - BLOCKCHAIN AND BITCOIN

Symmetric-key

Asymmetric-key

387

DiffieHellman
key exchange

Figure 4.32: Symmetric key, asymmetric key and Diffie - Hellman key exchange (Source:
Wikipedia [2016]).
possess the public key of the other one, can compute a shared secret.

Example Asymmetric key idea


This example is based on Sullivan (2013) and Corbellini (2015). We consider the
RSA algorithm where the encription key is public while the key that is used to decrypt
data is kept private. Consider the example in 4.32 where Bob wants to send the message
M Hello Alice to Alice. The goal is to convert all letters into a deterministic sequence
number, then these numbers are mapped into a random-locking number (encription)
which can only be mapped back to the original sequence if the private decryption key is
used. Since computers prefer to work with not too large numbers, a maximum function
is used. The private and public key are two numbers larger than zero and smaller than
the maximum number.
To start with assume that the two prime number 13 and 7 are chosen. The maximum
number is the 91 = 7 13. The public key of Alice is the number 5. An algorithm based
on the information in the system of 91 and 5 the public key generates the private key 29
to Alice. How can this be used to convert transmit the letter C in the message Hello
Alice ? First, the letter has to be turned into a number. The UTF-8 schemes attributes
the number 67 to the letter C. Then, the letter C is multiplied 5 times - the public key

388

CHAPTER 4.

GLOBAL ASSET MANAGEMENT

- with itself. Since alreads 67 67 > 91, the calculation is done modulo the remainder.
This means,
67 67 = 4489 = 91 49 + 30 .
Therefore, the result after the first multiplication is 30. This is then again multiplied
with 67, which is larger than 91 and applying the same division as above, the result
is 8 (the remainder). This is repeated in total 5 times leading to the number 58 - the
encryption of C = 67 is 58. This is the message Alice receives. Now she uses the private
key number 29 and multiplies the 58 with itself again 29 times where we use the same
logic - after each multiplication we do the next multiplication with the remainder:
58
58}
| {z

= 67

29 times, modulo 19

which is the letter C. If you dont know the private key number 29, then you dont
know how many times you have to multiply 58 with itself in the above time consuming
way to calculate and consider in each step the remainder. Therefore a lot computer
power is then needed to try all different possible values of the private key - besides
easy part of multiplication (encription), decryption is a factoring-type problem which is
harder. Assume that you have to multiply two arbitrary prime numbers p and q - an
easy problem but consider that you know the product p q and you have to find p and
q is much a harder problem.
The above method was not considered to be ideal for future cryptography - elliptic
curve cryptography is one method which has more desirable properties, see Sullivan and
Collabrini for an introduction.
Signatures are important in transactions of assets. A signature confirms or verifies
among others
the willingness of the parties to enter a contract;
that the signing parties owe the assets for exchanges;
that the signing parties have the right to enter into the transaction.
Therefore, a content of contract is personalized by using signatures. Digital signatures
are based on public-key cryptography.
Alice wants to sign electronically a document M . That for, two keys are mathematically generated: A privately known signing key pkA , which is generated at random, and
a publicly known verification key vkA . Given the message M and the private key pkA , a
signing algorithm generates the signature DS. That is, the digital signing dsA function
maps a combination of the message (document) and the private signing key into the

4.11. TRENDS - BLOCKCHAIN AND BITCOIN

389

output - the digital signature DS - which is a short sequence of numbers:


dsA : M pkA DS .
Only the owner of the private key can generate the digital signature. Changing the message, the digital signature changes too: M M 0 implies dsA (M 0 , pkA ) 6= dsA (M, pkA )
This is particular to digital signatures and does not hold true for physical ones. Finally,
given the message, the public key and the signature, the verification algorithm leads to
acceptance or rejection of the signature.
For the proof-of-work in blockchains and Bitcoins, one has to compare fast and easily
data of arbitrary size and one has to be sure that the message which was digital signed
did not changed. Cryptographic hash functions ( algorithms) are then used.
We consider first hash function in the digital signature case. The hash algorithm or
function ] acts on the message M of any length and produces an output - the hash or
digest - of fixed length. The function is deterministic which means that for the same input always the same hash-output follows. The term cryptographic means that the hash
function needs to satisfy some criteria due to security, authentication or privacy concerns.
First, the time to compute the hash should be short for any message input. Second,
to reconstruct a message given a hash result is impossible unless one tries all possible
combinations. But the space of all possible combinations is that large, that the amount
of time needed to check all combinations is not feasible. Next, changing the message
only by a little amount of information should change the hash value that heavily, that
the new and the old hash look uncorrelated. Finally, it should be a hard problem to find
two different inputs which lead to the same output.
In the digital signature context, the function ] turns documents M of arbitrary length
into fixed length hash outputs .Then, the private key is combined with this hash:
] : M ](M ) = hash, dsA : ](M ) pkA DS .
Verification of the signature means for the three inputs message, digital signature and
public available verification key:

true,
if pkA was used to generate the signature;
vkA (dsA (](M ) pkA )) =
f alse, else.
Since Alice generated this public key, it is related to Alice. Therefore, by knowing the
public key only it is possible to relate the identity of Alice and the signature. This is
exactly what a physical signature should also provide. In the case of physical signature,
DS is replaced by the signature, hashed document ](M ) is the physical document and
the public available verification key is the knowledge distribution how the physical signature of Alice looks like.

390

CHAPTER 4.

GLOBAL ASSET MANAGEMENT

Example Bitcoin
Some nodes in the networks - the miners - take the challenge to solve the puzzle
(proof-of-work). The first one which solves the puzzle meaning the verification and
approval of the transaction (the hard computer work), communicates this to the other
miners. If more than 51% of the miners agree, the transaction is added to the blockchain
and the miner who first solved the puzzle gets paid for his or her efforts. Therefore,
to control and manipulate the blockchain, one needs to control more than 51% of the
computer power in the network.
The winning miners reward at present in the Bitcoin network are newly created
coins and/or a transaction fee if the buyer and seller decided to pay such a fee.
The hash function is used for the miners job as follows. Consider two objects - a
challenge string c and a response string or the proof-of-work string p. The miner solves
a hard problem, i.e. he derives p starting from c using cryptographic algorithm. Given
(c, p) a hash function ](c, p) is used with the outcome reads:
](c, p) = |00...00
{z } xxx....xxxx ,
40 zeros

The hash function value starts with 40 times zero as entry and then other numbers x
follow. Suppose that you want to find out p given c without using the miners technology.
Then, there are 240 possibilities for the hash function result for the first 40 numbers.
But there is only 1 element which starts with 40 zeros. This shows the hugh amount
of computation capacity which is necessary to hack the miners work. While it is hard
to find p, it is easy to verify that p is indeed a correct proof-of-work: Simply put c and
the candidate p into the hash function. If the result starts with 40 zero numbers then
the verification proved fast that p is indeed the proof. If more than 40 leading zeros are
required, then more efforts are needed by the miners. On average, it takes 10 minutes
today in the Bitcoin system to find a proof. If a miner found a proof, he will announce
it to the Bitcoin system and as shown above, it easy to verify that proof is correct.

4.11.3

Examples

Example Smart contracts


The concept of smart contracts was invented by Szabo (Szabo (1997)). The
blockchains for smart contracts do not attend to achieve consensus on data streams
(Bitcoins) but to achieve consensus on computation. An examples of a smart contract is
a bitcoin transfer between two agents which is made dependent on some other conditions

4.11. TRENDS - BLOCKCHAIN AND BITCOIN

391

which extends the possibilities of using transactions.


A different example of a smart contract are term sheets. Suppose that the asset ownership of a structured note is digitalized and made tradeable in a blockchain supported
network. Today a term sheet is a PDF document with zero intelligence in the sense that
the term sheet cannot act itself to any changing circumstances such as changing value of
the underlying or changes due to corporate actions. The term sheets becomes smart if
it has the possibility to detect changing circumstances and to self-enforce changes in the
term sheet content. This means the life cycle management of documentation of trades
will completely automatized and free of any human actions.

Example Blockchain and databases (Peters and Panayi (2015))


What are the differences and advantages of the blockchain technology compared to
traditional databases used to record financial transactions? Depending on the nature of
the data one is storing, there are different types of databases such as document databases
or relational one which are based on a set theory and which are implemented as tables.
Databases differ also in their topology where we focus on distributed databases which
are connected by a compute network which is closest to the blockchain technology.
The distributions objective is to have a better reliability and availability, to improve
performance and to make expansions easier. A user in such a network need not to
know the topology of the database network and the nodes need not all have the same
functionality. How are modifications in the databases propagated to the different nodes
which need the data? It is common that so-called master nodes are first updated which
then propagate the information to their so-called slaves. This raises possible performance
issues for the master nodes and the possibility that data are modified simultaneously at
different nodes. Blockchain technologies can avoid such possible conflicts.
Data security, confidentiality, availability and integrity are key for the functioning of
financial institutions. There are standards which has to hold independent of whether
the databases or blockchain technologies are used. For a general discussion on these
topics we refer to Peters and Panayi (2015) and references therein. We discuss briefly
the Clark-Wilson (CW) model for data integrity from a blockchain perspective. The
CW model partitions all data into two sets termed Constrained Data Items (CDIs) and
Unconstrained Data Items (UDIs). Additionally, there are subjects which can apply
transformation processes to data items to take CDIs from one valid state to another and
there are integrity validation procedures which confirm that all CDIs in a system satisfy
a specified integrity scope. The CW model specifies 6 basic rules that must be adhered
to in order to maintain integrity of a system where we add the comments regarding the
blockchain technology to fulfill the specific rules (we summarize Peters and Panayi (2015)
below).

392

CHAPTER 4.

GLOBAL ASSET MANAGEMENT

The application of a transformation process to any CDI must maintain the integrity
of the CDI and it may only be changed by a transformation process. Since any
transformation process is a transaction, and transactions on blockchains are unitary,
it is impossible for one side of the transaction to happen without the other.
The transformation processes on prespecified CDIs must separation of duties: The
certifier of a transaction and the implementer must be different entities. In any
blockchain, subjects (users) are only enabled to transact with the tokens belonging
to them, and no other user is able to access these without knowledge of their private
key. Verifiers (miners) only ascertain whether transactions are valid.
All subjects in the system must be authenticated. This is the case in blockchain
through public key cryptography.
There must be a write only audit file that records all the transaction processes.
Blockchain can even provide guarantee of absence of modification and in the context
of ownership, the blockchain proves that an asset has been transferred to somebody,
and has not been transferred to somebody else subsequently because transactions
can only be found on the blockchain.
It must be possible to upgrade some UDIs to CDIs through the application of a
transaction process.
Only a privileged subject in the system can alter the authorisations of subjects.
In the case of permissioned blockchains, there may be a consortium which may
determine whether another node can enter the network.

Example Trade and settlement process


The process where a buyer and a seller agree to exchange a security (trade execution)
and the date where the trade is settled (assets are exchanged) can be 2 or 3 days
depending on the jurisdiction and the type of asset. A longer period between trade
execution and settlement raises settlement risk - the risk that one leg of the transaction
may be completed but not the other, and counter party risk - one party defaults on its
obligation. Besides the reduction of risk, a decentralized blockchain technology could
also reduce the costs the trade and settlement process.
A standard trade-clearing-settlement process life cycle can be described as follow
(Bliss and Steigerwald [2006]):
Trading.

4.11. TRENDS - BLOCKCHAIN AND BITCOIN

393

The investors (buyer and seller) who wish to trade contact their trading member
which place their orders on the exchange.
The trades are executed in the exchange or any other platform such as a multilateral
trading facility or an organized trading system.
Clearing.
Clearing members who have access to the clearing house or the central counter
party, which are also trading members, settle the trades.
Clearing and settlement can be bilateral, i.e. settled by the parties to each contract.
The G20 enforces after the GFC to switch from bilateral to central counter party
(CCP) clearing for the OTC derivatives. A CCP acts as a counterparty for the
two parties in the contract. This simplifies the risk management process, as firms
now have a single counterparty to their transactions. Through a process termed
novation, the CCP enters into bilateral contracts with the two counterparties, and
these contract essentially replace what would have been a single contract in the
bilateral clearing case. This also leads to contract standardisation and there is a
general reduction in risk capital required due to multilateral netting of cash and
fungible securities. Therefore, CCP means that the bilateral clearing topology is
transformed into a centralized or star shaped one. From a systemic risk perspective,
while the more risky bilateral connections are replaced by less risky centralized ones
the major risk concentration is now located in the few CCPs.
Settlement.
The two custodians, who are responsible for safeguarding the assets, exchange the
assets where a typical instruction is delivery versus payment: Delivery of the
assets will only occur if the associated payment occurs.
Using a blockchain means to transform the centralized CCP topology back into a decentralized one where there is no need for an CCP. In the trading-clearing-settlement cycle,
a consortium blockchain can be used as follow to satisfy the present standards. On the
trading level, a consortium of brokers can set up a distributed exchange, where each of
them operate a node to validate transactions. The investors still trade through a broker,
but the exchange fees can be drastically reduced. On the clearing level, a consortium of
clearing members can set up a distributed clearing house, thus eliminating the need for a
CCP. Contrary to bilateral clearing, the contract stipulations are administered through
a smart contract which reduces risk management issues. If the securities and money are
digitalized, settlement does not need any custodians with securities depositories but the
assets are part of the permissioned blockchain.

394

CHAPTER 4.

GLOBAL ASSET MANAGEMENT

Example Land register management


Alice is the owner of a piece of land and Bob wants to buy the land from Alice,
we follow Cuche-Curti et al. (2016). This is a little digitalized transaction in most
jurisdictions. Typically, Alice and Bob meet physically in the register office. We sketch
how using a blockchain the transaction can be digitalized. Alice starts the process with
the messaging by creating the message M . This means that she uses the internet to
describe digitalized the necessary information which are needed for the transaction.
Besides information about the land itself and she has to add information about the
previous transaction when Alice bought the land. Both Bob and Alice also generate
their public and private keys.
When Alice completed the creation of the message M , she uses the digital signature,
this means she encrypts M using her private key, to prove that she is the admissible
sender of the transaction message. Since her public key is known to everybody and in
particular to Bob, he is able to verify the transaction - the piece of land is uniquely
linked to the owner Alice.
The next step is broadcasting the message to the network where this can be done
to the full network or to a segmented part of the network first, which are the peers of Alice.
The transaction of Alice is added to other transaction messages which are grouped
into a block. Then the miners start with the proof-of-work. Since they cannot be
compensated as for Bitcoins with new coins, the winning miner will rewarded by a fee
which Alice and Bob pay to him or her. This miner will broadcast the proof-of-work and
if the majority of the miners verified the proof, the transaction is confirmed and added
as part of the block as the most recent block to the blockchain. This procedure then
irrevocably stores the change of land asset ownership from Alice to Bob.

4.11.4

Different Currencies

We compare three types of currencies - physical, digital and crypto-currencies.


Which features should an object possess such that it is called a currency or money?
Something is considered to be money if there are satisfactory answers to three questions:
To which extend the potential currency stores value, how can it be used as a medium
for exchange of goods and services and finally, can it be considered as an unit of account.
In 2016, relatively few people use Bitcoins or any other crypto-currencies. The volatility of Bitcoins is often several times larger than the volatility of central-bank controlled
currencies. The Bank of England (2014) states that volatility of Bitcoins is 17 times
larger than the volatility of the British pound: The use of Bitcoins as a short-term
storage medium is questionable although nothing can be inferred about its value as a
long-term storage medium. The number of transactions of retail clients is used to mea-

4.11. TRENDS - BLOCKCHAIN AND BITCOIN

395

sure their willingness to accept Bitcoins as a medium of payment. Since this number is
not observable, proxy variables are used instead such as data from My Wallet, see Bank
of England (2014). The analysis shows that the number of transactions per wallet is
decreasing since 2012 to a value 0.02 transactions per wallet. Most clients buy-and-hold
their Bitcoins instead of using them. Finally, there is little evidence that Bitcoins are
used as units of account since.
A crypo-currency combines two main components:
A new currency such as Bitcoins.
A new decentralized payment system - the blockchain.
Example Bitcoin value volatility
By 2011, 10 USD was worth 1 Bitcoin. In 2013, the exchange rate was up to 266
USD for one Bitcoin. Shortly after the high, the exchange rate dropped by 80 percent.
In November 2013 the exchange rate was 1200 USD/Bitcoin. After the default of the
platform Mt. Gox, the rate dropped to a value of 340 USD/Bitcoin.
Consider Alice who wants to buy a cup of coffee at Bobs coffee shop worth USD 1.5.
We rely heavily on Antonopoulos (2015) for the Bitcoin explanations in the following.
Alice could use either physical money, digital money or the crypto currency Bitcoins. We
compare these three different schemes.
Where does Alice gets the currency to buy the book? For physical or digital money,
the answer is clear. But Bitcoins can in 2016 not be bought at a bank for example. While
there are some exchanges, most retailers get their first Bitcoins from a friend. To get
her first coins, Alice needs an internet access and a friend where she can exchange USD
against Bitcoins.
Who generates the money which Alice wants to spend? Physical money is generated by cental banks, digital money by commercial banks and Bitcoins are generated
following strict rules - the miners. Commercial banks generate money by the creation of
loans since each loans creates a deposit position on the loan borrowers bank account.
Therefore, physical money are a liabilities of the central banks against the holder of the
money and digital money represent a claim against commercial banks. Both the central
bank and commercial bank can increase or decrease the amount of money without any a
priori limitations. But Bitcoins are different. First, total supply of Bitcoins is limited to
the creation of 21 million Bitcoins. Given the rule-based creation process, this amount
will be reached around 2040. Given this fixed supply side and its diminishing rate of
productions, Bitcoins are seen as a deflationary currency. Furthermore, Bitcoins do not
specify a claim on somebody - there is no such thing as a central counter party (central

396

CHAPTER 4.

GLOBAL ASSET MANAGEMENT

bank, commercial bank) since using the blockchain technology, Bitcoin payments are
made directly between the payer and the payee (peer-to-peer) and they are anonymous
like the use of banknotes since there is no need for the two parties to disclose their wealth
amount hold in the crypto-currency. Given that Bitcoins are not a claim to anybody,
some regulators considers them to be a commodity instead of a currency system. But
Bitcoins differ from normal physical commodities such as oil. The value of oil is driven
by the actual physical demand and supply and by the expectations about the future
demand and supply. The demand and supply for Bitcoins depends fully on how participants agree that Bitcoin has a meaning.
How can Alice and Bob trust that the money used is not counterfeit and how can Bob
be sure that no one else will claim that the money Alice used to pay the coffee in fact
belongs to this third party (the double-spending problem). If Alice uses physical dollars,
there is no double-spend problem - goods and cash are exchanged between buyers and
sellers. Due to the immediate settlement, double spending is not possible and there is
no need for verification by a third party. Such bilateral or peer-to-peer cash transactions
offer limited opportunities and face large transaction costs. Since central banks issue
physical money they use strong effort to generate money which is difficult falsify or imitate. If Alice uses a digital payment say using a credit card, the coffee is sold from Bob
to Alice but the payment is done digital via central parties - banks - from the buyer to
the seller. The bank verifies that there is no double spending. Basically, the bank checks
every transaction and Alice and Bob trust that payments via the bank are not double
spent.
But there are several concerns with the central party structure. From a risk perspective such banks can become systemic relevant or their infrastructure can be exposed to
IT-security risks which is real risk nowadays. From a client perspective the bank can
tend to seek for too high rents. Furthermore, governments can use their power over the
banks to enforce actions against market participants. The U.S. for example relates the
access of foreign banks to the vital U.S. markets to the extend the banks corporate with
government bodies. Else, the threat is to freeze accounts. Another type of legal risk are
confiscatory taxes such as observed in Cyprus in 2011 during the EU government crisis.
Finally, Alice could use Bitcoins to pay her coffee. Contrary to the ledger used by
the banks which are not public and Alice and Bob trust that banks ensure the validity
of payments, the ledger for crypto-currencies (the blockchain) is public: A distributed
peer-to-peer system - verifies whether the transaction between Alice and Bo is acceptable
or not, that is that there is no double-spending problem?
A user [Alice], wishing to make a payment, issues payment instructions which are
disseminated across the network of other users. Standard cryptographic techniques make
it possible for users to verify that the transaction is valid - that the would-be payer owns
the currency in question. Special users in the network, known as miners, gather together

4.11. TRENDS - BLOCKCHAIN AND BITCOIN

397

blocks of transactions and compete to verify them. In return for this service, miners that
successfully verify a block of transactions receive both an allocation of newly created currency and any transaction fees offered voluntarily by parties to the transactions under
question. When blocks of transactions are verified, they are added to the ledger (the
blockchain). Bank of England (2014).
Therefore, the incentive for the proof-of-work done by miners is to compensate them
for solving hard mathematical problems but where the verification of the solution is simple.
Which payment type is the cheapest one? One would expect that physical payment
is the most expensive one and using Bitcoins is the cheapest one. But Bitcoins are 2016
only cheaper than those in centralized system since the miners in the crypto-currency
system receive as a subsidy new currency coins for their proof-of-work efforts. Would
they charge the production costs for this work, higher fees than for physical or digital
currency payments due to the increasing computer costs to scale. Since the production of
new Bitcoins is decreasing over the next decades, the effect of subsidies will also fall and
one can therefore expect that the costs for Bitcoins will increase. Figure 4.33 shows that
the transaction clearing volume is limited but the computing efforts are increasing. This
raises the question about the evolution of transaction costs. Comparing the number of
daily Bitcoin transactions - around 1000 000 by the end of 2015 (Coinometrics, Capgemini)
- with the number of daily transactions by Visa (212 mio.), MasterCard (93 mio.) and
all other traditional entities together summing up to 340 mio. - the Bitcoin percentage
is 0.03% of this total transaction volume.
From a global payment system perspective there are significant reductions of costs.
BI Intelligence and the World Bank (2014) estimate the following cost saving potential
for blockchains if Bitcoin would be adopted as a global payment system. The fees for
payment cards in the current system of USD 300 bn would be reduced to USD 120 bn,
the fees in the B2C e-commerce would drop from USD 37 bn to 12 bn and remittance
fees would fall from USD 47 bn to USD 5 bn. Comparing the inflow of venture capital
in Bitcoin technology in 2014 with the inflow in the internet in 1995 we observe a higher
inflow in Bitcoin than in the 90s in the internet.3
Although we focus on Bitcoin, there is in fact an inflation of crypto-currencies. Coinmarketcap.com reports that by September 2015 there were 676 listed crypo-currencies
but with Bitcoin consolidating 85% of market capitalization and number two Ripple following with 6%. The tenth largest entity - Bytecoin - represented a market capitalization
of only 0.2%.

2014 Bitcoin: USD 362 mn, 1995 internet: USD 250; 2015 projected Bitcoin USD 786 mn, 1996
internet: USD 639 mn.

398

CHAPTER 4.

GLOBAL ASSET MANAGEMENT

Figure 4.33: Bitcoin increase of computing efforts and limitation of clearing volume
(Blockchain.infor [2015]).

4.11.5

Bitcoin

The text follows Antonopoulos (2015), Aste (2016), Khan Academy (2016) and Tasca
(2016). For an economic review see Bank of England (2014). The term Bitcoin represents different objects. First, a crypto-currency. This means a unit of a Bitcoin is
used to store and transmits values between individuals belief in this currency. Second, a
communication medium. All individuals using or creating Bitcoins communicate by the
Bitcoin protocol via the internet.
The main properties of Bitcoins are:
Peer-to-peer virtual cash that does not need third party authority and anyone can
use it.
All transaction are kept in a single ledger. The ledger is replicated and distributed
to all nodes.
Node in the system represent the participants. Each node has a ledger replica.
Synchronization of the ledger follows by periodically verifying and approving blocks
of new transactions (miners work).
Bitcoins are protected by the private keys - only the owner of this key can spend
the coins.

4.11. TRENDS - BLOCKCHAIN AND BITCOIN

399

The block chain is the chronological list of all blocks of transactions.


There are different role in the Bitcoin network such as miners or retail users.
Alice, a retail user, needs to download a free available software to get started such as
Multibit. She chooses a profile which is suitable for here needs. As for physical money,
(virtual) wallets exist. Since there are no physical coins, the value lies in the value
transfer in the transactions between the buyer and seller. Alice has to prove ownership
in here transaction with Bob when she pays the coffees. This ownership is proven by
using keys which unlock the value Alice spends for the coffee and which is transferred to
Bob. These keys are kept in the wallet and it is protected by a password - if a hacker
is able to uncover Alices password then he can steal her keys and transfer here Bitcoins
immediately to an arbitrary address in the network. As a result, Alice receives a wallet
and a Bitcoin address from Multibit. This address, which is a long string of numbers
and letters, can be shared with other Bitcoin users and in order to avoid to remember or
type this long string, an equivalent QR code can be scanned by Alice. The private and
public key pairs enable people to encrypt information to transmit to each other. The
receiving party is able to determine whether the message actually originated from the
right person, and whether it had been tampered with. Hence, only the private key owner
can sign corresponding transactions but anyone can observe and verify transactions by
anyone else, since it only requires public key. These properties are basic if one needs to
communicate to a network that a transaction between two parties has been agreed.
As we stated in last section, Alice buys here Bitcoins from a friend - Joe. She gives
him USD 10 in cash and Joe transfers the respective amount in Bitcoins to Alice. He
uses one of the many websites to find out the USD-Bitcoin exchange rate of USD 100
corresponding to BTC 1 where BTC represents Bitcoins, see Figure 4.34. Hence, he
transfers Bitcoin 0.1 to Alice. To do this, Joe opens his wallet application where he
uses the address of Alice - the QR code avoids that Joe has to type the long string of
numbers and letters - and enters BTC 0.1 and he chooses to pay a fee of BTC 0.0005
as a compensation for the proof-of-work which is done in the network to confirm the
transaction. Alice receives a confirmation with status unconfirmed and at the same
time the transaction propagates through the peer-to-peer protocol (blockchain). Since
the transaction sent contains all is needed to confirm the ownership, it is irrelevant where
geographically or when the information is sent to the network.
More precisely, both Joe and Alice use the public key to verify the digital signature
and a private key, see Figure 4.34. The figure shows that there is a physical world and
a virtual or Bitcoin system world. The physical persons Joe and Alice are in the virtual
world represented by their two keys which are a sequence of numbers and letters.
The Bitcoin transaction starts with the Bitcoin amounts which Joe possess. He received 0.05 BTC from C and 0.055 from D in a past transaction. These transactions
are verified and part of the blockchain. These two transactions form the basis for the
input of the transaction between Joe and Alice. Instead of using the whole information

400

CHAPTER 4.

GLOBAL ASSET MANAGEMENT

$ 10

Alice

Joe

Physical
world

0.15
Public Key J
Private Key J

Public Key A
Private Key A

Previous Transactions
0.05 from C
DCDC
0.055 from D
DDDD

Digital
Signature

Input

Public Key A
Amount BTC 0.1
Digital Signature Alice
Fee 0.005

Bitcoin
system

Output

Bitcoin
Transaction
Proof-of-work
(Double Spending)

Miner

Decentralized Peer-to-Peer
Network

Figure 4.34: Bitcoin transaction; top level description. (Source: Adapted from Khan
Academy (2016)).
of these two transaction, a digest DC or DD is used. Using the hash function, see below,
anybody in the Bitcoin system can verify that by applying this function on the digest
that Joe is indeed the owner of the Bitcoins. Given the inputs, the digital signature
connects this input with the output. In the output the public key of Alice is included for
verifications, the amount, the possible fee paid to the miners for the proof-of-work efforts
and the digital signature of Alice. These three parts from the Bitcoin transaction which
remains to be approved by the Bitcoin system in the next. The first step is that the
transaction is spread out to the whole system - the decentralized peer-to-peer network.
The miners which check out that there is no double spending possess a different function
than say other participants which use the network for their payments such as Alice.
A different visualization of the steps of the transaction shown in Figure 4.35:
Buyer of the good wants to send money to the seller. Each transaction contains
at least one address as input, one address as output, for each of the addresses
the appropriate amount and other fields for the signature and management. The
entire transaction is signed with the private key of the sender. This authenticates
the transaction and protects it against changes. The private key consists of 51
characters. This key can be used to generate automatically the public key. The
keys can be store anywhere but there is no possibility to derive the private key
using the public key. If the private key is lost or forgotten, then the access to the
Bitcoin accounts is lost.

4.11. TRENDS - BLOCKCHAIN AND BITCOIN

401

The whole transaction is represented online as a block.


The block is broadcast to every party in the network with a flooding algorithm:
The sender sends its transaction to all known Bitcoin Cores in the network.
They verify the signature, i.e. that the transaction is valid.
They then also direct the transaction to them known Bitcoin Scores, that is the
block is added to the chain (mining).
The money moves from B to A.

Figure 4.35: Bitcoin transaction. First, the buyer signs the transaction with her private
key. The transaction then gets communicated to peers. The peers verify the transaction
signature using the buyers public key. A new block is then added to the blockchain and
communicates the blockchain to the peers (Source: Berentsen and Schr (2014)).
Considering the mining process, Antonopoulos (2015) states that The mining process
[of blocks] serves two purposes in Bitcoin:
Mining creates new Bitcoins in each block, almost like a central bank printing new
money. The amount of Bitcoin created per block is fixed and diminishes with time.
Mining creates trust by ensuring that transactions are only confirmed if enough
computational power was devoted to the block that contains them. More blocks
mean more computation, which means more trust.

402

CHAPTER 4.

GLOBAL ASSET MANAGEMENT

A good way to describe mining is like a giant competitive game of sudoku that resets
every time someone finds a solution and whose difficulty automatically adjusts so that it
takes approximately 10 minutes to find a solution. Imagine a giant sudoku puzzle, several
thousand rows and columns in size. If I show you a completed puzzle you can verify it
quite quickly. However, if the puzzle has a few squares filled and the rest are empty, it
takes a lot of work to solve! The difficulty of the sudoku can be adjusted by changing
its size (more or fewer rows and columns), but it can still be verified quite easily even
if it is very large. The "puzzle" used in bitcoin is based on a cryptographic hash and
exhibits similar characteristics: it is asymmetrically hard to solve but easy to verify, and
its difficulty can be adjusted.
In 2016 the most active miners are located in China who cover around 50% of the
total market share (Tasca (2016)), followed by Europe with around 25%. This is also
reflected in the traded currency pairs. The traded volume CNY/BTC is about three
times larger than the USD/BTC one. This dominance of Chinese activity can also be
observed in the number of active Bitcoin clients normalized by the number of users which
have direct access to the internet: The number in China is around 5 times larger than the
second largest numbers of the US or Russia. Bitcoin startups raised around USD 1 bn
in the three years 2012 2015 with an annual growth rate of 150%. This rate dominates
other startup rates such as crowdfunding, lending or banking in general by factor 2 3.
All transactions are recorded in the blockchain. In the transaction between Alice and
Bob, all other nodes in the network receive an encrypted record of the transaction. When
a majority of the nodes agree to accept a transaction, then the block where the transaction belongs too is added to the blockchain. More precisely, several miners pairwise
hash all transactions - two transactions are hashed into one transaction - such that the
all transactions are taken together into on so-called block. If the proof-of-work for the
whole block is completed, the block is added to the sequence of all past blocks - the block
chain. A Bitcoin transaction cannot be revoked after it has been confirmed by the network and added to the blockchain. The blockchain is redundant and stored locally on all
cores Bitcoin and managed and updated via the Bitcoin network. The only requirement
for participation is to operate a Bitcoin Core which is compliant with the Bitcoin protocol.
The blockchain mechanics makes it difficult to fraud. Suppose that Joe wants to
double spend the money, i.e. he wants to use the same Bitcoins for Alice and say to buy
food. He then first has to add this fraudulent transaction to the Bitcoin network, where
it is added by miners to a different block than the block which contains Alices transaction. Second, he needs to assure that the proof-of-work for this fraudulent block is done
before the non-fraudulent block is verified and added to the block chain. Finally, since
proof-of-work for blocks of transactions which do not contain Alices and the fraudulent
transaction starts in parallel when the block with Alices and the fraudulent transaction
are checked, Joe has also to deliver the proof-of-work for these other new blocks which
results in an impossible task. We refer to Antonopoulos (2015) for more details about

4.11. TRENDS - BLOCKCHAIN AND BITCOIN

403

how such a transaction works.

4.11.6

Future of Blockchain and Bitcoins

The following arguments lead to the often heard opinion:


Fact 4.11.2. Blockchain probably yes; Bitcoin no.
While the Bitcoin hype cycle has gone quiet, Silicon Valley and Wall Street are betting
that the underlying technology behind it, the Blockchain, can change... Goldman Sachs
(December 2015)
One sometimes hears that the internet revolutionized the information exchange
and blockchain will revolutionize the value exchange. This comparison should be done
carefully. First, when the internet was invented the exchange of information was difficult,
time consuming and not scalable. Therefore, there was a strong demand from all types
information providers - firms, scientists, private persons, etc. - to use this new technology. The exchange of information was often not related to the question of ownership every scientist or every investment firm who posts information in the internet wants that
the information be disseminated.
For the blockchain the situation is difficult and we summarize: Ownership of assets
for the value exchange is key, blockchain technology will not replace non-existing value
exchange mechanism but often well-established structures owned by exchanges, central
banks or other financial intermediaries. Therefore, there is a fundamental conflict where
on one hand side the blockchain wants to break monopolistic or oligopolistic structures
which on the other hand side will be defended by powerful organizations.
We consider bitcoins. First, the limitation of total supply of Bitcoins defines a deflationary currency. Second, there is mining concentration. The mining industry is an
oligopoly where the market share of the ten largest miners is between 70% 80% by the
end of 2015 (Tasca (2016). This raises security concerns since to gain 51% consensus
about a block transaction verification becomes more risky the less miners contribute to
the majority value. Third, the type of business categories using Bitcoin can move bach
to sin categories such as online gambling or black market: Tasca (2016) reports that
in 2012 the relative income for black market and online gambling had a share in the
Bitcoin income flow of around 70%. This number collapsed in the last two years to less
than 10%. Finally the cost of proof work is considerable. Aste (2016) estimates that
to keep a capital of around USD 10 bn secure in the blockchain annual costs of 10%
are needed. The reason is the number of hashes which are generated every second for
the proof-of-work of 1 bn times 1 bn. In 2016, the Bitcoin network costs USD 2 5 per
transaction. Is this profitable? The author estimates that the break even point for a USD
1 mn block transaction is at USD 1000 000. These costs make it too costly to attack the
proof-of-work. But the price for one transaction is expensive compared to other payment
systems. Furthermore, the proof-of-work mechanics consumes a lot of physical energy.

404

CHAPTER 4.

GLOBAL ASSET MANAGEMENT

One can easily estimate that only a few networks such as that one for Bitcoin can be
added in the world before touching the limits of energy consumption.

4.11.7

Alternative Ledgers - Corda

We consider a variant of the so far described blockchain idea - Corda (Brown et. al
(2016)).
Consider banks (the nodes) which search for a technology to record and enforce
financial contracts such as cash, derivative or any other type of products. More precisely,
the banks want to record and manage the initiation and the life cycle of financial contracts
between two or more parties which is grounded in the legal documentation of the contracts
and which is compatible with the existing emerging regulation in an
efficient way: duplications and reconciliations of transactions are not necessary.
open way: every regulated institution can use the technology.
appropriate privacy/public mix way: consensus about transactions is reached on a
smaller than full ledger level.
These requirements lead to the solution Corda, which is used in by the blockchain
company R3 leading itself several dozens of major financial institutions, which differs in
some respects with the above general blockchain and particular Bitcoin discussion.
First, there are no miners and there is no proof-of-work since no currency needs to
be generated (mining) and due to the mixed private/public association of information
no general consensus on the ledger is needed. The advantages are avoidance of costly
mining activities, of a deflationary currency and of a concentration of the mining capabilities in a few nodes. Second, Bitcoins can only contain a smaller amount of data due
to the fixed length data format. This is not useful if one considers all economic, legal and
regulatory information in an interest rate swap between two parties. Corda encodes the
information of arbitrary complex financial contracts in a contract code - the prosa of the
allowable operations defined in term sheets is encoded. Corda call this code state objects. Consider a cash payment from bank A to a company C. The state object contains
the legal text describing the issuer, the date, the currency, the recipient etc. and the
codification of the information. This state is then transformed into a true transaction if
the bank digitally signs the transaction and if it verified, that the state object is not used
by another transaction. Hence, there are two type of consensus mechanics. First, one
has to validate the transaction by running the code in the state object to see whether
it is successful and to check all required signatures. This consensus is carried out only
by the parties engaged in the transaction. In other words, teh state object is a digital
document which records all information of an agreement between two or more parties.
Second, parties need to be sure that the transaction under consideration is unique. This
consensus which checks the whole existing ledger is done by an independent third party.

4.12. TRENDS - DEMOGRAPHY AND PENSION FUNDS

405

Summarizing, the ledger is not globally visible to all node. The state objects in the ledger
are immutable in the same way as we described it for blockchains. Given that not all
data is visible to all banks, strong cryptographic hashes are used to identify the different
banks and the data.
Why are the leading banks pushing this system? They can all use only one ledger
which makes reconciliation and error fixing in todays individual ledgers at topic of the
past. Furthermore, the single ledger does not change the competitive power of the banks
in the ledger. The economic rationale, profit and risks to enter into a swap remain within
UBS and Goldman Sachs but the costs and operational risks of the infrastructure are
reduced due to the collaboration to maintain shared records. In other words, while the
banks keep the profit and loss from their banking transactions unchanged to the present
competitive situation, they reduce the technology cost part by cooperation.

4.12

Trends - Demography and Pension Funds

We already considered parts of the topics demography, retirement provision and pension
systems. Before we continue to discuss these topics also from an asset management
perspective I remark that asset management is only a important tool for the solution of
the problems in the different retirement pillars which many countries face. Necessary for
the change of the different systems are deep political reforms which will restore the trust
of the populations in the retirement systems.

4.12.1

Demographic Facts

Not so long ago, in the years following World War II, the world was preoccupied with
population growth. Though population explosion is no longer the burning issue it once
was, we are still experiencing staggering population growth of 2 to 3 percent per annum.
Population pressure will of course mean a growing likelihood of mass emigration to other
parts of the world; in particular if those countries with strong population growth are hit
by the effects of climate change or war.
The economically most advanced societies face another population problem. Each future generation will be smaller than the one that preceded it. For some, this has already
become a matter of national survival. Triggered by low fertility rates, this phenomenon
is gaining ground worldwide: 46 percent of the worlds population has fallen into a lowfertility regime. There is nothing to indicate that this rate is going to recover. Magnus
(2013) states that (i) the ratio of children to older citizens stands at about 3 : 1 but is
declining. By around 2040, there will be more older citizens than children. By 2050, there
will be twice as many older citizens as there are children, (ii) the number of over-60s in
the rich world is predicted to rise by 2.5 times by 2050 to 418 million, but the trajectory
starts to level off in about 20 years time. Within this cohort, the number of people aged
over 80 will rise six times to about a 120 million and (iii) in the emerging and developing

406

CHAPTER 4.

GLOBAL ASSET MANAGEMENT

worlds, the number of over-60s will grow by more than seven times to over 1.5 billion by
2050, and behind this, you can see a 17-fold increase in the expected population of those
aged over 80, to about 262 million. Magnus (2013)

Malthus (1798) were the first to study the interdependence between economic growth
and population growth. He assumed that as long as there was enough to eat, people
would continue to produce children.
Since this would lead to population growth rates in excess of the growth in the food
supply, people would be pushed down to the subsistence level. According to Malthuss
theory, sustained growth in per capita incomes was not possible; population growth
would always catch up with increases in production and push per capita incomes
down. Of course, today we know that Malthus was wrong, at least as far as the now
industrialized countries are concerned. Still, his theory was an accurate description of
population dynamics before the industrial revolution, and in many countries it seems
to apply even today. Malthus lived in England just before the demographic transition
took place. The very first stages of industrialization were accompanied by rapid population growth, and only with some lag did the fertility rates start to decline. Doepke (2012).
Hence, for Malthus children were a normal good. When income went up more
children were consumed by parents. Using a micro economic model, see the exercises,
the equilibrium supports the above intuition: An increase in productivity causes a rise
in the population, but only until the wage is driven back down to its steady- state level.
Even sustained growth in productivity will not raise per capita incomes. The population
size will catch up with technological progress and put downward pressure on per capita
incomes. This model explains the relationship between population and out-put for almost
all of history, and it still applies to large parts of the world today. Doepke (2012).
Since fertility rates decreased in Europe in the nineteenth century, per capita could
grow. What are the causes for this growth? We consider time-cost factor of raising
children. In the Malthus model, all labor is of equal quality. In modern economies,
human capital has two components. Innate human capital that is possessed by every
worker, regardless of education. In addition, people can acquire extra human capital
through education by their parents. Further new feature are that parents must invest
their time and not goods to raise children. As a result, the growth rate of the population
is not constant. It depends on the human capital of the parents. The lower their human
capital, the higher the number of children. Contrary, if human capital is high, fertility
falls. Two factor drive this outcome. An increasing human capital means an increase
of the value of time. Then, the education of children becomes very costly and hence,
parents decide to have less of them. The second reason is that people with high human
capital prefer quality over quantity since there are better at teaching children and this
makes it more attractive for them to invest few children.

4.12. TRENDS - DEMOGRAPHY AND PENSION FUNDS

407

In developed, Western countries, persistent sub-replacement fertility levels, ageing,


and immigration are recognized as the three major population policy issues. Subreplacement fertility and immigration, in particular, are areas in which effective policies
are hard to come by. The debate, May (2012), is marred by controversy and passion
and discussions on policy issues are polarized. Policy actors seem to be torn between a
laissez-faire attitude and increasing immigration. Increasing immigration has two serious
limitations. First, the level of immigration cannot grow arbitrarily high without generating political tensions. Second, it is becoming increasingly difficult to find the kind
of migrants one wishes to attract since more and more countries are striving to attract
highly skilled migrants. Japan, South Korea, and Taiwan populations are shrinking.
Yet they still resist immigration. They choose automation as a response to dwindling
manpower. In Western democracies, immigration has become an ideology to the extent
that any rational discussion thereof is barely possible. While any forecasts regarding
personal longevity are uncertain, in the last 150 years women have seen their average life
expectancies increase at a rate of three months each year. All those who have forecast
that growth in personal longevity will come to a standstill have been proved wrong. But
there are currently two factors that could well put a stop to growth in average longevity:
the rapid growth of so-called lifestyle illnesses and increasing medical care costs. The
breakdown of the Soviet Union showed that once medical care fails to maintain its level
of quality for the whole population, that populations life expectancy quickly falls significantly.
The speed of ageing is different for different countries; see Magnus (2013). In France,
for example, it took 100 years for the proportion of the population over 60 years old to
double from 7 percent to 14 percent. The pace in emerging markets is much different.
For Indonesia, Brazil, China, or India, the time taken for this proportion to double is
only around 20 years. That is, the speed of ageing is rising rapidly in emerging economies.
But ageing in developed countries occurs in parallel with better health, more extensive education, and related societal changes. We are not just living longer, we are
slower to age. We spend longer in education; we travel more before permanently joining
the workforce; we start families later. We dont think of ourselves as being as old as
previous generations would have at the same age. The effect of all these changes taken
together is not that society is ageing, but that it is getting younger. Finally, a society
with a predominantly young population has a different productivity level than a more
aged population. Syl and Galenson show that 40 percent of productivity increases are
down to young people who enter new markets. These young people break with tradition
and manifest new ways of thinking. Google and Facebook are two prominent examples.
Older individuals possess more experience and wisdom. But Syl and Galenson state that
this only gradually changes productivity.
To manage the emerging demographic regime, innovative policies and new ways of thinking about population are called for. Romaniuk (2012). This change in the structure of
society will have many consequences. One of the most significant will be a labor short-

408

CHAPTER 4.

GLOBAL ASSET MANAGEMENT

age. If societies are going to maintain their standard of living, they are going to have
to avoid any reduction in the workforce as a proportion of the total population. At the
same time, many people are going to reach retirement age and realize that they do not
have enough income to maintain what they feel is an acceptable standard of living. The
combination of these two issues will put a lot of pressure on our current views on the
relationship between working and retirement. Employment and retirement laws designed
for a young and growing population no longer suit populations that are predominantly
old but healthy and capable of being productive, all the more so in a work environment
of automated technology. Prevailing family assistance policies are equally antiquated.
Though the maternity instinct may still be present as it always was, womens conditions
have radically changed. The women of today in developed countries, and throughout
the modernizing world, are faced with many deterrents to maternity (e.g., widespread
celibacy, marital instability, financial insecurity) on the one hand, and with many fulfilling, financially well-rewarded opportunities on the other. So much that they are left
with little incentive to trade the latter for the uncertainties of motherhood.
It is easier to bring population down than to make it up, writes John May (2012).
And that is why - in order to escape the sub-replacement fertility trap and to bring the
fertility rate to, and sustain it at, even a generational replacement level, Romaniuk (2012)
- we need to bring to bear meaningful financial and social rewards for maternity. The
current family allowance and other welfare-type assistance to families cannot do this.
Societies under a demographic maturity regime may need to have in place permanent,
life-sustaining mechanisms to prevent fertility from sliding ever lower. Instead we need
a more balanced resource allocation between production and reproduction.
Impact on Retirement Systems
With such demographic development, it will not be possible to meet the promises of the
three pillars of social welfare in many countries. This will lead to more saving behavior on an individual basis and solidarity between generations (the first pillar) will come
under stress. In order for the retirement system not to collapse, the state will have to
define reforms. Will it save the first pillar - that is, it will secure the minimum necessary
standard of living for all? How will the second and third pillars be changed or will they
disappear? As a result, people will individually save more - because they have to and
because confidence in the social welfare system will not increase.
The Melbourne Mercer Global Pension Index report (MMGPI [2015]) from the Australian Centre for Financial Studies and Mercer compared the status of the retirement
systems of 25 countries. The index is based on the following construction; see Figure
4.36.
Although it is called a pension index, it allows one to consider the entire retirement
systems of the different countries. Figure 4.37 summarizes the results for the 25 countries
surveyed.

4.12. TRENDS - DEMOGRAPHY AND PENSION FUNDS

409

Benefits
Savings
Tax Support
Benefit Design
Growth Assets

Coverage
Total Assets
Contributions
Demography
Government Debt

Regulation
Governance
Protection
Communication
Costs

Adequacy

Sustainability

Integrity

40%

35%

25%

Melbourne Mercer Global Pension Index

Figure 4.36: The Melbourne Mercer Global Pension Index (GMMPI [2015]).

Grade

Index Value

Countries

Description

>80

DK, NL

Robust retirement system that


delivers good benefits, is
sustainable and has a high level
of integrity

B+

75-80

AU

65-75

S, CH, Finnland, CA, Chile,


UK

C+

60-65

Singapore, D, Ireland

50-60

F, USA, Poland, SA, BR, A, I,


Mexico

35-50

Indonesia, China, J, South


Korea, India

<35

Nil

Compared to A-grade it has some


areas for improvement
It has some good features but also
has major risks and/or
shortcomings. Without adressing
them efficacy and/or
sustainability can be questionned.
A system with major weaknessses
and omissions that need to be
adressed.
A poor or non-existing system

Figure 4.37: Summary for the 25 countries in the Melbourne Mercer Global Pension
Index as of 2015 (Adapted from GMMPI [2015]).

410

CHAPTER 4.

4.12.2

GLOBAL ASSET MANAGEMENT

Pension Funds

The pension fund assets in the OECD member countries encompassed USD 23 trillion
in 2014. The collision between demographics and the strong reliance on pay-as-you-go
systems in developed countries requires resolution; if not, these problems can be expected
to spread to the rest of the world. There are a number of ways of approaching this,
including (Walter (2007)):
Raising mandatory social charges on employees to cover increasing pension obligations. This is very problematic due to the inverse demographic pyramid and
becomes even more difficult to implement in countries where individuals already
face a high tax burden.
Cutting retirement benefits. Limiting the growth of pension expenditures to the
projected rate of economic growth starting in 2015 reduces the income-replacement
rate from 45 percent to 30 percent over a period of 15 years. Walter (2007). This
would push retired people with low personal saving resources into poverty.
Increasing the retirement age. For countries with a high unemployment rate this
is not a feasible alternative.
Reforming the systems away from pay-as-you-go toward defined-contributions or
defined-benefit pension plans. This is a possibility, and would create a huge demand
for professional asset management.
Keeping the pay-as-you-go systems and reducing the contribution to the pension
funds. These changes impact asset management. The demographic problems in developed countries and the difficulties in finding structural solutions will force pension funds
to increase their investment performance.
The asset allocation of pension fund assets differs significantly between countries.
The exposure to growth assets (including equities and property) varies and ranges from
less than 10 percent, in India, Korea, and Singapore, to about 70 percent in Australia,
South Africa, the UK, the US, and Switzerland. GlobalPensionIndex (2015). The more
growth assets are included in the asset allocation, the larger are the risks: there were
significant declines in the value of assets in 2010 and 2011 reflecting the consequences of
the global financial crisis of 2007 and 2008. However, since that time there has been a
steady recovery in the level of pension assets in each country surveyed as equity markets
have recovered. GlobalPensionIndex (2015).

4.12.3

Role of Asset Management

Asset management can support the pension system in three respects: First, asset management could become more efficient - this means to save costs. Second, asset management
could expand the range of solutions - the investment strategies and finally, asset management could expand the investment opportunity set - the assets.

4.12. TRENDS - DEMOGRAPHY AND PENSION FUNDS

411

The expansion of investment strategies means to apply factor investing for example.
All pros and cons of last sections also apply to the pension system case. The third possibility means to make some illiquid asset classes accessible for pension funds. Examples
are private equity, insurance-linked investments and securitized loans. These are the
typical examples given.
Example
Asset managers can become more important financial actors by driving the raising of
capital and the capital deployment required to meet the demands of growing urbanization and cross-border trade. The world urban population is expected to increase by 75
percent between 2010 and 2050, from 3.6 billion to 6.3 billion. The urban profile in the
east will see many more megacities (cities with a population in excess of 10 million)
emerging. Todays number of 23 megacities will be augmented by a further 14 by 2025,
of which 12 will be in emerging markets.
This will create significant pressure on infrastructures. According to the OECD, USD
40 trillion needs to be spent on global infrastructure through 2030 to keep pace with the
growth of the global economy. Some policy makers appear to have taken the problem
on board: in Europe - after considerable debate - the European Long Term Investment
Funds (ELTIF) initiative was finally created in 2013, helping European asset managers
to invest in infrastructure. But infrastructure investments will disproportionately target
emerging markets and emerging markets asset managers have recognized this and already
started to focus on it.

Example - Metro Lima 2 project


The city of Lima intends to extend its metro network. Peru wants to narrow the existing large infrastructure deficit in connection with projects in telecommunications, water,
sanitation and many more. Therefore, the government issues bonds to finance some of the
projects. To attract investors for such a project, three main challenges are first a sound
governance of all parties involved in the project. There are more than 15 parties involved
in the project. An important one for investors is the grantor. Second, a project plan
linked to payments. The plan consists of milestones where there is a compensation for
each milestone defined in an agreement. All of the projects use a build-operate-transfer or
BOT model under which the project is eventually transferred to the government after the
private developer has been able to get his capital back plus a return. The BOT structure
incorporates RPICAO, a payment mechanism under which, by submitting a construction
progress report (CAO) to a government agency or state-owned company, a concessionaire
earns the right to receive compensation for construction costs incurred in connection with
a project. RPICAOs are denominated in either US dollars or local currency (adjusted
for inflation), and represent an irrevocable and unconditional payment obligation of the

412

CHAPTER 4.

GLOBAL ASSET MANAGEMENT

relevant government agency or state-owned company. Chadbourne (2015). Third, legal


setup. Peru, as grantor, is the direct obligor of the RPICAO. Figure 4.38 shows the legal
entities and the cash flows.

Figure 4.38: The legal setup for the Metro Lima 2 project. Collections of tariff payments
are collected and deposited into a collection account (1). If the amount collected is not
sufficient to make the payments under RPICAO, then the grantor is required to deposit
sufficient funds (2). The Peruvian Trustee makes the payments on the RPICAO to the
Identure Trustee on behalf of the Issuer, as RPICAO titleholder (3). If a commitment
termination event occurs prior to the payment of the purchase price by the PRICAO
purchaser, the Concessionaire pays the applicable amount to the issuer. MTC represents
Peru as a grantor of the concession. [Lima Metro Line 2 Finance Ltd].
Whatever of the above measure is considered, it is evident that asset management
alone is not able to solve some of the fundament problems of pension funds which we
discussed above. At its best, the asset management function can help to reduce some
costs or to improve the likelihood of higher investment returns. But it cannot produce
what it is not possible - this means, to solve the problem of the demographic change.
But the asset management function can play an important role in two other aspects.
First, it can provide solutions for the baby-boomers with their asset decumulation needs.
Second, asset management will be central for increased private savings of individuals due
to the weakness of the first and second pillar. The growth of the future AuM will arise
much more from this channel than from the traditional pension fund channel.

4.12. TRENDS - DEMOGRAPHY AND PENSION FUNDS

4.12.4

413

Investment Consultants

Investment consultants play an important role as intermediaries, in particular for institutional investors and pension funds. They offer the following services: asset/liability
modelling, strategic asset allocation, benchmark selection, fund manager selection, and
performance monitoring. Goyal and Wahal (2008) estimate that 82 percent of US public
plan sponsors use investment consultants, as do 50 percent of corporate sponsors. Investment consultants have largely avoided the attention of academics with one notable
exception - Jenkinson et al. (2014). A recent survey by Pension and Investments [2013]
found that 94 percent of plan sponsors employed investment consultants. The five leading
investment consultants worldwide - ranked by assets under advisement - in 2011 were
Hewitt Ennis Knupp (USD 4.4 trillion), Mercer (USD 4.0 trillion), Cambridge Associates
(USD 2.5 trillion), Russell Investments (USD 2.4 trillion), and Towers Watson (USD 2.1
trillion).
Jenkinson et al. (2014) ask the following questions: What drives investment consultants recommendations of institutional funds? What impact do these recommendations
have on flows? Do recommendations add value for plan sponsors?
The authors use data from eVestment and limit their analysis to US long-only equity
products, which can be considered to be among the efficient markets. In the approximate period 1999 to 2011, one-quarter of these products were recommended annually by
investment consultants and the rest were not recommended. This much larger number
of recommended products compared to the non-recommended ones remains stable in the
different years studied.
The authors find, the first question, that consultants recommendations are partly
driven by past fund performance, but also by other soft factors such as service quality and
investment quality factors, Jenkinson et al. (2014): to be recommended it is not sufficient to have a strong return history. The authors then analyze whether the size of the
fees charged has an impact on the recommendation rate. If this were the case, conflicts
of interest would be suspected. The analysis shows that this is not the case. Fees are
very similar for recommended and non-recommended products independent of the size of
the products and their styles (growth, value, small- and mid- cap). The fees are in line
with the fees in Section 4.3.5.3 - that is to say, close to 70 bps for larger products.
Recommendations, and in particular changes in recommendations, have a strong impact on product flows (question 2): Moving from zero-recommendation to the case where
all consultants recommend leads to an additional inflow of assets of USD 2.4 billion. On
a percentage basis, on average the extra inflow equals 29 percent of the assets managed
by that product in the previous year, compared to a not shortlisted product.
The answer to the third question created a lot of public attention. They construct
equal- and value-weighted portfolio returns of recommended and non-recommended prod-

414

CHAPTER 4.

GLOBAL ASSET MANAGEMENT

ucts. Using the returns of these portfolio they estimate one- (the CAPM), three- (FF),
and four- factor (FFC) alphas and excess returns over portfolios of selected benchmarks.
For the equally weighted portfolios, the returns of the recommended products were
significantly lower than those of the non-recommended ones by the order of 1 percent in
magnitude, independent of the factor model chosen (see Figure 4.39). For value-weighted
portfolios, different factor models lead to different returns for the two alternatives. Valueweighted returns and alphas are consistently lower, suggesting that smaller products perform relatively better. Jenkinson et al. (2014). Summarizing the evidence: investment
consultants are not able consistently to add value by selecting superior investment products.
The underperformance of recommended products in the equally weighted case could
be explained by the tendency of consultants to recommend large products that perform
worse. However, after adjusting for different sizes, the explanation turns out to be wrong.
These results raise several questions. First, why do pension funds use - on a rational
basis - investment consultants that add no value? The argument, that consultants act as
insurance against being sued is simply not justifiable. Second, it is difficult to understand
why investment consultants are virtually unregulated in most jurisdictions.

4.13

Trends - Uniformity of Minds

Technology not only connects the different worldwide market places, it also allows information to spread about any given local event, without delay, to the rest of the world.
This fact may also homogenize the way in which people think and make decisions in geographically and culturally different places. Is such an alignment of minds taking place,
and - if so - what are the possible consequences? We follow Bacchetta and van Wincoop
(2013) and Bacchetta et al. (2013), all of whom compare the GRF or Great Recession
of 2008 with the global economic recession (Great Depression) of the 1930s.

4.13.1

The Great Depression and the Great Recession

Figure 4.40 compares the economic impact on the US and non-US economies during the
GFC and the Great Depression.
There was basically no difference during the Great Recession between the GDP
growth in the US and that in the G20 states representing the main worldwide economy without the US. But in the Great Depression, the decline in US GDP growth did
not spread with comparable intensity to the rest of the world. This indicates that while
the Great Recession can be called a global crisis, the Great Depression was more local
in nature. The authors show that the Great Recession was, in historical terms, the first
global recession. The first question is: How could the crisis spread from the US financial
sector to the US real sector? The second question is: Why did the Great Recession
spread almost instantaneously from the US economy to the global economy - how did

4.13. TRENDS - UNIFORMITY OF MINDS

415

Figure 4.39: The table shows the performance of portfolios of actively managed US equity
products that experience a net increase (decrease) in the number of recommendations in
the twelve or twenty-four month period following the recommendation change. Performance is measured using raw returns; returns in excess of a benchmark chosen to match
the product style and market capitalization; and one-, three-, and four-factor alphas
(corresponding to the CAPM, the Fama - French three-factor model, and the Fama French - Carhart model). Excess returns and alphas are expressed in percent per year.
All reported figures are gross of fees. The first part of the table shows the results for
equally weighted portfolios of products whereas the second part of the table shows the
same statistics for portfolios of products weighted using total net assets at the end of the
previous year. t-statistics based on standard errors - robust to conditional heteroscedasticity and serial correlation of up to two lags as in Newey and West (1987) - are reported
in parentheses. ***, **, * mean statistically significant at the 1, 5, and 10 percent levels,
respectively. The benchmarks for the investment products are the corresponding Russell indices. Investment product large cap growth is benchmarked by the Russell 1000
Growth, the small cap value by the Russell 2000 Value, etc. (Jenkinson et al. [2014]).
the recession become a global one?

4.13.2

Uniformity of Minds

Bacchetta and van Wincoop (2013) show that standard macroeconomic approaches fail
to provide convincing explanations. Before we turn to the global issue, we reconsider
the US. First, one can consider - for the US - direct effects of the financial sector on the
real economy. Examples of these direct effects include broken financial intermediation

416

CHAPTER 4.

GLOBAL ASSET MANAGEMENT

Figure 4.40: Comparing global GDP growth (pecent, annual, real) in the Great Recession
and Great Depression for the US and developed non-US countries (Bacchetta and van
Wincoop [2013]).

leading to a credit crunch or stock market declines leading to negative wealth effects.
While such explanations sound convincing, they are flawed due to the main methodical
problem of the GFC not being exogenous but endogenous in the macroeconomic cycle.
That is to say, the impact of the financial crisis is not a separated output variable acting
on the economy but is part of the whole economy and must therefore impact the real
economy.
As many authors have shown, the financial crisis was part of the so-called boom bust cycle of the real economy. Of particular importance are real-estate boom - bust
cycles. Reinhart and Rogoff (2008) illustrate the following pattern. Set T to be the date
of a banking crisis. Consider the growth rate of the real-estate asset class some years
before and after this date. One typically observes that before T prices increase and that
they fall after or shortly before the banking crisis. In this sense, a financial crisis is part
of a boom - bust cycle. The surprising aspect of the most recent crisis was not that
it happened, but that such a crisis could be strong enough to destabilize the financial
system of a developed economy (the US, here).
Given this US view, how could the recession become a global one? The standard
channel for explaining global linkages is trade. But the US is not a very open economy,
and imports - for many countries - to the US are relatively small. There is no empirical

4.13. TRENDS - UNIFORMITY OF MINDS

417

evidence of a link between openness in terms of trade and a decline in growth. Hence,
the macroeconomic trade channel fails to provide an answer to the question of how the
recession spread globally. Another possible channel is the financial channel. That is
to say, the decline is asset prices and real-estate prices and changes to the credit supply
channeled into the real economies outside of the US. But this hypothesis is not supported
by empirical evidence either. While real-estate prices dropped in, say, Spain and Ireland,
they did not in Germany or Switzerland. While Switzerland has a much stronger financial link to the US than do most European countries, the European countries were much
more affected by the Great Recession. While some countries faced a decline in credit
supply, others did not. Although policy makers have often used the expression credit
crunch, firms participating in surveys about the period have indicated that - during the
Great Recession - lower demand was more important to them than reduced credit supply.
Summarizing, standard macroeconomic models cannot explain the global recession.
Bacchetta et al. (2013) argue that there must have been other drivers that caused
the global recession. They argue that it was not the globalization of the economy, as
considered above, but rather the globalization of how individuals form expectations that
was responsible for the recession spreading worldwide. This argument is, of course,
linked to questions of information technology, information transmission, and information
quality in worldwide terms. In contrast with the past, information today is spread almost
in real time around the world, it is more difficult to control information distribution, and
mainstream information is mostly costless to the consumer. Therefore, one can argue
that - given a financial crisis and its related information flow - individuals around the
world had access to similar information sets upon which to form their expectations. The
authors claim that panic, by consumers and firms throughout the world, lead to declines
in aggregated demand in most countries. Such panic must show a systemic component
to have a worldwide impact. They assume therefore that such panic is rational or selffulfilling:
Agents first expect low future income due to the information available and uncertainty at play at the beginning of the financial crisis.
This leads to low current consumption.
This reduction in consumption lowers firms current profits.
This leads to low future production and income, which matches the agents expectations as outlined in the first step.

418

CHAPTER 4.

GLOBAL ASSET MANAGEMENT

Chapter 5

Appendix

Fund name
SPDR S&P 500 ETF
Vanguard 500 Index
Vanguard TSM Idx
Vanguard TSM Idx
Vanguard Instl Indx
PIMCO:Tot Rtn
Vanguard TSM Idx
Vanguard Instl Indx
Fidelity Contrafund
American Funds Gro
American Funds In
American Funds CIB
iShares:Core S&P 500
Dodge & Cox Intl Stock
Vanguard Wellington
Dodge & Cox Stock
American Funds ICA
American Funds CWGI
iShares:MSCI EAFE ETF
Vanguard Tot Bd II
Franklin Cust:Inc
Vanguard Tot Bd
American Funds Wash
Vanguard Tot I Stk
Vanguard TSM Idx;ETF

Largest Mutual Funds and ETF


Total Net Assets in $ mn Performance 5y in % End 2014
200167
n/a
142500
15.84
120162
15.97
116836
16.11
102422
15.85
99799
n/a
92294
16.12
85213
15.87
77083
15.58
74241
13.98
73855
11.42
71468
9.25
70463
15.79
66464
16.61
64341
11.48
60275
15.79
60255
13.83
57569
9.65
54486
5.54
54432
n/a
53520
9.56
53451
4.04
53207
15.07
51562
10.32
50550
16.11

Table 5.1: Source: www.diansfundfreebies.com, date 11.12.2014.


419

420

CHAPTER 5. APPENDIX
Largest Hedge Funds
Hedge Fund
Country
Bridgewater Associates
USA
J.P. Morgan Asset Management
USA
Brevan Howard Capital Management UK
Och-Ziff Capital Management
USA
BlueCrest Capital
UK
BlackRock
USA
USA
AQR Capital Management
Lone Pine Capital
USA
Man Group, London
UK
Viking Global Investors
USA
Baupost Group
USA
Adage Capital Management
USA
Winton Capital Management
UK
GAM Holding, London
UK
Renaissance Technologies
USA
Elliott Management Corporation
USA
D.E. Shaw & Co.
USA
Davidson Kempner Capital Mgt.
USA
Millenium Management
USA
USA
Paulson & Co.
Farallon Capital Management
USA
King Street Capital Management
USA
Appaloosa Management
USA
Canyon Capital Advisors
USA
Two Sigma Investments/Advisers
USA

Total Net Assets in $ mn


87100
59000
40000
36100
32600
31323
29900
29000
28300
27100
26800
25000
24700
24400
24000
23300
22200
22000
21000
20325
19800
19800
19300
17800
17500

Table 5.2: Source J.P. Morgan, 2014.

421

Rank
1
2
3
4
5
6
7
8
9
10
11
12

Largest Custodians
Provider
Assets under custody USD bn
BNY Mellon
28,300
J.P. Morgan
21,000
State Street
20,996
Citi
14,700
BNP Paribas
9,447
HSBC Securities Services
6,210
Northern Trust
5,910
Societe Generale
4,915
Brown Brothers Harriman 3,800
UBS AG
3,438
SIX Securities Services
3,247
CACEIS
3,200
Table 5.3: Source: globalcustody.net.

Reference date
Sep 30, 2014
Mar 31, 2014
Mar 31, 2014
Mar 31, 2014
Jun 30, 2014
Dec 31, 2013
Sep 30, 2014
Sep 30, 2014
Mar 31, 2014
Sep 30, 2014
Dec 31, 2013
Dec 31, 2013

422

CHAPTER 5. APPENDIX

Chapter 6

References
1. Accenture, Digital Business Era: Stretch Your Boundaries, Accenture Technology Vision
2015, 2015.
2. C. Ackermann, R. McEnally and D. Ravenscraft, D. (1999). The Performance of Hedge
Funds: Risk, Return, and ncentives. Journal of Finance, 833-874, 1999.
3. V. Agarwal, N.D. Daniel and N.Y. Naik, Role of Managerial Incentives and Discretion in
Hedge Fund Performance. The Journal of Finance, 64(5), 2221-2256, 20094. G.S. Amin and H.M. Kat, Hedge Fund Performance 1990 - 2000: Do the Money Machines
Really add Value?, Journal of financial and quantitative analysis, 38(02), 251-274, 2003.
5. R.M. Anderson, S.W. Bianchi and L.R. Goldberg, Determinants of Levered Portfolio Performance, Forthcoming Financial Analysts Journal, UCLA at Berkeley, 2014.
6. A. Ang, Mean-Variance Investing, Lecture Notes Columbia University, published in ssrn.com,
2012.
7. A. Ang, Asset Management. A Systematic Approach to Factor Investing, Oxford University Press, 2014.
8. A. Ang, W. Goetzmann, and S. Schaefer, Evaluation of Active Management of the Norwegian GPFG, Norway: Ministry of Finance, 2009. This is also called the Professors
Report.
9. A. Ang, S. Gorovyy and G.B. Van Inwegen, Hedge Fund leverage. Journal of Financial
Economics, 102(1), 102-126, 2011.
10. A. M. Antonopoulos, Mastering Bitcoin, OReilly Books, New York, 2015.
11. F. Allen and D. Gale, Financial Markets, Intermediaries and Intertemporal Smoothing, J.
Pol. Econom., 105, 523-546, 1997.
12. A. Artzner, F. Delbaen, J.-M. Eber and D. Heaths, Coherent Measures of Risk, Mathematical Finance, 9(3), 203-228, 1999.
13. T. Aste, Blockchain, University College London, Center for Blockchain Technologies,
preprint SSRN, 2016.
14. C. S. Asness, Hedge Funds: The (Somewhat Tepid) Defense, AQR, October 24, 2014.

423

424

CHAPTER 6. REFERENCES

15. C.S. Asness, How Can a Strategy Still Work if Everyone Knows About it? International
Invest Magazine, September, 2015.
16. C.S. Asness and J. Liew, The Great Divide of Market Efficiency, Institutional Investor,
March 03, 2014.
17. C.S. Asness, A. Frazzini, R. Israel and T. Mokowitz, Fact, Fiction, and Value Investing,
Forthcoming, Journal of Portfolio Management, Fall 2015, 2015.
18. V. Agarwal, N. D. Daniel, and N. Y. Naik, Do Hedge Funds Manage Their Reported
Returns?, Review of Financial Studies, forthcoming, 2011.
19. V. Agarwal and N.Y. Naik, Multi-Period Performance Persistence Analysis of Hedge Funds.
Journal of financial and quantitative analysis, 35(03), 327-342, 2000.
20. F. Allen, J. Barth and G. Yago, Fixing the Housing Market: Financial Innovations for
the Future, Wharton School Publishing-Milken Institute Series on Financial Innovations,
Upper Saddle River, NJ: Pearson Education, 2012.
21. F. Allen and G. Yago, Financing the Futures. Market-Based Innovations for Growth.
Wharton School of Publishing and Milken Institute, 2012.
22. G.O. Aragon and J.S. Martin, A Unique View of Hedge Fund Derivatives Usage: Safeguard
or Speculation?. Journal of Financial Economics, 105(2), 436-456, 2012.
23. Assenagon Asset Management, 1. Assenagon Derivatetag am See, 2013.
24. M. Avellaneda and D. Dobi, Structural Slippage of Leveraged ETFs. Available at SSRN
2127738, 2012.
25. D. Avramov, R. Kosowski, N.Y. Naik and M. Teo, Hedge Funds, Managerial Skill, and
Macroeconomic Variables. Journal of Financial Economics, 99(3), 672-692, 2011.
26. Ph. Bacchetta, C. Tille and E. van Wincoop, Self-Fulfilling Risk Panics, American Economic Review 102, 3674-3700, 2013.
27. K. E. Back, Asset Pricing and Portfolio Choice Theory, Oxford University Press, 2010.
28. Bank of England, The Economics of Digital Currencies, Quarterly Bulleting, Q3, 2014.
29. M. Baker, B. Bradley and J. Wurgler, Benchmarks as Limits to Arbitrage: Understanding
the Low-Volatility Anomaly, Financial Analysts Journal, 67(1):40-54, 2011.
30. N. Barberis and A. Shleifer, Style Investing, Journal of Financial Economics 68 (2), 181-99,
2003.
31. L. Barras, O. Scaillet, and R. Wermers, False Discoveries in Mutual Fund Performance:
Measuring Luck in Estimated Alphas, The Journal of Finance 65.1, 179-216, 2010.
32. G. Baquero and M. Verbeek, A Portrait of Hedge Fund Investors: Flows. Performance
and Smart Money, SSRN, 2005.
33. P.A. Bares, R. Gibson and Gyger, Performance in the Hedge Funds Industry: An Analysis
of Short-and Long-Term Persistence, The Journal of Alternative Investments, 6(3), 25-41,
2003.
34. I. Ben-David, F. Franzoni, A. Landier and R. Moussawi, 2012, Do Hedge Funds Manipulate
Stock Prices, Fisher College of Business Working Paper Series.

425
35. R. Berentsen and F. Schaer, Bitcoin: A Currency Here to Stay?, Swiss Finance Institute
Seminar, Zurich, October, 2014.
36. P. L. Bernstein, Wimps and Consequences, The Journal of Portfolio Management, p.1,
1999.
37. Black Rock, ETF landscape: Global Handbook Q1, 2011.
38. F. Black and R. Litterman, Robert, Asset Allocation: Combining Investor Views with
Market Equilibrium, Goldman Sachs Fixed Income Research Note, September, 1990.
39. R. B. Bliss and R. Steigerwald, Derivatives Clearing and Settlement: A Comparison of
Central Counterparties and Alternative Structures, Economic Perspectives, 30(4), 2006.
40. D. Blitz, Strategic Allocation to Premiums in the Equity Market, ssrn.com, 2011.
41. J.-P. Bouchaud and M. Potters, Financial Applications of Random Matrix Theory: a Short
Review, arXiv preprint arXiv:0910.1205, 2009.
42. D. Blitz, Is Rebalancing the Source of Factor Premiums?, The Journal of Portfolio Management, Summer 2015, 2015.
43. M.W. Brandt, Portfolio Choice Problems, Brandt, in Y. Ait-Sahalia and L.P. Hansen
(eds.), Handbook of Financial Econometrics, Volume 1: Tools and Techniques, North
Holland, 269-336, 2010.
44. M. Brenner and Y. Izhakian, Asset Prices and Ambiguity: Empirical Evidance, Stern
School of Business, Finance Working Paper Series, FIN-11-10, 2011.
45. R. Brian, F. Nielsen and D. Steffek, Portfolio of Risk Premia: A New Approach to Diversification, MSCI Barra Research Insights, 2009.
46. R. G. Brown, J. Carlyle, I. Grigg and M. Hearn, Corda: An Introduction, squarespace.com,
2016.
47. S.J. Brown, W. Goetzmann, R.G. Ibbotson and S.A. Ross, Survivorship Bias in Performance Studies, Review of Financial Studies, 5(4), 553-580, 1992.
48. S.J. Brown, W. Goetzmann and R.G. Ibbotson, Offshore Hedge Funds: Survival and
Performance, 1989-95, Journal of Business, 72(1), 1999.
49. S.J. Brown, W. Goetzmann and J.M. Park, Conditions for Survival: Changing Risk and
the Performance of Hedge Fund Managers and CTAs, SSRN, 1999.
50. B. Bruder, N. Gaussel, J.-C. Richard and T. Roncalli, Regularization of Portfolio Allocation, Lyxor White Paper Series, 10, 2013.
51. A. Corbellini, Elliptic Curve Cryptography: A Gentle Introduction, webpage of A. Corbellini, 2015.
52. R.J. Caballero, Macroeconomics after the Crisis: Time to Deal with the Pretense-ofKnowledge Syndrome, Journal of Economic Perspectives, Volume 24, Number 4, Fall, 85
- 102, 2010.
53. R.J. Caballero and A. Krishnamurthy, Collective risk management in a flight to quality
episode. The Journal of Finance, 63(5), 2195-2230, 2008.
54. C. Camerer, G. Loewenstein, and D. Prelec. Neuroeconomics: How Neuroscience can
Inform Economics. Journal of economic Literature: 9-64, 2005.

426

CHAPTER 6. REFERENCES

55. J.Y. Campbell and L. M. Viceira, Strategic Asset Allocation: Portfolio Choice for LongTerm Investors, books.gooble.com; 2002.
56. C. Cao, Y. Chen, B. Liang and A.W. Lo, Can Hedge Funds Time Market Liquidity?,
Journal of Financial Economics, 109(2), 493-516, 2013.
57. M.M. Carhart, On Persistence in Mutual Fund Performance, The Journal of finance, 52(1),
57-82, 1997.
58. Z. Cazalet and T. Roncalli, Style Analysis and Mutual Fund Performance Measurement
Revisited, Lyxor Research Paper, 2014.
59. Y. Chen, Timing Ability in the Focus Market of Hedge Funds, Journal of Investment
Management, 5(2), 66, 2007.
60. Y. Chen, Derivatives Use and Risk Taking: Evidence from the Hedge Fund industry,
Journal of Financial and Quantitative Analysis, 46(04), 1073-1106, 2011.
61. CEM Benchmarking, CEM Toronto, 2014.
62. M.M. Christensen, On the History of the Growth Optimal Portfolio, University Southern
Denmark, Preprint, 2005.
63. J. Cochrane, Asset Pricing, Princeton University Press, 2005.
64. J. Cochrane, The Dog That Did Not Bark: A Defense of Return Predictability, Review of
Financial Studies 21 (4): 1533 - 75, 2077.
65. J. Cochrane, Discount Rates, Presidential Address AFA 2010, Journal of Finance, Vol
LXVI, 4, August, 2011.
66. N. Cuche-Curti, O. Sigrist and F. Boucard, Blockchain: An Introduction, Research and
Policy Notes, Swiss National Bank, 2016.
67. C.Culp and J. Cochrane, Equilibrium Asset Pricing and Discount Factors: Overview and
Implications for Derivatives Valuation and Risk Management, Modern Risk Management:
A History. Peter Field, ed. London: Risk Books, 2003.
68. T. Dangl, O. Randl and J. Zechner, Risk Control in Asset Management: Motives and
Concepts, K. Glau et al. (eds), Innovation in Quantitative Risk Management, Springer
Proceedings in Mathematics and Statistics 99, 239-266, 2015.
69. L. Deville, Exchange Traded Funds: History, Trading, and Research, Handbook of Financial Engineering, Zopounidis, Doumpos and Pardalos (eds)., 67-99, 2007.
70. K. Daniel and T. Moskowitz, Momentum Crashes, The Q-Group: Fall Seminar, 2012.
71. K. Daniel and S. Titman, Evidence on the Characteristic of Cross Sectional Variation in
Stock Returns, Journal of Finance 55 (1), 380-406, 1997.
72. Deutsche Bank, Equity Risk Premia, Deutsche Bank London, February, 2015.
73. Deutsche Bank, A New Asset Allocation Paradigm, Deutsche Bank London, July, 2012.
74. F.X. Diebold, A. Hickman, A. Inoue, and T. Schuermann, Converting 1-Day Volatility to
h-Day Volatility: Scaling by Root-h is Worse than You Think, Risk, 11, 104-107, 1998.
75. D. Dobi and M. Avellaneda, Structural Slippage of Leveraged ETFs, Preprint NYU, 2012.

427
76. J. Dow and S. R. d. C.Werlang, Uncertainty Aversion, Risk Aversion, and the Optimal
Choice of Portfolio, Econometrica, Vol. 60, No. 1, 197 - 204, 1992.
77. M. Dudler, B. Gmr and S. Malamud, Risk-Adjusted Time Series Momentum, Working
Paper, 2014.
78. S. Duivestein, M. van Doorn, T. van manen, J. Bloem and E. van Ommeren, Design to
Disrupt, Blockchain: Cryptoplatform for a Frictionless Economy, SogetiLabs, 2016.
79. Edwards, F. R. and Caglayan, M. O. (2001). Hedge Fund Performance and manager skill.
Available at SSRN 281524.
80. D. Ellsberg, Risk, Ambiguity, and the Savage Axioms, Quarterly Journal of Economics,
75, 643-669, 1961.
81. E.J. Elton and M. J. Gruber, Risk Reduction and Portfolio Size: An Analytical Solution,
Journal of Business: 415-437, 1977.
82. Ernst & Young, Whats new? Innovation for Asset Management, 2012 Survey, 2012.
83. Ethereum, www.ethereum.org, 2016.
84. F. Fabozzi, R. J. Shiller, and R. Tunaru, Hedging Real-Estate Risk, working paper 09-12,
Yale International Center for Finance, 2009.
85. M. Faber, A Quantitative Approach to Tactical Asset Allocation. Journal of Wealth
Management 9 (4), 69 - 79, 2007.
86. E.F. Fama, The Behavior of Stock Market Prices, Journal of Business, 38, 34-101, 1965.
87. E.F. Fama, Efficient Capital Markets: A Review of Theory and Empirical Work, Journal
of Finance 25, 383 - 417, 1970.
88. E.F. Fama, Efficient Markets: II, Journal of Finance, 46(5), 1575-1618, 1991.
89. E.F. Fama and K. R. French, Permanent and Temporary Components of Stock Prices,
Journal of Political Economy 96: (2): 246 - 67. 1988.
90. E.F. Fama and K.R. French, Disagreement, Tastes, and Asset Prices, Journal of Financial
Economics 83 (3), 667-89, 2007.
91. E.F. Fama and K.R. French, A Five-Factor Asset Pricing Model, Journal of Financial
Economics, 116, 1-22, 2015.
92. B. Fastrich, S. Paterlini and P. Winker, Constructing Optimal Sparse Portfolios Using
Regularization Methods, ssrn, 2013.
93. A. Frazzini and L. H. Pedersen, Betting Against Beta, Journal of Financial Economics
111.1, 1-25, 2014.
94. P. Franco, Understanding Bitcoin: Cryptography, Engineering and Economics. John Wiley& Sons, 2014.
95. J. Freire, Massive Data Analysis: Course Overview, NYU School of Engineering, 2015.
96. W. Fung, D.A. Hsieh, N.Y. Naik and R. Ramadorai, Hedge Funds: Performance, Risk,
and Capital Formation, The Journal of Finance, 63(4), 1777-1803, 2008.
97. W. Fung and D.A. Hsieh, Empirical Characteristics of Dynamic Trading Strategies: The
Case of Hedge Funds, Review of financial studies, 10(2), 275-302, 1997.

428

CHAPTER 6. REFERENCES

98. W. Gale and R. Levine, Financial Literacy: What Works? How could it be more Effective,
Financial Security Project, Boston College, 2011.
99. M. Gao and J. Huang, Capitalizing on Capitol Hill: Informed Trading by Hedge Fund
Managers, In Fifth Singapore International Conference on Finance, 2011.
100. C. R. Genovese, A Tutorial on False Discovery Control, Carnegie Mellon University, 2004.
101. D.M. Geltner and J. Fisher, Pricing and Index Considerations in Commercial Real Estate
Derivatives Journal of Portfolio Management Special Issue: Real Estate, 1 - 21, 2007.
102. M. Getmansky, B. Liang, C. Schwarz and R. Wermers, Share Restrictions and Investor
Flows in the Hedge Fund Industry, Working Paper, University of Massachusetts, Amherst,
2015.
103. M. Getmansky, M.P. Lee, and A. Lo, Hedge Funds: A Dynamic Industry In Transition,
NBER, 2015.
104. G. Gigerenzer and G.Goldstein, Reasoning the Fast and Frugal Way: Models of Bounded
Rationality, in Heuristics: The Foundations of Adaptive Behavior, eds Gigerenzer G.,
Hertwig R., Pachur T., editors. (New York: Oxford University Press; ), 31-57, 2011.
105. C. Gini, Measurement of Inequality of Incomes, The Economic Journal: 124-126, 1921.
106. P. W. Glimcher, and E. Fehr, eds. Neuroeconomics: Decision making and the brain.
Academic Press, 2013.
107. W.N. Goetzmann, J.E. Ingersoll and S.A. Ross, High-water Marks and Hedge Fund Management Contracts, Journal of Finance 58, 1685 - 1717, 2003.
108. W.N. Goetzmann and A. Kumar, Equity Portfolio Diversification, Review of Finance, Vol.
12, No. 3, 433 - 463, 2008.
109. W.N. Goetzmann and K. Rouwenhorst, The History of Financial Innovation, Carbon Finance Spearker Series at Yale, 2007.
110. A. Goyal and S. Wahal The Selection and Termination of Investment Management Firms
by Plan Sponsors, Journal of Finance 63, 1805 - 1847, 2008.
111. M. Grinblatt and S.Titman, Mutual Fund Performance: An analysis of quarterly portfolio
holdings, Journal of business: 393-416, 1989.
112. R.C. Grinold, The Fundamental Law of Active Management, The Journal of Portfolio
Management 15.3, 30-37, 1989.
113. R.C. Grinold and R.N. Kahn, Active Portfolio Management. A Quantitative Approach
for Providing Superior Returns and Controlling Risk, McGraw-Hill, Second Edition, New
York, 2000.
114. S.J. Grossman and J. E. Stiglitz, On the Impossibility of Informationally Efficient Markets.
The American economic review: 393-408, 1980.
115. W. Hallerbach, Disentangling Rebalancing Return, Journal of Asset Management, 15, 301316, 2014.
116. L. Hansen and T. Sargent, Robust Control and Model Uncertainty. American Economic
Review 91 (2), 60-66, University Press, 2008.

429
117. C.R. Harvey, Y. Liu and H. Zhu, The Cross-Section of Expected Returns, Working Paper
SSRN, 2015.
118. J. Hasanhodzic, A. W. Lo, and E. Viola, Is It Real, or Is It Randomized?: A Financial
Turing Test, MIT Working Papers, 2010.
119. M. Hassine and R. Roncalli, Measuring Performance of Exchange Traded Funds. SSRN,
2013.
120. C. Harvey and A. Siddique, Conditional Skewness in Asset Pricing Tests, Journal of Finance, 55:1263-1295, 2000.
121. R. Haugen and A. Heins, Risk and the Rate of Return on Financial Assets: Some old Wine
in new Bottles, Journal of Financial and Quantitative Analysis, 10:775-784, 1975.
122. S. Hayley, Diversification Returns, Rebalancing Returns and Volatility Pumping, City
University London, 2015.
123. J. M. Griffin, Are the Fama and French Factors Global or Country Specific?, Review of
Financial Studies, 15(3), 783-803, 2002.
124. G. He and R. Litterman, The Intuition Behind Black-Litterman Model Portfolios, Goldman
Sachs Asset Management Working paper, 1999.
125. R.D. Henriksson and R.C. Merton, On Market Timing and Investment Performance. II.
Statistical Procedures for Evaluating Forecasting Skills, Journal of business, 513-533, 1981.
126. O.C. Herfindahl, Concentration in the Steel Industry, Diss. Columbia University, 1950.
127. U. Herold, Portfolio Construction with Qualitative Forecasts, Journal of Portfolio Management, Fall 2003, 61-72, 2003.
128. E. Hjalmarsson, Portfolio Diversification Across Characteristics, The Journal of Investing,
Vol. 20, No. 4, 2011.
129. S. Holden and J. VanDerhei, 401 (k) Plan Asset Allocation, Account Balances, and Loan
Activity in 2003, Investment Company Institute, Perspective, Vol. 6, No. 1., 2004.
130. G. Huberman and Z. Wang, Arbitrage Pricing Theory, Federal Reserve Bank of New York
Staff Reports, Staff Report no.216, 2005.
131. J. Huij and M. Verbeek, On The Use of Multifactor Models to Evaluate Mutual Fund
Performance , Financial Management, 38(1), 75-102, 2009.
132. M. Hulbert, The Prescient are Few, New York Times, July 13, 2008.
133. R.G. Ibbotson, P. Chen and K.X. Zhu, The ABCs of Hedge Funds: Alphas, Betas, and
Costs, Financial Analysts Journal, 67(1), 15-25, 2011.
134. T. Idzorek, A Step-By-Step guide to the Black-Litterman Model, Incorporating UserSpecified Confidence Levels, Working paper, 2005.
135. T. Idzore, and M. Kowara, Factor-Based Asset Allocation vs. Asset-Class-Based Asset
Allocation, Financial Analysts Journal, Vol. 69 (3), 2013.
136. A. Ilmanen, Expected Returns: An Investors Guide to Harvesting Market Rewards, Wiley
Finance, 2011.
137. A. Ilmanen and J. Kizer, The Death of Diversification Has Been Greatly Exaggerated, The
Journal of Portfolio Management, Vol. 38, No. 3, 2012.

430

CHAPTER 6. REFERENCES

138. Investment Company Institute, Profile of Mutual Fund Shareholders, 2014, ICI Research
Report, 2014.
139. R. Jagannathan and T. Ma, Risk Reduction in Large Portfolios: Why Imposing the Wrong
Constraints Helps, Journal of Finance 58, 1651 - 1684, 2003.
140. R. Jagannathan, A. Malakhov and D. Novikov, Do Hot Hands Exist Among Hedge Fund
Managers? An Empirical Evaluation. The Journal of Finance, 65(1), 217-255, 2010.
141. N. Jegadeesh and S. Titman, Profitability of Momentum Strategies: An Evaluation of
Alternative Explanations. The Journal of Finance, 56(2), 699-720, 2001.
142. T. Jenkinson, H. Jones and J.V. Martinez, Picking winners? Investment consultants
recommendations of fund managers, Forthcoming Journal of Finance, 2014.
143. M.C. Jensen, Some Anomalous Evidence Regarding Market Efficiency, Journal of Financial
Economics, 6, 95-101, 1978.
144. H. Jiang and B. Kelly, Tail risk and Hedge Fund Returns, Chicago Booth Research Paper,
(12-44), 2012.
145. B. Jones, Re-thinking Asset Allocation - The Role of Risk Factor Diversification, Deutsche
Bank Macro Investment Strategy, September 2011.
146. B. Jones, Rethinking Portfolio Construction and Risk Management, Deutsche Bank Macro
Investment Strategy, January 2012.
147. JP Morgan and Oliver Wyman, Unlocking Economic Advantage with Blockchain. A Guide
for Asset Managers, 2016.
148. Khan Academy, https://www.khanacademy.org.
149. D. Kahneman, Thinking Fast and Slow. New York: Farrar, Straus and Giroux, 2011.
150. D. Kahneman and A. Tversky, Prospect Theory: An Analysis of Decision under Uncertainty, Econometrica 47: 236 - 91, 1979.
151. S. Kandel and R. F. Stambaugh, On the Predictability of Stock Returns: An AssetAllocation Perspective, The Journal of Finance, Vol LI, No. 2, 385-424, 1996.
152. H. Kaya, W. Lee and Y. Wan, Risk Budgeting with Asset Class and Risk Class Approaches, The Journal of Investing, Vol, 21, No. 1, 2012.
153. F. Knight, Risk, Uncertainty, and Profit, New York: Houghton Mifflin, 1921.
154. M.P. Kritzman, Puzzles of Finance: Six Practical Problems and Their Remarkable Solutions, John Wiley, New York, NY, 2000.
155. R. Kunz, Asset Management, DAS in Banking and Finance, SFI, 2014.
156. Y.K. Kwok, Lecture Notes, University of Hong Kong, 2010.
157. E. Jurczenko and J. Teiletche, Active Risk-Based Investing, Working Paper SSRN, 2015.
158. C.H. Lanter, Institutional Portfolio Management, Swiss Finance Institute, Asset Management Program, 2015.
159. O. Ledoit and M. Wolf, Improved Estimation of the Covariance Matrix of Stock Returns
with an Application to Portfolio Selection, Journal of Empirical Finance, 10(5), 603-621,
2003.

431
160. W. Lee, Advanced Theory and Methodology of Tactical Asset Allocation, Duke University,
2000.
161. W. Lee and D.Y. Lam, Implementing Optimal Risk Budgeting, The Journal of Portfolio
Management, 28, 1, 73-80, 2001.
162. B. Lehmann and D.M. Modes, Mutual Fund Performance Evaluation: a Comparison of
Benchmarks and Benchmarks Comparisons, Journal of Finance, pp. 233 - 265 June, 1987.
163. M. Leippold, Resampling and Robust Portfolio Optimization, Lecture Notes University of
Zurich, 2010.
164. M. Leippold, Asset Management, Lecture Notes University of Zurich, 2011.
165. J. Lewellen, S. Nagel and J. Shanken, A Sceptical Appraisal of Asset Pricing Tests, Journal
of Financial Economics 96, 175-194, 2010.
166. H. Li, X. Zhang and R. Zhao, Investing in Talents: Manager Characteristics and Hedge
Fund Performance, Journal of Financial and Quantitative Analysis, 46(01), 59-82, 2011.
167. B. Liang, Hedge Funds: The Living and the Dead. Journal of Financial and Quantitative
Analysis, 35(03), 309-326, 2000.
168. C.-Y. Lin, Big Data Analytics, Lecture Notes, University of Columbia, 2015.
169. A. Lo, Data-Snooping Biases in Financial analysis. AIMR Conference Proceedings. Vol.
1994. No. 9. Association for Investment Management and Research, 1994.
170. A. Lo, Efficient Markets Hypothesis, The New Palgrave: A Dictionary of Economics, L.
Blume, S. Durlauf, eds., 2nd Edition, Palgrave Macmillan Ltd., 2007.
171. D. Luenberger, Projection Pricing, Stanford University, researchgate.net, 2014.
172. F. Maccheroni, M. Marinacci and D. Ruffino, Alpha as Ambiguity: Robust Mean-Variance
Portfolio Analysis, Econometrica. Volume 81, Issue 3, pages 1075 - 1113, May, 2013.
173. G. Magnus, The Age of Ageing: Global Demographics, Destinies, and Coping Mechanisms,
First webcast: The Conference Board, 2013.
174. D. Mahringer, W. Pohl and P. Vanini, Structured Products: Performance, Costs and
Investments, SFI White Papers, 2015.
175. S. Maillard, T. Roncalli and J. Teiletche, On the Properties of Equally-Weighted Risk
Contributions Portfolios, ssrn 1271972, 2008.
176. B.G. Malkiel, The Efficient Market Hypothesis and Its Critics, Journal of economic perspectives, 59-82, 2003.
177. B.G. Malkiel and A. Saha, Hedge Funds: Risk and Return, Financial analyst journal,
61(6), 80-88, 2005.
178. L. Martellini and V. Milhau, Factor Investing: A Welfare-Improving New Investment
Paradigm or Yet Another Marketing Fad? EDHEC-Risk Institute Publication, July, 2015.
179. W. Marty, Portfolio Analytics. An Introduction to Return and Risk Measurement, Springer
Texts in Business and Economics (2nd edition), Springer Berlin, 2015.
180. J. F. May, World Population Policies: Their Origin, Evolution, and Impact, Canadian
Studies in Population 39, No. 1 - 2 (Spring/Summer 2012):125 - 34, Dordrecht: Springer,
2012.

432

CHAPTER 6. REFERENCES

181. McKinsey&Company, Looking Ahead in Turbulent Times - Strategic Imperatives for Asset
Managers Going Forward, SFI Asset Management Education, R. Matthias, 2015.
182. McKinsex&Company, State of the Industry 2014/15 - a Perspective on Global Asset Management, SFI Asset Management Education, R. Matthias, 2015.
183. Melbourne Mercer Global Pension Index, Report, 2015.
184. R. C. Merton, Lifetime Portfolio Selection under Uncertainty: the Continuous-Time Case,
The Review of Economics and Statistics 51 (3): 247 - 257, 1969.
185. R. C. Merton, Optimum consumption and portfolio rules in a continuous-time model,
Journal of Economic Theory 3 (4): 373 - 413, 1971.
186. R. C. Merton, An Intertemporal Capital Asset Pricing Model, Econometrica: Journal of
the Econometric Society, 867-887, 1973.
187. R. C. Merton, On the Pricing of Corporate Debt: The Risk Structure of Interest Rates,
Journal of Finance, 29:449-470, 1974.
188. A. Meucci, Black - Litterman Approach, Encyclopedia of Quantitative Finance, Wiley
Finance, 2010.
189. A. Meucci, Fully Flexible Views: Theory and Practice, SSRN library, 2010b.
190. P. Milnes, The Top 50 Hedge Funds in the World, hedgethink.com, 2014.
191. T. J. Moskowitz, Y.H. Ooi, and L. H. Pedersen, Time series momentum, Journal of Financial Economics 104.2, 228-250, 2012.
192. A. H. Munnell, M.S. Rutledge and A. Webb, Are Retirees Falling Short? Reconciling the
Conflicting Evidence, Reconciling the Conflicting Evidence (November 2014). CRR WP
16, 2014.
193. A.H. Munnell and M. Soto, State and Local Pensions are Different from Private Plans,
Center for Retirement Research at Boston College, Number 1, November, 2007.
194. S. Nakamoto, Bitcoin: A Peer-to-Peer Electronic Cash System, 2008.
195. S.V. Nieuwerburgh and R.S.J. Kojen, Financial Economics, Return Predictability, and
Market Efficiency, University of Tilburg, Preprint, 2007.
196. R. Novy-Marx and J. D. Rauh, Policy Options for State Pension Systems and their Impact
on Plan Liabilities, Journal of Pension Economics and Finance 10.02: 173-194, 2011.
197. S. Pafka and I. Kondor, Estimated Correlation Matrices and Portfolio Optimization, Physica A, 343, 623-634, 2004.
198. A. Patton, T. Ramadorai and M. Streatfield. Change You Can Believe In? Hedge Fund
Data Revisions. Journal of Finance, 2013.
199. L. Pastor, R. F. Stambaugh and L. A. Taylor, Scale and Skill in Active Management,
Journal of Financial Economics, 2014
200. L. Pastor, and R. F. Stambaugh, Comparing Asset Pricing Models: An Investment Perspective, Journal of Financial Economics, 56, 335-381, 2000.
201. A. F. Perold and W. F. Sharpe, Dynamic Strategies for Asset Allocation, Financial Analyst
Journal, Jan, 16-27, 1988.

433
202. G. W. Peters, E. Panayi and A. Chapelle, Trends in Crypto-Currencies and Blockchain
Technologies: A Monetary Theory and Regulation Perspective, arXiv preprint, 2015.
203. E. Podkaminer, Risk Factors as Building Blocks for Portfolio Diversification: The Chemistry of Asset Allocation, Investment Risk and Performance, CFA Institute, 2013.
204. PriceWaterhouseCoupers, Asset Management 2020, A Brave New World, assetmanagement, 2014
205. E. Quian, A Mathematical and Empirical Analysis of Rebalancing Alpha, www.ssrn.com,
2014.
206. N. Rab and R. Warnung, Scaling Portfolio Volatility and Calculating Risk Contributions
in the Presence of Serial Cross-Correlations, arxiv.q-fin.RM, preprint, 2011.
207. M. Rabin, Risk Aversion and Expected-Utility Theory: A Calibration Theorem, Econometrica 68.5, 1281-1292, 2000.
208. T. Ramadorai, Capacity Constraints, Investor Information, and Hedge Fund Returns,
Journal of Financial Economics, 107(2), 401-416, 2013.
209. S. Ramaswamy, Market Structures and Systemic Risks of Exchange-Traded funds, BIS,
2011.
210. S.C. Rambaud, J.G. Perez, M.A. Granero and J.E. Segovia, Markowitz Model with Euclidian Vector Spaces, European Journal of Operational Research, 196, 1245-1248, 2009.
211. R. Rebonato and A. Denev, Portfolio Management under Stress: A Baysian Net Approach
to Coherent Asset Allocation, Cambridge University Press, Cambridge, 2013.
212. J. Rifkin, The Zero Marginal Cost Society: The Internet of Things, the Collaborative
Commons, and the Eclipse of Capitalism, Palgrave Macmillan Trade, 2014.
213. R. Roll, A Critique of the Asset Pricing Theorys Tests, Journal of Financial Economics
4: 129 - 176, 1977.
214. T. Roncalli, Introduction to Risk Parity and Budgeting, Chapman & Hall, Financial Mathematics Series, 2014.
215. S.A. Ross, The Arbitrage Theory of Capital asset Pricing, Journal of Economic Theory
13, 341 - 60, 1976.
216. S. Satchell and A. Scowcroft, A Demystification of the Black-Litterman Model: Managing
Quantitative and Traditional Portfolio Construction, Journal of Asset Management, Vol
1, 2, 138-150, 2000.
217. C.J. Savage, The foundation of statistics, Wiley, New York, 1954.
218. W. F. Sharpe, Capital asset prices: A theory of market equilibrium under conditions of
risk, Journal of Finance, 19 (3), 425-442, 1964.
219. B. Scherer, Portfolio Construction and Risk Budgeting, Third Edition, Risk Books, 2007.
220. SEC, Mutual Funds: A Guide for Investors, New York, 2008.
221. S. Schaefer, Factor Investing, Lecture at SFI Annual Meeting, 2015.
222. P. Schneider, Generalized Risk Premia, Journal of Financial Economics. forthcoming,
2015.

434

CHAPTER 6. REFERENCES

223. P. Schneider, C. Wagner and J. Zechner, Low Risk Anomalies, Preprint SFI, 2016.
224. J. Siegel, Stocks for the Long Run, McGraw-Hill, New York, NY, 1994.
225. R. J. Shiller, The Uses of Volatility Measures in Assessing Market Efficiency, Journal of
Finance 36: 291 - 304, 1981.
226. R. J. Shiller, From Efficient Markets Theory to Behavioral Finance, Journal of Economic,
Perspectives 17 (1): 83 - 104, 2003.
227. R. J. Shiller, Speculative Asset Prices, Cowles Foundation Paper No. 1424, 2014.
228. R. J. Shiller, Market Efficiency and Role of Finance in Society, Key Note Lecture, EFA
2014, Lugano, 2014.
229. R. J. Shiller and A.N. Weiss, Home Equity Insurance, The Journal of Real Estate Finance
and Economics, 19(1): 21-47, 1999.
230. State Street, The Folklore of Finance, Center of Applied Research. 2014.
231. G.V.G. Stevens, On the Inverse of the Covariance Matrix in Portfolio Analysis, The Journal
of Finance, Vol. 53(5), 1821-1827, 1998.
232. R.Sullivan, A. Timmermann, and H. White, Data-snooping, Technical Trading Rule Performance , and the Bootstrap, The Journal of Finance 54 (5), 1647 - 1691, 1999.
233. M. Swan, Blockchain: Blueprint for a New Economy, OReilly Media, 2015.
234. J. Syz, M. Salvi and P. Vanini, Property Derivatives and Index-Linked Mortgages, Journal
of Real Estate Finance and Economics, Vol. 36, No. 1, 2008.
235. J. Syz and P. Vanini, Real Estate, Swiss Finance Institute Annual Meeting, 2008.
236. N. Sullivan, A (Relatively Easy To Understand) Primer on Elliptic Curve Cryptography,
Cloudfare blog, 2013.
237. N. Szabo, Formalizing and Securing Relationships on Public Networks, First Monday, 2(9),
1997.
238. P. Tasca, Economic Foundation of the Bitcoin Economy, University College London, Center
for Blockchain Technologies, Blockchain Workshop Zurich, 2016.
239. N. Taleb, The Black Swan. The Impact of the Highly Improbable. New York: Random
House, 2010.
240. J. Teiletche, Risk-Based Investing: Myths and Realities, CFA UK Masterclass, London
June 9th, 2015.
241. J. Teiletche, Active Risk-Based Investing, CQ Asia, Hong Kong, 2014.
242. M. Teo, The Liquidity Risk of Liquid Hedge Funds, Journal of Financial Economics, 100(1),
24-44, 2011.
243. J. Ter Horst and M. Verbeek, Fund Liquidation, Self-Selection, and Look-Ahead Bias in
the Hedge Fund Industry, Review of Finance, 11(4), 605-632, 2007.
244. J. Treynor and K. Mazuy, Can Mutual Funds Outguess the Market, Harvard business
review, 44(4), 131-136, 1966.

435
245. F. Trojani and P. Vanini, A Note on Robustness in Mertons Model of Intertemporal
Consumption and Portfolio Choice, Journal of Economic Dynamics and Control, Vol. 26,
No. 3, 423-435, 2002.
246. Tu and Zhou, Data-Generating Process Uncertainty, What Difference Does it Make in
Portfolio Decisions?, Journal of Financial Economics, 72, 385-421, 2003.
247. UBS, Strategy and Regulation. Impact of Regulation on Strategy and Execution, SFI
Conference on Managing International Asset Management, N. Karrer, 2015.
248. UBS, Distribution Strategies in Action, SFI Conference on Managing International Asset
Management, A. Benz, 2015.
249. L. Vignola and P. Vanini, Optimal Decision-Making with Time Diversification, Review of
Finance, 6.1, 1-30, 2002.
250. I. Walter, The Asset Management Industry Dynamics of Growth, Structure and Performance , edited By Michael Pinedo and Ingo Walter, 2013.
251. J.H. White, Volatility Harvesting: Extracting Return from Randomness, arXiv, November,
2015.
252. World Economic Forum, The Future of Long-term Investing, New York, 2011.
253. World Economic Forum, Future or Financial Services, New York, 2015.
254. A. Zelltner and V.K. Chetty, Prediction and Decision Problems in Regression Models from
the Baysian Point of View, Journal of the American Statistical Association, 60, 608-616,
1965.
255. ZKB, Index Methoden, internes Dokument, 2013.
256. H. Zou, The Adaptive LASSO and its Oracle Properties, Journal of the American Statistical Association 101(476), 14181429, 2006.
257. G. Zyskind, N. Oz and A. Pentland, Enigma: Decentralized Computation Platform with
Guaranteed Privacy, arXiv preprint, 2015.

Index
Diversification of Minds , 74
Herfindahl Index, 69
Needed Investment Amount, 68
Risk Sclaing, Square-root Rule, 76
Shannon Entropy , 69
Tasche Index, 69
Time Varying Dependence, 67
Two Statistical Propositions, 66

Money-Weighted Rate of Return (MWR),


35
Active versus Passive
Sharpes Arithmetics , 79
Arithmetical Relative Return (ARR), 33
Asset Class
Definition, 26
Asset Decumulation, 5
Asset Management Industry
Expectations 2020, 12
Overview 2002-2015, 11
Wealth 2020, 47
Asset Management Overwiev, 20
Asset Pricing
Equilibrium, 10
Factor Investing, 10
Optimal Portfolio, 10
Average Investment Capital (AIC), 36

Effective Rate, 30
Ellsberg Paradoxon, 89
Fee Models, 6
Fees
Fee Overview, 13
Game Changers, 5
Greece and EU Uncertainty, 91
Growth of Wealth, 5
Growth Rates AuM, 47

Bayesian Approach to Estimation Risk, 93


Benchmark Return, 33
Benchmarking, 88
Black - Litterman, 9
Brinson-Hood-Beebower (BHB) Effect, 33
CAPM, 8
Compounding, 30

Hedge Funds
Hedge Funds Overview, 12
Heuristic Models, 89
Long Run Return and Risk, 61
Longevity and Demographics, 5
Luck and Skill, 12

Macro Economic Uncertainty, 90


Dietz Return, 36
Markowitz Model, 8
Discount Factor, 28
Mean-Variance Problem
Diversification, 62
Introduction, 86
Asset Allocation Europe, 73
Conservative, Balanced, Dynamic, Growth Mean-Surplus, 87
MiFID II, 7
Portfolios, 63
Mutual Funds
Costs and Performance, 78
Different Portfolio Constructions , 71
Mutual Funds vs. UCITS, 12
436

INDEX

437

UCITS, 12

Relative Asset Pricing


APT, 10
No Arbitrage Condition, 28
No Arbitrage, 10
Normalized Portfolio, 32
Return and Leverage, 36
Risk Parity, 8
Parameter Uncertainty, Estimation Risk, 91 Risk Preferences, 84
Pension Funds
DB versus DC, 24
SAA, TAA, 42
Defined Benefit (DB), 22, 24
Self-Financing Strategy, 32
Defined Contribution (DC), 22, 24
Shrinkage Rule, 95
Longevity and Fertility, 23
Sovereigns Wealth Funds (SWFs), 21
Management, 37
Statistical Models, 81
Overview, 22
Technology
SAA, TAA, 38
Big Data, 6
TAA and SAA, 43
FinTech, 6
Technical Interest Rates, 26
FinTech and Big Data Summary, 14
Three Pillar System, 22
Time
Value of Money, 28
Underfunding, 24
Time-Weighted Rate of Return (TWR), 35
Performance Attribution Tree, 34
Portfolio, 32
Portfolio Construction
Overview, 8
Predictability
Definition, 38
Forecast Regression, 40
Martingale, 39
Return Predictability, 41
Projections AuM 2020, 47
Real Estate, 10
Regulation
CIO Investment Process, 55
Client Segmentation, 52
Conduct Risk, 58
Fines in UK, 59
FinTech and Big Data, 56
Hedge Fund Disclosure, 59
Impact Swiss Banking Industry, 50
Intermediation Channel Segmentation,
51
Mandate Solutions, 57
MiFID II, 50
Overview, 48
Product Suitability, 54

Uncertainty, 9
Wealth of Nations, 46
Yield-to-Maturity (YtM), 29

Vous aimerez peut-être aussi