
No, Small Probabilities Are Not Attractive to Sell: A Comment

Nassim N. Taleb
VERY PRELIMINARY DRAFT FOR DISCUSSION (SUBMISSION), CANNOT BE CITED

Owing to the convexity of the payoff of out-of-the-money options, an extremely small probability of a large deviation unseen in past data rationally justifies buying them, or at least justifies excessive caution in not being exposed to them, particularly those options that are extremely nonlinear in response to market movement. One needs, for example, a minimum of 2,000 years of stock market data to assert that some tail options are expensive. The paper starts by discussing errors in Ilmanen (2012), which provides an exhaustive list of the arguments in favor of selling insurance on small-probability events. The paper then goes beyond a discussion of Ilmanen (2012) and presents an approach to analyzing the payoff and risks of options based on the nonlinearities in the tails, along with heuristics to measure the fragility of exposures.

Background
To the question "Do Financial Markets Reward Buying or Selling Insurance and Lottery Tickets?", Ilmanen (2012) offers a negative answer to buying, seemingly backed by a great deal of "empirical" examination and citing a large number of studies. And he concludes that investors should not be insured, or should perhaps sell such insurance. Perhaps too many papers and arguments for comfort. Just as in a complicated detective novel, where it is often the character with the largest number of alibis who turns out to be the murderer, the enumeration of back-up arguments fails to mask that a central element is missing in the paper: nonlinear effects and asymmetries. Just adding these nonlinear responses of tail payoffs does more than reverse the result. The paper also misses the asymmetries in reasoning required for decision-making and statistical decidability. But Ilmanen (2012) is priceless because it includes a review of all supporting arguments against the purchase of small-probability events, and it turns out that there is not a single convincing study to that effect.

Problems Proper to Ilmanen (2012)


More specifically, the elephant in the room in Ilmanen (2012) lies in two large misses.

Problem 1-A) The paper excludes the stock market crash of 1987 from the analysis.

Problem 1-B) The paper ignores the disastrous effect of bank loans (small-probability selling) in the 2008 debacle (and in the 1991 and 1982 credit problems), while explicitly citing Taleb (2004), a paper that includes them as a domain of tail selling. The losses of 2008, estimated by the IMF at ~$5 trillion (before the government transfers and bailouts), would offset every single gain from tail selling in the history of economics.

Excluding the crash of 1987 and bank loans would be like claiming that the 20th century was extremely peaceful by excluding the first and second world wars. These two fallacies on their own would be devastating for the entire idea; the very addition of these events invalidates the conclusion of the paper. But let us examine some more errors related to nonlinearities.

Problem 2-A) The paper makes the severe error of ignoring the effect of Jensen's inequality arising from the nonlinearity of the difference between the VIX and delivered volatility (the VIX, by design, delivers a payoff that is closer to a variance swap). Comparing the VIX to delivered volatility is erroneous: say the VIX was bought at 10% (that is, the component options are purchased at a combination of volatilities that corresponds to a VIX at that level); because of Jensen's inequality, it could benefit from an episode of 5% volatility followed by an episode of 14%, for an average of 9.5%, while Ilmanen seems to treat this 0.5% gap as a loss.

Problem 2-B) Using the VIX is inappropriate for gauging small probabilities. The VIX is not quite representative of the tails; its value is dominated by at-the-money options, and the fact that at-the-money options can be expensive has no bearing on the argument, since we are concerned with the tails (this author, when betting on fat-tailedness, tends to sell at-the-money options, as one can safely say, in agreement with Ilmanen (2012), that, owing to their linearity, they are patently expensive; and such a statement is robust).

The paper has other related confusions from missing the nonlinear nature of the payoffs, such as:

Problem 2-C) Citing the phenomenon called the "long shot bias" while referring to papers on bounded payoffs and payoffs of a binary nature in unrelated domains (what Taleb (2007) calls the "ludic fallacy"). The packages in the papers cited, such as Golec and Tamarkin (1998) and Snowberg and Wolfers (2010), are not sensitive to fat tails (there is no true exposure to extreme payoffs). For a discussion of the problem, see a short note by the author (Taleb, 2012).

Problem 2-D) The study conflates long volatility trading (a more or less convex strategy) with investment in high- or low-volatility stocks, and uses the successful historical performance of investing in low-volatility stocks to back up the idea of selling tail volatility.

Finally, we go beyond Ilmanen (2012) to present a general approach to deriving conclusions from a time series. Even if the Ilmanen paper were devoid of the previous problems, a more general way to look at risk from time series would invalidate it, as follows.

Ilmanen.nb

Errors Proper to General Inference From Time Series


A common error in the interpretation of time series is confusing an estimated value for a true parameter, without establishing the link between the estimator, a realization of a statistical process, and the estimated value. Take M, the true mean of a distribution in the statistical sense, and M*_N, the observed arithmetic mean of N realizations. Note that scientific claims can only be made off M, not M*_N; limiting the discussion to M*_N is mere journalistic reporting. Mistaking M*_N for M is what the author calls the Pinker mistake of calling journalistic claims empirical evidence, as science and scientific explanations are about theories and interpretations of claims that hold outside a sample, not discussions that are limited to the sample. Assume we satisfy standard statistical convergence, where our estimator approaches the true mean (convergence in probability, that is, lim_{N→∞} P[ |M*_N − M| > ε ] = 0). The problem is that for finite N, when the realizations of the process follow a distribution that is skewed left, say bounded by 0 on one side, and is monomodal, E[M*_N] > M pre-asymptotically. Simply, returns in the far left tail are less likely to show up in the sample, while these have large contributions to the first moment. {NOTE TO EDITOR: CAN THE JOURNAL READERS HANDLE THEOREMS? OTHERWISE WE USE WORDS FOR THE ABOVE, BUT A THEOREM CAN PROVE WHY, PRE-ASYMPTOTICALLY, E[M*_N] > M FOR NEGATIVELY SKEWED PROCESSES.} A simple way to see the point: the study assumes that one can derive strong conclusions from a single historical path, not taking into account sensitivity to counterfactuals and completeness of sampling. It assumes that what one sees from a time series is the entire story.
[Figure: probability against outcomes, with an annotation marking the left tail, "where data tend to be missing".]
Figure 1: The Small-Sample Effect and Naive Empiricism: when one looks at historical returns that are skewed to the left, most missing observations are in the left tail, causing an overestimation of the mean. The more skewed the payoff, and the thicker the left tail, the worse the gap between observed and true mean.

Now of concern to us is assessing the stub, or tail bias, that is, the difference between M and M*, or the potential contribution of tail events not seen in the window used for the analysis. When the payoff in the tails is powerful from convex responses, the stub becomes extremely large. So the rest of this note will go beyond Ilmanen (2012) to explain the convexities of the payoffs in the tails and generalize to the classical mistake of testing strategies with explosive tail exposures on a finite historical sample. It will be based on the idea of metaprobability (or metamodel): looking at the effects of errors in models and representations. All one needs is an argument for a very small probability of a large payoff in the tail (devastating for the option seller) to reverse the long shot argument and make it uneconomic to sell a tail option. All it takes is a small model error to reverse the argument.

The Nonlinearities of Option Packages


There is a compounding effect of the rarity of tail events and the highly convex payoff when they happen, a convexity that is generally missed in the literature. To illustrate the point, we construct a "return on theta" (or return on time decay) metric for a delta-neutral package of an option, seen at t0, given a deviation of magnitude N σ_K:

Π(N, K) ≡ (1/θ_{S0, t0, δ}) ( O(S0 e^{N σ_K √δ}, K, T − t0, σ_K) − O(S0, K, T − t0 − δ, σ_K) − Δ_{S0, t0} (1 − S0 e^{N σ_K √δ}) )    (1)

where O(S0, K, T − t0 − δ, σ_K) is the European option price valued at time t0 off an initial asset value S0, with a strike price K, a final expiration at time T, and priced using an implied standard deviation σ_K. The payoff of Π is the same whether O is a put or a call, owing to the delta-neutrality using Δ_{S0, t0} (thanks to put-call parity). θ_{S0, t0} is the discrete change in value of the option over the time increment δ (the change of value of an option in the absence of changes in any other variable). With the increment δ = 1/252, this would be a single business day. We assume interest rates are 0, with no loss of generality (it would be equivalent to expressing the problem under a risk-neutral measure). What Eq. (1) does is re-express, in discrete terms away from the limit δ → 0, the Fokker-Planck-Kolmogorov differential equation (Black-Scholes),

∂f/∂t = −(1/2) σ² S² ∂²f/∂S².

In the standard Black-Scholes world, the expectation of Π(N, K) should be zero, as N follows a Gaussian distribution with mean −σ²/2. But we are not in the Black-Scholes world, and we need to examine payoffs under potential distributions. The use of σ_K neutralizes the effect of "expensive" for the option, as we will be using a multiple of σ_K as N standard deviations; if the option is priced at 15.87% volatility, then one standard deviation would correspond to a move of about 1%, Exp[0.1587/√252] ≈ 1.01.
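Since the argument leans on the Black-Scholes relation between time decay and gamma, a quick numerical check may help. The sketch below uses my own finite-difference bump sizes and parameters (not from the paper) to verify that, at zero rates, the one-period decay of a European call matches −(1/2)σ²S²Γ:

```python
from math import log, sqrt, erf

def Phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def call(S, K, T, v):
    """Black-Scholes European call, zero rates and dividends."""
    d1 = (log(S / K) + 0.5 * v * v * T) / (v * sqrt(T))
    return S * Phi(d1) - K * Phi(d1 - v * sqrt(T))

S, K, T, v = 100.0, 105.0, 0.25, 0.20
h, dt = 0.01, 1e-5   # arbitrary small bumps for the finite differences

# Finite-difference gamma, and theta as the calendar clock advances by dt.
gamma = (call(S + h, K, T, v) - 2 * call(S, K, T, v) + call(S - h, K, T, v)) / h**2
theta = (call(S, K, T - dt, v) - call(S, K, T, v)) / dt

# Heat-equation form of Black-Scholes at zero rates: theta = -(1/2) v^2 S^2 gamma
assert abs(theta + 0.5 * v * v * S * S * gamma) < 1e-2 * abs(theta)
```

The identity is what lets Eq. (1) trade one day of theta against the convexity payoff from a move.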


Clearly, for all K, Π(0, K) = −1: with no move, the package loses exactly one day of time decay. Close to expiration (when T − t0 = δ), the break-even of the option without time premium takes place one mean deviation away, Π(√(2/π), K) ≈ 0, and Π(1, K) > 0.

Convexity and Explosive Payoffs


Of concern to us is the explosive nonlinearity in the tails. Let us examine the payoff of Π across many values of K = S0 Exp[L σ_K √δ], in other words, how many sigmas away from the money the strike is positioned. For a package about 20 σ out of the money, that is, L = 20, the crash of 1987 would have returned 229,000 days of decay, compensating for more than 900 years of wasted premium spent waiting for the result. An equivalent reasoning could be made for subprime loans. From this we can assert that we need a minimum of 900 years of data to start pronouncing these options 20 standard deviations out of the money "expensive," in order to match the frequency that would deliver a payoff, and more than 2,000 years of data to make conservative claims. Clearly, as we can see with L = 0, the payoff is so linear that there is no hidden tail effect.
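A minimal sketch of the return-on-theta computation behind Eq. (1), assuming Black-Scholes put pricing at zero rates. The volatility (15.87%), one-month maturity, and strikes are illustrative choices of mine, so the multiples differ from the 229,000-day figure in the text; but the property Π(0, K) = −1 and the explosion at L = 20 both show up:

```python
from math import log, sqrt, exp, erf

def Phi(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def put(S, K, T, v):
    """European put, zero rates (put-call parity makes the put/call choice irrelevant)."""
    d1 = (log(S / K) + 0.5 * v * v * T) / (v * sqrt(T))
    d2 = d1 - v * sqrt(T)
    return K * Phi(-d2) - S * Phi(-d1)

def put_delta(S, K, T, v):
    d1 = (log(S / K) + 0.5 * v * v * T) / (v * sqrt(T))
    return Phi(d1) - 1.0

def package_return(N, L, v=0.1587, T=30 / 252, d=1 / 252, S0=1.0):
    """Pi(N, K): delta-hedged P&L of a put struck L daily sigmas below spot,
    after an adverse move of N daily sigmas, per unit of one-day time decay."""
    K = S0 * exp(-L * v * sqrt(d))
    decay = put(S0, K, T, v) - put(S0, K, T - d, v)      # one day of theta (> 0)
    S1 = S0 * exp(-N * v * sqrt(d))                      # spot after the move
    pnl = (put(S1, K, T - d, v) - put(S0, K, T, v)
           - put_delta(S0, K, T, v) * (S1 - S0))         # delta-hedged P&L
    return pnl / decay

print(round(package_return(0, 10), 6))    # no move: lose exactly one day of decay, -1.0
print(package_return(20, 20) > 1_000)     # a 20-sigma event repays thousands of days: True
```

With these toy parameters the L = 20 package returns tens of thousands of days of decay on a 20 σ move, while losing exactly one day of decay per quiet day: the asymmetry the section describes.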

Figure 2: Returns for the package Π(N, K = S0 Exp[L σ_K √δ]) at values of L = 0, 10, 20, against N, the conditional sigma deviations.

Figure 3: The extreme convexity of an extremely out-of-the-money option, with L = 20.

Visibly the convexity is compounded by the fat-tailedness of the process: intuitively, a convex transformation of a fat-tailed process, say a power law, produces a power law with considerably fatter tails. The variance swap, for instance, carries half the tail exponent of the distribution of the underlying security, so it would have infinite variance, with a tail exponent of 3/2 off the "cubic" exponent discussed in the literature (Gabaix et al., 2003; Stanley et al., 2000); and some out-of-the-money options are more convex than variance swaps, producing the tail equivalent of an exponent as low as 1/5 over a broad range of fluctuations. For specific options there may not be an exact convex transformation, but we can get a Monte Carlo simulation illustrating the shape of the distribution and visually showing how skewed it is.

Figure 4: In probability space: histogram of the distribution of the returns for L = 10, using power-law returns for the underlying distribution with a tail exponent of 3. TEMPORARY GRAPH WITH 10^4 RUNS; EXPECTING FULL GRAPH WITH >10^6 RUNS.

Footnote 1: This convexity effect can be mitigated by some dynamic hedges, assuming no gaps; but, because of the local time of stochastic processes, some smaller deviations can in fact carry the cost of larger ones: a move of −10 sigmas followed by an upward revision of 5 sigmas can end up costing a lot more than a mere −5 sigmas. Tail events can come from a volatile sample path snapping back and forth.

Fragility Heuristic
Most of the losses from option portfolios tend to take place from the explosion of implied volatility, which acts as if the market had already experienced a tail event (as in 2008). The same result as in Figure 3 can be seen for changes in implied volatility: a fivefold explosion of volatility results in a 10 σ option gaining 270 times its value (the VIX went up more than tenfold during 2008). (In a well-publicized debacle, the speculator Niederhoffer went bust because of explosive changes in implied volatility in his option portfolio, not from market movement; further, the options that bankrupted his fund ended up expiring worthless weeks later.) The Taleb and Douady (2012) and Taleb, Canetti et al. (2012) fragility heuristic identifies convexity to significant parameters as a metric to assess fragility to model error or misrepresentation: by theorem, model error maps directly to nonlinearity of parameters. The heuristic perturbates a parameter, say the scale of a probability distribution, and looks at the effect on the expected shortfall; the same theorem asserts that the asymmetry between gains and losses (convexity) maps directly to the exposure to model error and to fragility. The exercise allows us to re-express the idea of convexity of payoff by ranking effects.


Gain, in multiples of option premium over intrinsic value, from multiplying implied volatility:

Vol multiple    ATM    L=5    L=10    L=20
x2              2      5      27      7,686
x3              3      10     79      72,741
x4              4      16     143     208,429

The table presents the different results (in terms of multiples of option premia over intrinsic value) from multiplying implied volatility by 2, 3, and 4. An option 5 conditional standard deviations out of the money gains 16 times its value when implied volatility is multiplied by 4. Further out-of-the-money options gain exponentially. Note the linearity of at-the-money options.
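The table's pattern can be reproduced with any pricer by multiplying the implied volatility. The maturity and strikes below are my own illustrative assumptions, so the multiples differ from the table's, but the ordering is the same: at the money roughly linear, far out of the money explosive:

```python
from math import log, sqrt, erf

def Phi(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def put(S, K, T, v):
    """European put, zero rates."""
    d1 = (log(S / K) + 0.5 * v * v * T) / (v * sqrt(T))
    return K * Phi(-(d1 - v * sqrt(T))) - S * Phi(-d1)

def vol_gain(K, mult, S=1.0, T=30 / 252, v=0.16):
    """Value multiple when implied volatility is multiplied by `mult`."""
    return put(S, K, T, mult * v) / put(S, K, T, v)

atm = vol_gain(1.00, 4)   # at the money: value ~linear in sigma, so roughly 4x
otm = vol_gain(0.85, 4)   # 15% out of the money: gains orders of magnitude
print(atm, otm)
```

The at-the-money multiple stays close to the volatility multiple itself, while the out-of-the-money multiple runs into the hundreds or thousands, mirroring the growth across the table's columns.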

Conclusion: The Asymmetry in Decision Making


To assert the overpricing (or refute the underpricing) of tail events expressed by convex instruments requires an extraordinary amount of evidence: a much longer time series of the process and strong assumptions about temporal homogeneity. Out-of-the-money options are so convex to events that a single crash (say every 50, 100, 200, even 900 years) can be sufficient to justify skepticism about selling some of them (or to justify avoiding selling them): those whose convexity matches the frequency of the rare event. The further out in the tails, the fewer claims one can make about their "value," their state of being "expensive," etc. One can perhaps make claims on "bounded" variables, not on the tails. To repeat, this error of extending claims from small samples to matters subject to shocks is not empiricism; rather, it violates every standard of statistical method. Sadly, nobody claims it is irrational to finance an army, a police force, homeland security, or extra safety for airplanes and other expensive hedges on the empirical evidence that there have been no world wars since 1945. When it comes to investments, it appears that the lack of skin in the game of researchers is behind all manner of violations of both statistical theory and common sense. For, in the end, we use statistics for decisions. There are very large asymmetries in tail selling, with mistakes disproportionately costlier in the tails. Research on overpricing of the type of Ilmanen (2012) is completely unreliable, and there are some errors not worth making.

Acknowledgments
Bruno Dupire, Xavier Gabaix, Mark Spitznagel, Michael Mauboussin, Phil Maymin, Aaron Brown

References
Ilmanen, Antti, 2012, "Do Financial Markets Reward Buying or Selling Insurance and Lottery Tickets?" Financial Analysts Journal, September/October, Vol. 68, No. 5: 26-36.
Gabaix, X., P. Gopikrishnan, V. Plerou, and H.E. Stanley, 2003, "A Theory of Power-Law Distributions in Financial Market Fluctuations," Nature, 423, 267-270.
Golec, Joseph, and Maurry Tamarkin, 1998, "Bettors Love Skewness, Not Risk, at the Horse Track," Journal of Political Economy, Vol. 106, No. 1 (February): 205-225.
Snowberg, Erik, and Justin Wolfers, 2010, "Explaining the Favorite-Longshot Bias: Is It Risk-Love or Misperceptions?" Working paper (April).
Stanley, H.E., L.A.N. Amaral, P. Gopikrishnan, and V. Plerou, 2000, "Scale Invariance and Universality of Economic Fluctuations," Physica A, 283, 31-41.
Taleb, N.N., 2004, "Bleed or Blowup? Why Do We Prefer Asymmetric Payoffs?" Journal of Behavioral Finance, Vol. 5, No. 1: 2-7.
Taleb, N.N., 2012, "A Short Note to Explain Why Prediction Markets & Game Setups Have Nothing to Do With Real World Exposures," Technical Companion for the Incerto, www.fooledbyrandomness.com.
Taleb, N.N., 2012, "What Can Social Scientists Learn From the Pinker Mistake?" in preparation.



A Short Note to Explain Why Prediction Markets & Game Setups Have Nothing to Do With Real World Exposures
A SHORT EXPLANATORY TECHNICAL NOTE, NOT A PAPER

Nassim N. Taleb
NYU-Poly
November 2012

Abstract
This note explains how and where prediction markets (or, more generally, discussions of betting matters) do not correspond to reality and have little to do with exposures to fat tails and Black Swan effects. Elementary facts, but with implications. It shows how, for instance, the long shot bias is misapplied to real-life variables, and why political predictions are more robust than economic ones. The discussion is based on Taleb (1997), which shows the difference between a binary and a vanilla option.

Definitions
A binary bet (or just a binary, or a digital): an outcome with payoff 0 or 1 (or yes/no, -1/1, etc.). Examples: prediction markets, elections, most games and lottery tickets; any statistic based on a YES/NO switch. Binaries are effectively bets on probability. They are rarely ecological, except for political predictions. (More technically, they are mapped by the Heaviside function.)

An exposure or vanilla: an outcome with no open limit: say revenues, market crashes, casualties from war, success, growth, inflation, epidemics... in other words, about everything. Exposures are generally expectations, or the arithmetic mean, never bets on probability alone, rather the pair probability × payoff.

A bounded exposure: an exposure (vanilla) with an upper and a lower bound: say an insurance policy with a cap, or a lottery ticket. When the boundary is close, it approaches a binary bet in properties. When the boundary is remote (and unknown), it can be treated like a pure exposure. The idea of clipping the tails of exposures transforms them into this category.

The Problem
The properties of binaries diverge from those of vanilla exposures. This note shows how the conflation of the two takes place, in prediction markets and in the ludic fallacy (using the world of games to apply to real life):

1. They have diametrically opposite responses to skewness.
2. They respond differently to fat-tailedness (sometimes in opposite directions). Fat tails make binaries more tractable.
3. A rise in complexity lowers the value of the binary and increases that of the exposure.

Some direct applications: 1- Studies of long shot biases that typically apply to binaries should not port to vanillas. 2- Many are surprised that I find many econometricians to be total charlatans while finding Nate Silver immune to my critique; this explains why. 3- Prediction markets provide very limited information outside specific domains. 4- Etc.

The Elementary Betting Mistake


One can hold beliefs that a variable can go lower yet bet that it is going higher. Simply, the digital and the vanilla diverge: P(X > X0) > 1/2, but E(X) < E(X0). This is normal in the presence of skewness and extremely common with economic variables. Philosophers have a related problem called the lottery paradox, which in statistical terms is not a paradox.
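A toy example with invented numbers makes the divergence concrete: the binary bet on direction wins 90% of the time while the vanilla expectation loses:

```python
# Hypothetical skewed variable: from X0 = 100 it ends at 101 with
# probability 0.9 (the binary bet on "up" wins)...
# ...but at 80 with probability 0.1 (the vanilla expectation loses).
X0 = 100.0
outcomes = {101.0: 0.9, 80.0: 0.1}

p_up = sum(p for x, p in outcomes.items() if x > X0)          # 0.9 > 1/2
expectation = sum(x * p for x, p in outcomes.items())         # 98.9 < X0

print(p_up > 0.5, expectation < X0)   # True True
```

Any left-skewed variable of this kind rewards the digital and punishes the vanilla, which is why the two instruments price so differently.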


The Elementary Fat Tails Mistake


A slightly more difficult problem. When I ask economists or social scientists "what happens to the probability of a deviation > 1σ when you fatten the tail (while preserving other properties)?", almost all answer: it increases (so far, all have made the mistake). Wrong. They miss the idea that fat tails means a larger contribution of the extreme events to the total properties, and that it is the pair probability × payoff that matters, not just the probability. I have asked variants of the same question. The Gaussian distribution spends 68.2% of the time between ±1 standard deviation. The real world has fat tails. In finance, how much time do stocks spend between ±1 standard deviation? The answer has invariably been "lower. Why? Because there are more deviations." Sorry, there are fewer deviations: stocks spend between 78% and 98% of the time within ±1 standard deviation (computed from past samples).

Some simple derivations. Let x follow a Gaussian distribution (μ, σ). Assume μ = 0 for the exercise. What is the probability of exceeding one standard deviation?

P(> 1σ) = 1 − (1/2) erfc(−1/√2),

where erfc is the complementary error function; P(> 1σ) = P(< −1σ) ≈ 15.86%, and the probability of staying within the "stability tunnel" between ±1σ is ≈ 68.2%. Let us fatten the tail using a standard method, a linear combination of two Gaussians with standard deviations σ√(1 + a) and σ√(1 − a), where a is the "vvol" (this combination is variance-preserving, a technicality of no big effect here, as a standard-deviation-preserving spreading gives the same qualitative result). Such a method immediately raises the kurtosis by a factor of (1 + a²), since E[x⁴]/E[x²]² = 3(a² + 1). Then

P(> 1σ) = P(< −1σ) = 1 − (1/4) erfc(−1/(√2 √(1 − a))) − (1/4) erfc(−1/(√2 √(1 + a))).

So then, for different values of a, as we can see, the probability of staying inside ±1 sigma increases.
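The computation is mechanical to verify with the erfc-based formulas above; a sketch (the specific values of a are arbitrary choices):

```python
from math import erf, sqrt

def Phi(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def p_outside(a):
    """P(|x| > 1 sigma) for the half/half mixture of N(0, 1+a) and N(0, 1-a)."""
    tail = lambda var: 2.0 * (1.0 - Phi(1.0 / sqrt(var)))
    return 0.5 * tail(1.0 + a) + 0.5 * tail(1.0 - a)

def kurtosis(a):
    """E[x^4]/E[x^2]^2 for the mixture (unit total variance): 3(1 + a^2)."""
    return 3.0 * (0.5 * (1 + a) ** 2 + 0.5 * (1 - a) ** 2)

for a in (0.0, 0.3, 0.6, 0.9):
    print(round(kurtosis(a), 2), round(p_outside(a), 4))
# kurtosis rises with a, yet the probability of leaving the tunnel falls
```

The run shows exactly the trap the questions above set: fattening the tails raises the kurtosis while shrinking the probability of exiting the ±1σ tunnel.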
[Figure: densities of the mixture for several values of a.] Fatter and fatter tails: different values of a. Notice that the higher the peak, the lower the probability of leaving the ±1 σ tunnel.

The Event Timing Mistake


Fatter tails increase the time spent between deviations, giving the illusion of absence of volatility when in fact events are delayed and made worse (my critique of the "Great Moderation"). Stopping time and fattening of the tails of a Brownian motion: consider the distribution of the time it takes for a continuously monitored Brownian motion S to exit a "tunnel" with a lower bound L and an upper bound H. Counterintuitively, fatter tails make an exit (at some sigma) take longer: you are likely to spend more time inside the tunnel, since exits are far more dramatic. ψ is the distribution of the exit time τ, where τ ≡ inf {t: S_t ∉ [L, H]}. From Taleb (1997) we have the following approximation:

ψ(t | σ) ≈ (π σ² e^{−t σ²/8}) / (2 (log H − log L)²) Σ_{n=1}^∞ (−1)ⁿ n e^{−(n² π² t σ²) / (2 (log H − log L)²)} ( √(L/S) sin( n π (log L − log S) / (log H − log L) ) − √(H/S) sin( n π (log H − log S) / (log H − log L) ) )
and the fatter-tailed distribution from mixing Brownians with sigmas separated by a coefficient a:

ψ(t | σ, a) = (1/2) ψ(t | σ √(1 − a)) + (1/2) ψ(t | σ √(1 + a))

This graph shows the lengthening of the stopping time between events coming from fatter tails.
[Two panels: the probability density of the exit time, and the expected exit time as a function of the vvol a.]
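The direction of the effect is easy to confirm by crude Monte Carlo (the tunnel width, volatility, and vvol below are arbitrary illustrative choices): paths drawn from the two-regime mixture take longer, on average, to exit the tunnel than pure Gaussian paths of the same overall variance.

```python
import math
import random

random.seed(11)

def exit_time(sigma, half_width=0.05, dt=1 / 252, max_steps=50_000):
    """Steps until a driftless discretized log-Brownian path leaves [-hw, +hw]."""
    x, steps = 0.0, 0
    while abs(x) < half_width and steps < max_steps:
        x += sigma * math.sqrt(dt) * random.gauss(0.0, 1.0)
        steps += 1
    return steps

def mean_exit(a, sigma=0.16, paths=3_000):
    """Average exit time; each path lives in one regime of the mixture,
    sigma*sqrt(1-a) or sigma*sqrt(1+a), taken half/half."""
    total = 0
    for i in range(paths):
        s = sigma * math.sqrt(1 - a if i % 2 == 0 else 1 + a)
        total += exit_time(s)
    return total / paths

base, fat = mean_exit(0.0), mean_exit(0.6)
print(base, fat)   # the fatter-tailed mixture stays in the tunnel longer
```

The lengthening comes from convexity again: expected exit time scales like 1/σ², so averaging over a quiet and a wild regime lengthens the mean stay even though total variance is unchanged.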

More Complicated: MetaProbabilities (TK)


{This is a note: a more advanced discussion explains why a more uncertain mean (vanilla) might mean a less uncertain probability (prediction), etc. See also "Why We Don't Know What We Are Talking About When We Talk About Probability."}

Graphical Representation

