Probability Theory

A primer on probability theory in financial modeling
Sergio M. Focardi
Tel: +33 1/45 75 51 74
interteksf@aol.com
The Intertek Group, 94, rue de Javel, F-75015 Paris
Tutorial 2001-01
The objective of finance theory is to predict the future evolution of financial quantities, such as the price of a single asset or broad market movements. Uncertainty as to the future evolution of prices is a fundamental tenet of modern finance theory. The paradigm of choice for modeling uncertainty in finance is probability theory. This tutorial presents the formal probabilistic concepts behind today's financial modeling.
Introduction
The theory of finance is a mathematical theory that describes the time evolution of financial quantities. To do so, it uses a mathematical formalism similar to that of the physical sciences: the results, either actual or possible, of empirical observations can be predicted through a sequence of purely logical operations. Given today's low-cost, high-performance computers, predictions are generally obtained by running mathematical models on computers. As in the physical sciences, the objective of finance theory is to make predictions. Predictions might concern phenomena such as the future value of a single stock or currency, or future market movements in a given geographical block.
But there are important differences between finance theory and theory in the physical sciences. Finance theory does not describe laws of nature but a complex human artifact, the financial markets. Physical sciences that describe a complex system, such as weather, are typically supported by basic physical laws; finance theory describes a complex system for which there is (presently) no description in terms of elementary laws or elementary components.
Finance theory lacks a mathematical theory that allows one to compute the evolution of a system (in this case, a financial system) starting from initial conditions. In the absence of this, finance theory consists of two separate components: 1) a set of relationships that constrain the entire market given the time evolution of fundamental determinants and 2) a set of assumptions on the time evolution of those fundamental determinants. Determining the appropriate assumptions about the evolution of fundamental determinants is more the domain of financial econometrics than of finance theory.
The above considerations are central to an understanding of financial modeling. In fact, derivative pricing models are (generally) algorithms that compute the price evolution of derivatives starting from assumptions on the evolution of the price of the underlying. The choice of a model must take into account its ability to describe not only the underlying (such as the term structure of interest rates) but also the functioning of markets. Let's now briefly present the development of finance theory.
1.1 Competitive markets under uncertainty
During the 1950s, the economists Kenneth Arrow and Gérard Debreu proposed extending to financial markets the notions of microeconomics and, consequently, the analysis of competitive markets. A classical reference on microeconomics is Varian (1992). Let's review the fundamental points.
Following classical notions, a competitive exchange market (without production) is a mechanism that allows N agents to exchange a given good. The fundamental notion of competitive exchange is that of supply and demand. It is assumed that each agent is characterized by a demand function, which prescribes the quantity of a good that the agent is willing to buy at a given price. The aggregation of individual demand produces the market demand function, i.e., the total amount of the good that agents will purchase at any given price. There is a parallel supply function, which associates with each price the quantity of the good offered for sale. The market supply function derives from the aggregation of the individual supply functions of the agents.
The classical analysis assumes perfectly competitive markets, which have a number of characteristics. First, there are no costs or constraints associated with the exchange of goods; price and quantity are the only determinants of exchange. Second, each agent is individually too small to influence the market; prices are determined by the collective action of agents. Given the market price, each agent will buy or sell exactly the quantity prescribed by its demand or supply function, under the constraint of its financial endowment, which prescribes the maximum total amount that the agent can purchase.
If there is not one but several goods to choose from, agents must select a basket (panel) of goods. It is assumed that agents are able to order their preferences, i.e., that they are able to decide whether they are indifferent between different baskets of goods or prefer one to another. It is possible to demonstrate that, under assumptions of continuity in the ordering of preferences, preferences themselves can be expressed through a utility function. A utility function is a numerical function defined over baskets of goods, i.e., it is a function that assigns a numerical utility index to each basket. Preferred baskets are assigned higher values of the utility function.
Therefore, in a competitive exchange market there are N agents, each characterized by a financial endowment and by a utility function. To each set of prices of the goods there corresponds a basket of goods chosen by each agent. The aggregation of choices leads to the aggregate market demand for each good. The key point of microeconomic theory is to determine if and how the market reaches its equilibrium point, which is defined as the point at which aggregate demand is exactly equal to aggregate supply. This might seem a banal problem, but it involves one of the key results of twentieth-century mathematics: Brouwer's fixed-point theorem. The fixed-point theorem can be stated in many ways. In its simplest formulation, it states that a continuous function that maps a closed, bounded interval into itself has a fixed point, i.e., a point where the value of the function equals its argument. Each level of supply corresponds to a price, which in turn induces a level of demand. The equilibrium problem is to find the fixed point of this supply and demand function.
The above notion of competitive markets does not consider uncertainty. To extend classical microeconomic analysis to include uncertainty, Kenneth Arrow reasoned as follows. At every instant, agents exchange not only goods but also contracts that will be executed in the future. The future result of such contracts is uncertain.
A stock, for example, is a contract that gives its holder the right to receive future dividends and any final liquidation price, but the amount of these payments at the time of execution is uncertain. Suppose, for simplicity, that there is only one period and thus two dates: the initial instant $T_0$ and the final instant $T_1$. At instant $T_0$, agents exchange contracts that give them the right to receive an uncertain amount of goods at instant $T_1$. Abstracting from physical goods, agents exchange at instant $T_0$ contracts that give them the right to receive an uncertain payment at instant $T_1$. Following Arrow, suppose that the economy might be, at instant $T_1$, in one of $k$ possible different states. We can now observe that each good-state pair can be considered a different good with an associated market supply and demand and a market price. Each contract will produce a different outcome as a function of the state realized at instant $T_1$. Every contract is therefore a contingent claim, i.e., each contract gives the right to a payment or a delivery of goods contingent on the realized state. If we now assume that agents have utility functions defined on quantities for each state, we have placed the analysis of markets under uncertainty in the framework of the analysis of deterministic markets.
If we now drop the single-period assumption, we can apply the same reasoning to a model that assumes that agents perform market operations (i.e., they exchange contracts or trade) at each of M future dates. Utility functions must be defined for each good, for each state, and for each instant. The number of trading instants can be either finite or infinite.
The above description of financial markets is highly idealized and serves only as a framework for models and application software. It is, however, the conceptual basis for understanding modern finance theory. In a number of applications, it is used directly. For example, optimization software used in investment management is based on the above theoretical model.
Let's now explore how the previous framework translates into a mathematical probabilistic description of financial markets. It would seem natural to define the states of the economy as instantaneous states at each trading date. From the point of view of the mathematical description of the economy, however, it is more convenient to stipulate that each state is an entire possible history of the economy over a given time period. The following paragraphs will describe how a probability structure can be imposed on this set of states.
The mathematical representation of uncertainty
Today's finance theory is based on the hypothesis that uncertainty about future prices cannot be eliminated; a fundamental tenet is that no entity can attain a deterministic description of the evolution of the economy, with the exception of the evolution of risk-free assets. A typical practical example of a risk-free asset is U.S. government debt. The inability to build purely deterministic models of the economy calls for the mathematical representation of uncertainty.
Probability theory is the mathematical description of uncertainty that is presently most widely used; it is the paradigm of choice for mainstream finance theory. But it is by no means the only one. Competing mathematical paradigms for uncertainty include, for example, fuzzy measures. Though probability as a mathematical axiomatic theory is well known, its interpretation is still the object of debate. There are three basic interpretations of probability:
- probability as intensity of belief (J.M. Keynes, 1921);
- probability as relative frequency (R. von Mises, 1928);
- probability as an axiomatic system (A. Kolmogorov, 1933).
Developed primarily by the Russian mathematician Kolmogorov, the axiomatic theory of probability eliminated the logical ambiguities that plagued probabilistic reasoning prior to his work. Application of the axiomatic theory is, however, a matter of interpretation. In finance theory, probability might have two different meanings: 1) as a descriptive concept and 2) as a determinant of the agent decision-making process.
As a descriptive concept, probability is used in the sense of relative frequency, similar to its use in the physical sciences. In this sense, the probability of an event is assumed to be approximately equal to the relative frequency of its occurrence in a large number of experiments. This interpretation is unsatisfactory in finance theory for a number of reasons:
- The approximate equality between relative frequency and theoretical probability cannot be made precise.
- More fundamentally, in a truly probabilistic environment there can be no definite link between probability and observation. Unless we rule out the possibility of low-probability events, any observation is compatible with any statement of probability.
- An additional complication comes from the fact that financial time series have only one realization. Every estimate is made on a single time-evolving series. If stationarity (or a well-defined time process) is not assumed, it is not possible to make statistical estimates.
If by probability we refer to the agent decision-making process, there are problems here too. It is assumed that agents are able to associate probability numbers with future events and that decisions are made on the basis of these evaluations. Different assumptions can be made. The strictest assumption is that all agents share both the same probabilistic evaluations and the same model assumptions.
Billingsley (1986) and Chow and Teicher (1988) offer excellent presentations of probability theory.
2.1 Outcomes and events
The axiomatic theory of probability is based on three fundamental concepts: 1) outcomes, 2) events and 3) measure. The outcomes are all the possible results of an experiment or an observation. The set of all possible outcomes is often written as the set $\Omega$. For instance, in a game with two dice, a possible outcome is a pair of numbers, one for each die, such as 6 + 6 or 3 + 2; the space $\Omega$ is the set of all 36 possible outcomes.
Events are sets of outcomes. Continuing with the example of the dice game, a possible event is the set of all outcomes such that the sum of the numbers is 10. Probabilities are defined on events, not on outcomes. To render definitions consistent, events must be a class $\mathcal{F}$ of subsets of $\Omega$ with the following properties:
1. $\mathcal{F}$ is not empty;
2. If $A \in \mathcal{F}$ then $A^c \in \mathcal{F}$ ($A^c$ is the complement of $A$, made of all those elements of $\Omega$ that do not belong to $A$);
3. If $A_i \in \mathcal{F}$ for $i = 1, 2, \ldots$ then $\bigcup_{i=1}^{\infty} A_i \in \mathcal{F}$.
Every such class is called a $\sigma$-algebra. A class for which property 3 is required only for a finite number of sets is called an algebra. Any set that belongs to a class $\mathcal{G}$ is said to be measurable with respect to $\mathcal{G}$. Consider a class $\mathcal{G}$ of subsets of $\Omega$ and consider the smallest $\sigma$-algebra that contains $\mathcal{G}$, defined as the intersection of all the $\sigma$-algebras that contain $\mathcal{G}$. That $\sigma$-algebra is indicated as $\sigma(\mathcal{G})$ and is said to be the $\sigma$-algebra generated by $\mathcal{G}$.
A particularly important space in probability is the Euclidean space. Consider first the real axis $\mathbb{R}$, i.e., the Euclidean space $\mathbb{R}^1$ in one dimension. Consider the class formed by all intervals open to the left, such as $(a, b]$, and all unions and intersections of intervals open to the left. The $\sigma$-algebra generated by this class is represented with the letter $\mathcal{B}$ (also written $\mathcal{R}^1$); sets that belong to $\mathcal{B}$ are called Borel sets. Now consider, more generally, the n-dimensional Euclidean space $\mathbb{R}^n$, for $n \ge 1$ (the space of n-tuples of real numbers). Consider the class of all generalized rectangles open to the left and their unions and intersections. The $\sigma$-algebra generated by this class is indicated as $\mathcal{R}^n$; sets that belong to $\mathcal{R}^n$ are called n-dimensional Borel sets. The above construction is not the only possible one. The $\mathcal{R}^n$, for any value of n, are also generated by open or closed sets. As we will see, the $\mathcal{R}^n$ are fundamental to defining random variables.
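The notion of a generated $\sigma$-algebra can be made concrete on the finite dice space, where closure under complements and finite unions is enough. The following is only an illustrative sketch: the function name `generate_sigma_algebra` and the two generating events are assumptions introduced here, not part of the original text.

```python
def generate_sigma_algebra(omega, generators):
    """Smallest class of subsets of `omega` that contains `generators`
    and is closed under complement and union.
    On a finite space this is the generated sigma-algebra."""
    omega = frozenset(omega)
    family = {frozenset(), omega} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        current = list(family)
        for a in current:
            # Candidates: the complement of a, and the union of a with every known set.
            for candidate in [omega - a] + [a | b for b in current]:
                if candidate not in family:
                    family.add(candidate)
                    changed = True
    return family


# Outcomes of the two-dice game: 36 ordered pairs.
omega = [(i, j) for i in range(1, 7) for j in range(1, 7)]

# Two hypothetical generating events.
sum_is_10 = {w for w in omega if w[0] + w[1] == 10}
doubles = {w for w in omega if w[0] == w[1]}

sigma = generate_sigma_algebra(omega, [sum_is_10, doubles])
print(len(omega), len(sigma))  # 36 outcomes, 16 events
```

The two generators split the 36 outcomes into four non-empty atoms, so the generated class contains $2^4 = 16$ events. An observer who knows only whether these two events occurred can distinguish nothing finer.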
The $\sigma$-algebras $\mathcal{R}^n$ define a class of subsets of Euclidean spaces on which it is reasonable to impose a probability structure: the class of all subsets would be too big, while the class of, say, generalized rectangles would be too small. The $\mathcal{R}^n$ are an adequately rich class.
2.2 Probability
Intuitively, probability is a set function that associates with every event a number between zero and one. Probability is formally defined by a triple $(\Omega, \mathcal{F}, P)$, called a probability space, where $\Omega$ is the set of all possible outcomes, $\mathcal{F}$ is the event $\sigma$-algebra and $P$ is a probability measure defined as follows. A probability measure $P$ is a set function from $\mathcal{F}$ to $\mathbb{R}$ (the set of real numbers) that satisfies three conditions:
1. $0 \le P(A) \le 1$ for every $A \in \mathcal{F}$;
2. $P(\emptyset) = 0$ and $P(\Omega) = 1$;
3. $P(\bigcup_i A_i) = \sum_i P(A_i)$ for every finite or denumerable sequence of disjoint events $A_i$ such that $\bigcup_i A_i \in \mathcal{F}$.
$\mathcal{F}$ does not have to be a $\sigma$-algebra: the definition of a probability space can be limited to algebras of events. It is, however, possible to demonstrate that a probability defined over an algebra of events $\mathcal{H}$ can be extended in a unique way to the $\sigma$-algebra generated by $\mathcal{H}$.
Two events $A$ and $B$ are said to be independent if:
$$P(A \cap B) = P(A)\,P(B). \tag{1}$$
The probability of event $A$ given event $B$, written as $P(A/B)$, is defined for $P(B) > 0$ as follows:
$$P(A/B) = \frac{P(A \cap B)}{P(B)}. \tag{2}$$
It is immediate to deduce from simple properties of set theory and from the disjoint additivity of probability that:
$$P(A \cup B) = P(A) + P(B) - P(A \cap B) \le P(A) + P(B), \tag{3}$$
$$P(A) = 1 - P(A^c), \tag{4}$$
where $A^c$ denotes the complement of $A$.
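Relations (1) to (4) can be checked by direct counting on the two-dice space, assuming fair dice so that every outcome has probability 1/36. This is a minimal sketch; the event names and the helper `prob` are illustrative assumptions.

```python
from fractions import Fraction

# Two fair dice: 36 equally likely outcomes (an assumption of this example).
omega = [(i, j) for i in range(1, 7) for j in range(1, 7)]


def prob(event):
    """Probability of an event (a set of outcomes) under the uniform measure."""
    return Fraction(len(event), len(omega))


first_is_6 = {w for w in omega if w[0] == 6}
second_is_6 = {w for w in omega if w[1] == 6}
sum_is_10 = {w for w in omega if w[0] + w[1] == 10}

# (1) Independence: P(A and B) = P(A) P(B).
independent = prob(first_is_6 & second_is_6) == prob(first_is_6) * prob(second_is_6)

# (2) Conditional probability: P(sum is 10 / first die is 6).
conditional = prob(sum_is_10 & first_is_6) / prob(first_is_6)

# (3) Probability of a union, and the bound P(A or B) <= P(A) + P(B).
union = prob(first_is_6 | second_is_6)

# (4) Probability of the complement.
complement = prob(set(omega) - sum_is_10)

print(independent)   # True
print(conditional)   # 1/6
print(union)         # 11/36
print(complement)    # 11/12
```

Exact rational arithmetic (`Fraction`) is used so that the equalities hold exactly rather than up to floating-point rounding.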
Discrete probabilities are a special instance of probabilities. Defined over a finite or denumerable set of outcomes, discrete probabilities can be non-zero on individual outcomes. The probability of an event is the sum of the probabilities of its outcomes. In the finite case, discrete probabilities are the usual combinatorial probabilities.
2.3 Measure
A measure is a set function defined over an algebra of sets, denumerably additive, and such that it takes value 0 on the empty set but can otherwise assume any non-negative value including, conventionally, an infinite value. A probability is thus a measure of total mass 1, i.e., it takes value 1 on the set $\Omega$. A measure can be formally defined as a function $M(A)$ from an algebra $\mathcal{F}$ to $\mathbb{R}$ (the set of real numbers, extended to include $+\infty$) that satisfies the following three properties:
1. $0 \le M(A)$ for every $A \in \mathcal{F}$;
2. $M(\emptyset) = 0$;
3. $M(\bigcup_i A_i) = \sum_i M(A_i)$ for every finite or denumerable sequence $A_i$ of disjoint events such that $\bigcup_i A_i \in \mathcal{F}$.
If $M$ is a measure defined over a $\sigma$-algebra $\mathcal{F}$, the triple $(\Omega, \mathcal{F}, M)$ is called a measure space (this term is not used if $\mathcal{F}$ is only an algebra). The pair $(\Omega, \mathcal{F})$ is a measurable space if $\mathcal{F}$ is a $\sigma$-algebra. Measures in general, and not only probabilities, can be uniquely extended from an algebra to the generated $\sigma$-algebra.
2.4 Integrals
The notion of measure allows one to define a concept of integral that generalizes the usual concept of the Riemann integral. For each measure $M$, the integral is a number that is associated with every integrable function $f$. It is defined in two steps. First, suppose that $f$ is non-negative and consider a finite decomposition of the space $\Omega$, that is to say, a finite class of disjoint subsets $A_i$ of $\Omega$ whose union is $\Omega$:
$$A_i \in \mathcal{F}; \quad A_i \cap A_j = \emptyset \ \text{for } i \ne j; \quad \bigcup_i A_i = \Omega.$$
Then consider the sum:
$$\sum_i \inf\{f(\omega) : \omega \in A_i\}\, M(A_i).$$
The integral $\int_\Omega f\,dM$ is defined as the supremum, if it exists, of all these sums over all possible decompositions of $\Omega$. Second, given a generic function $f$, not necessarily non-negative, consider its decomposition $f = f^+ - f^-$ into its positive part $f^+ = \max(f, 0)$ and its negative part with the sign changed, $f^- = \max(-f, 0)$. The integral of $f$ is defined as the difference, if it exists, between the integrals of $f^+$ and $f^-$. The integral can be defined not only on $\Omega$ but on any measurable set $A$. Given an algebra $\mathcal{F}$, suppose that $G$ and $M$ are two measures and suppose that a function $f$ exists such that $G(A) = \int_A f\,dM$ for $A \in \mathcal{F}$. In this case $G$ is said to have density $f$ with respect to $M$.
2.5 Measures and integrals over Euclidean spaces
A number of integrals of interest for probability theory are defined over the real axis $\mathbb{R}$ and over the n-dimensional spaces $\mathbb{R}^n$ (real numbers and n-tuples of real numbers respectively). The definition of these integrals requires the definition of the various measures with respect to which they are defined. Without formally defining each measure, let's recall that the following integrals are defined over Euclidean spaces:
- the classical Riemann integral, defined with respect to the length of intervals or areas of rectangles;
- the Lebesgue integral, defined with respect to the measure space $(\mathbb{R}^n, \mathcal{R}^n, \lambda_n)$, where $\lambda_n$ is the Lebesgue measure, a measure that generalizes the concept of area;
- the Stieltjes integral, defined with respect to measures that are in turn defined over finite rectangles.
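The lower sums used to define the integral can be computed explicitly on the real line when the measure of an interval is its length. The sketch below is an illustration under stated assumptions: the test function $f(x) = x^2$ on $[0, 1]$, the equal-length decompositions and the function name `lower_sum` are choices made here, not taken from the original text.

```python
from fractions import Fraction


def f(x):
    """Illustrative non-negative integrand, increasing on [0, 1]."""
    return x * x


def lower_sum(n):
    """Sum of inf(f on A_i) * length(A_i) over a decomposition of [0, 1]
    into n intervals A_i of equal length 1/n."""
    length = Fraction(1, n)
    total = Fraction(0)
    for i in range(n):
        left = i * length
        # f is increasing on [0, 1], so its infimum on each interval
        # is its value at the left end point.
        total += f(left) * length
    return total


sums = [lower_sum(n) for n in (1, 10, 100, 1000)]
for n, s in zip((1, 10, 100, 1000), sums):
    print(n, float(s))
```

Refining the decomposition increases the lower sums, which approach the value 1/3 from below. The integral is the supremum of such sums over all decompositions, not only the equal-length ones used in this sketch.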
Random variables
Probability is a set function defined over a space of events; random variables transfer probability from the original space $\Omega$ into the space of real numbers. Given a probability space $(\Omega, \mathcal{F}, P)$, a random variable $X$ is a function $X(\omega)$ defined over the set $\Omega$ that takes values in the set $\mathbb{R}$ of real numbers and is subject to the condition: the set $\{\omega : X(\omega) \le x\}$ belongs to the $\sigma$-algebra $\mathcal{F}$ for every real number $x$. In other words, the inverse image of any interval $(-\infty, x]$ is an event. It can be easily demonstrated that the inverse image of any union and intersection of intervals is also an event. A real-valued function defined over $\Omega$ is called measurable with respect to a $\sigma$-algebra $\mathcal{F}$ if the inverse image of any Borel set belongs to $\mathcal{F}$. A random variable which is measurable with respect to a $\sigma$-algebra cannot discriminate between events that are not in that $\sigma$-algebra. A random variable $X$ is said to generate $\mathcal{G}$ if $\mathcal{G}$ is the smallest $\sigma$-algebra with respect to which it is measurable. Given a probability space $(\Omega, \mathcal{F}, P)$ and a random variable $X$, the expected value of $X$ is its integral with respect to the measure $P$:
$$E[X] = \int_\Omega X\,dP,$$
where integration is extended to the entire space $\Omega$.
3.1 Distributions and distribution functions
Given a probability space $(\Omega, \mathcal{F}, P)$ and a random variable $X$, consider a set $A$ of real numbers that belongs to $\mathcal{R}^1$, i.e., $A$ is a Borel set on the real line. Recall that a random variable is a real-valued measurable function defined over the set of outcomes. Therefore, the inverse image of $A$, $X^{-1}(A)$, belongs to $\mathcal{F}$ and has a well-defined probability $P(X^{-1}(A))$. The measure $P$ thus induces another measure on the real axis, called the distribution or distribution law of the random variable $X$, given by: $p(A) = P(X^{-1}(A))$. It is easy to see that this measure is a probability measure. A random variable therefore transfers to the set of real numbers the probability originally defined over the space $\Omega$.
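The transfer of probability from $\Omega$ to the real line can be sketched for the sum of two dice. The example assumes fair dice; the names `X`, `distribution` and `expected_value` are illustrative choices, with `distribution` implementing $p(A) = P(X^{-1}(A))$ by enumerating the inverse image.

```python
from fractions import Fraction

# Probability space: two fair dice, each outcome with probability 1/36 (assumed).
omega = [(i, j) for i in range(1, 7) for j in range(1, 7)]
P = {w: Fraction(1, 36) for w in omega}


def X(w):
    """Random variable: the sum of the two dice."""
    return w[0] + w[1]


def distribution(A):
    """p(A) = P(X^{-1}(A)) for a set A of real numbers."""
    return sum((P[w] for w in omega if X(w) in A), Fraction(0))


# Expected value: the integral of X with respect to P, here a finite sum.
expected_value = sum(X(w) * P[w] for w in omega)

print(distribution({10}))               # 1/12
print(distribution(set(range(2, 13))))  # 1
print(expected_value)                   # 7
```

The induced measure has total mass 1 on the real line, which is the sense in which it is itself a probability measure.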
The function $F$ defined by:
$$F(x) = p((-\infty, x]) = P(X \le x)$$
is the distribution function of the random variable $X$. Suppose that there is a function $f$ such that $p(A) = \int_A f\,dx$ for every set $A$ that belongs to $\mathcal{R}^1$, where the integral is taken with respect to the Lebesgue measure. The function $f$ is called a probability density function and the distribution $p$ is said to have density $f$. For every interval $(a, b]$, the property
$$F(b) - F(a) = \int_a^b f\,dx$$
holds, where integration is extended to the interval $(a, b]$.
3.2 Random vectors
The next step is to consider not only one but a set of random variables, referred to as a random vector. Random vectors are formed by n-tuples of random variables. Consider a probability space $(\Omega, \mathcal{F}, P)$. A random variable is a measurable function from $\Omega$ to $\mathbb{R}^1$; a random vector is a measurable function from $\Omega$ to $\mathbb{R}^n$. We can therefore write a random vector as a function: $f(\omega) = (f_1(\omega), f_2(\omega), \ldots, f_n(\omega))$. Measurability is defined with respect to the Borel $\sigma$-algebras $\mathcal{R}^n$, with $n = 1$ for random variables. It can be demonstrated that the function $f$ is measurable with respect to $\mathcal{F}$ if and only if each component function $f_i$ is measurable with respect to $\mathcal{F}$.
Conceptually, the key issue is to define joint probabilities, i.e., the probabilities that the n variables are in a given set. For example, consider the joint probability that the inflation rate is in a given interval and the growth rate in another given interval. Consider the Borel $\sigma$-algebra $\mathcal{R}^n$ on the real n-dimensional space $\mathbb{R}^n$. It can be easily demonstrated that a random vector formed by n random variables $X_i$, $i = 1, 2, \ldots, n$, induces a probability distribution over $(\mathbb{R}^n, \mathcal{R}^n)$. In fact, the set $\{\omega : (X_1(\omega), X_2(\omega), \ldots, X_n(\omega)) \in H\}$, with $H \in \mathcal{R}^n$, belongs to $\mathcal{F}$, i.e., the inverse image of every set of the $\sigma$-algebra $\mathcal{R}^n$ belongs to the $\sigma$-algebra $\mathcal{F}$. It is therefore immediate to induce over every set $H$ that belongs to $\mathcal{R}^n$ a probability measure, the joint probability of the n random variables $X_i$. In general, however, knowledge of the distributions and of the distribution functions of each random variable is not sufficient to determine the joint probability distribution function.
Two random variables $X, Y$ are said to be independent if $P(X \in A, Y \in B) = P(X \in A)\,P(Y \in B)$ for all $A$ and $B$ that belong to $\mathcal{R}^1$. This definition generalizes in obvious ways to any number of variables and therefore to the components of a random vector. It is easy to show that, if the component variables of a random vector are independent, the joint probability distribution is the product of the individual distributions.
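Both points, that independence makes the joint distribution a product and that marginal distributions alone do not determine the joint distribution, can be illustrated on the two-dice space. This is a sketch under the assumption of fair dice. The helper names are hypothetical, and for discrete variables the product condition is checked on single values only, which is sufficient in this finite setting.

```python
from fractions import Fraction
from itertools import product

omega = list(product(range(1, 7), repeat=2))  # two dice
P = Fraction(1, len(omega))                   # uniform probability of each outcome


def first(w):
    return w[0]


def second(w):
    return w[1]


def marginal(x):
    """Distribution of a single random variable as {value: probability}."""
    dist = {}
    for w in omega:
        dist[x(w)] = dist.get(x(w), Fraction(0)) + P
    return dist


def joint(x, y):
    """Joint distribution of the vector (x, y) as {(a, b): probability}."""
    dist = {}
    for w in omega:
        key = (x(w), y(w))
        dist[key] = dist.get(key, Fraction(0)) + P
    return dist


def is_product(x, y):
    """True if the joint distribution equals the product of the marginals."""
    j, mx, my = joint(x, y), marginal(x), marginal(y)
    return all(j.get((a, b), Fraction(0)) == mx[a] * my[b] for a in mx for b in my)


print(is_product(first, second))            # True: the two dice are independent
print(is_product(first, first))             # False: a die is not independent of itself
print(marginal(first) == marginal(second))  # True: identical marginals in both cases
```

The vectors (first, second) and (first, first) have the same marginal distributions but different joint distributions, so the marginals alone cannot determine the joint law.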
Stochastic processes
Given a probability space $(\Omega, \mathcal{F}, P)$, a stochastic process is a set of random variables that are measurable with respect to $\mathcal{F}$, indexed with an index $t \in [0, T]$ interpreted as time. A stochastic process is therefore an indexed random variable $X_t(\omega)$. When it is necessary to emphasize the dependence of the random variable on both time and the element $\omega$, a stochastic process is explicitly written as a function of two variables: $X = X(t, \omega)$. Given $\omega$, the function $X(t, \omega)$ is a function of time that is called a path of the stochastic process. The variable $X$ might be a single random variable or a multidimensional random vector. A stochastic process is therefore a function $X(t, \omega)$ from the product space $[0, T] \times \Omega$ into the n-dimensional real space $\mathbb{R}^n$. Because to each $\omega$ corresponds a time path of the process, in general formed by a set of functions $X_i(t, \omega)$, it is possible to identify the space $\Omega$ with a subset of the real functions defined over an interval $[0, T]$.
Let's now discuss how to represent a stochastic process $X(t, \omega)$ and the conditions of identity of two stochastic processes. As a stochastic process is a function of two variables, one can define equality as pointwise identity for each pair $(t, \omega)$. However, as processes are defined over probability spaces, pointwise identity is seldom used; it is more fruitful to define equality modulo sets of measure zero or equality with respect to probability distributions. In general, two random variables $X, Y$ will be considered equal if the equality $X(\omega) = Y(\omega)$ holds for every $\omega$ with the exception of a set of probability zero. In this case, it is said that the equality holds almost always (a.a.; the term almost surely is also common).
A rather general (but not complete) representation is given by the finite-dimensional probability distributions. Given any set of indices $(t_1, \ldots, t_m)$, consider the distributions $\mu_{t_1, \ldots, t_m}(H) = P((X_{t_1}, \ldots, X_{t_m}) \in H)$, where $H \in \mathcal{R}^m$.
These probability measures are, for any choice of the $t_i$, the finite-dimensional joint probabilities of the process. They determine many, but not all, properties of a stochastic process. For example, the finite-dimensional distributions of a Brownian motion do not determine whether the process paths are continuous or not. In general, one can define the three concepts of equality between stochastic processes described below:
1. Two stochastic processes are equal if they have the same finite-dimensional distributions. This is the weakest concept of equality.
2. The process $X(t, \omega)$ is said to be a modification of the process $Y(t, \omega)$ if the following equation holds:
$$X(t, \omega) = Y(t, \omega) \ \text{a.a.}, \quad \forall t. \tag{5}$$
In other words, according to this definition, two stochastic processes are equal if, given any value of $t$, the random variables $X(t, \cdot)$ and $Y(t, \cdot)$ are equal except over a set of probability zero. For each $t$, the set of measure zero over which the two processes differ might be different.
3. Two processes are said to be indistinguishable if the following relationship holds:
$$X(t, \omega) = Y(t, \omega) \ \forall t, \quad \text{a.a.} \tag{6}$$
That is to say, two processes are indistinguishable if their paths coincide except possibly over a set of measure zero.
It is quite obvious that property 3 implies property 2, which implies, in turn, property 1. The implications do not hold in the opposite direction. Two processes having the same finite-dimensional distributions might have completely different paths. However, if one assumes that paths are continuous functions of time, properties 2 and 3 become equivalent.
4.1 Assets, prices, dividends and economic states
We are now in a position to summarize the probabilistic representation of financial markets. From a financial point of view, an asset is a contract which gives the right to receive a stream of future payments, generically indicated as dividends. In the case of a stock, the stream of payments will include the stock dividends and the proceeds of any final liquidation of the firm. A bond is a contract that gives the right to receive coupons and the repayment of the principal. We will suppose that all payments are made at the trading dates and that no transactions take place between trading dates.
Let's assume that all securities are traded (i.e., exchanged on the market) at either discrete fixed dates, variable dates or continuously. At each trading date there is a market price for each security. Each security is therefore modeled with two time series, a series of market prices and a series of dividends. As both series are subject to uncertainty, dividends and prices are time-dependent random variables, i.e., they are stochastic processes. The time dependence of random variables in this probabilistic setting is a delicate question and will be examined shortly.
Following Kenneth Arrow and using a framework now standard, the economy and the financial markets in a situation of uncertainty are described with the following basic concepts:
- It is assumed that the economy might be in one of the states $\omega$ of a probability space $(\Omega, \mathcal{F}, P)$. Therefore, the economy is represented by a probability space $(\Omega, \mathcal{F}, P)$.
- Every security is described by two time-dependent random variables $S_t$ and $d_t$ that represent the prices and dividends of the same security. Therefore, every security is represented by two stochastic processes $S_t$ and $d_t$.
This representation is completely general and is not linked to the assumption that the space of states is finite.
4.2 Information structures
Let's now turn our attention to the question of time. The previous paragraphs considered a space formed by states in an abstract sense. We must now introduce an appropriate representation of time as well as rules that describe the evolution of information, i.e., information propagation, over time. The concepts of information and information propagation are fundamental in economics and finance theory. Information, in this context, is a concept different both from the intuitive notion of information and from that of information theory, in which information is a quantitative measure related to the a priori probability of messages. In economics, information means the (progressive) revelation of the set of events to which the current state of the economy belongs.
The concept of information in finance is a bit technical, but sheds light on the probabilistic structure of finance theory. (Readers not interested in the formal development of finance theory might skip this section.) The point is the following. Securities are represented by stochastic processes, i.e., time-dependent random variables. But the probabilistic states on which these random variables are defined represent entire histories of the economy. Embedding time into the probabilistic structure of states in a coherent way calls for information structures and filtrations.
Recall that it is assumed that the economy is in one of many possible states and that there is uncertainty about which state has been realized. Consider a time period of the economy. At the beginning of the period, there is complete uncertainty about the state of the economy, i.e., there is complete uncertainty about what path the economy will take. Different events have different probabilities, but there is no certainty. As time passes, uncertainty is reduced as the number of states to which the economy can belong is progressively reduced.
Revelation of information means the progressive reduction of the number of possible states; at the end of the period, the realized state is fully revealed. This progressive reduction of the set of possible states is formally expressed in the concepts of information structure and filtration.
Let's start with information structures. Information structures apply only to discrete probabilities defined over a discrete set of states. At the initial instant $T_0$, there is complete uncertainty about the state of the economy; the actual state is known only to belong to the largest possible event, i.e., the entire space $\Omega$. At the following instant $T_1$, the states are separated into a partition, a partition being a denumerable class of disjoint sets whose union is the space itself. The actual state belongs to one of the sets of the partition. The revelation of information consists in ruling out all sets but one. In the discrete case, and only in the discrete case, partitions are determined by the values of all the random variables at time $T_1$. For all the states of each set of the partition, and only for these, the random variables assume the same values.
Suppose, to exemplify, that only two securities exist in the economy and that each can assume only two possible prices and pay only two possible dividends. At every moment there are sixteen possible price-dividend combinations. We can thus see that at the moment $T_1$ all the states are partitioned into sixteen sets. Each set of the partition includes all the states that have a given set of prices and dividends at the moment $T_1$; in a one-period model, each such set contains only one state. The same reasoning can be applied to each instant. The evolution of information can thus be represented by a tree structure in which every path represents a state and every node a set of the partition at the corresponding date. Obviously the tree structure does not have to develop in a symmetrical way as in the above example. The tree might have a very generic structure of branches.
4.3 Filtration
The concept of information structure based on partitions supplies a rather intuitive representation of the propagation of information through a tree of progressively finer partitions. However, this structure is not sufficient to describe the propagation of information in a general probabilistic context. In fact, the set of possible events is much richer than the set of partitions. It is therefore necessary to identify not only partitions, but also a structure of events. The structure of events used to define the propagation of information is called a filtration. In the discrete case, however, the two concepts, information structure and filtration, are equivalent.

The concept of filtration is based on identifying all events that are known at any given instant. It is assumed that it is possible to associate to each trading moment $t$ a $\sigma$-algebra of events $\mathcal{F}_t$ contained in $\mathcal{F}$ and formed by all events that are known prior to or at time $t$. It is assumed that events are never forgotten, i.e., that $\mathcal{F}_t \subseteq \mathcal{F}_s$ if $t < s$. In this way, an ordering of time is created. This ordering is formed by an increasing sequence of $\sigma$-algebras, each associated to the time at which all its events are known. This sequence is called a filtration. Indicated as $\{\mathcal{F}_t\}$, a filtration is therefore the increasing sequence of all $\sigma$-algebras $\mathcal{F}_t$, each associated to the respective instant $t$.

In the finite case, it is possible to create a mutual correspondence between filtrations and information structures. In fact, given an information structure, it is possible to associate to each partition the algebra generated by that partition. Observe that a tree information structure is formed by partitions that create increasing refinement, that is to say that, by going from one instant to the next, every set of the partition is decomposed. One can then conclude that the algebras generated by an information structure form a filtration. On the other hand, given a filtration $\{\mathcal{F}_t\}$, it is possible to associate a partition to each $\mathcal{F}_t$. In fact, given any element $\omega$, consider every other element $\omega'$ such that, for each set of $\mathcal{F}_t$, either both belong to it or both lie outside it. It is easy to see that classes of equivalence are thus formed, that these create a partition, and that the algebra generated by each such partition is exactly the $\mathcal{F}_t$ that has generated the partition.

A stochastic process is said to be adapted to the filtration $\{\mathcal{F}_t\}$ if, for each $t$, the variable $X_t$ is measurable with respect to the $\sigma$-algebra $\mathcal{F}_t$. It is assumed that the price and dividend processes $S_t$ and $d_t$ of every security are adapted to $\{\mathcal{F}_t\}$. This means that, for each $t$, no measurement of any price or dividend variable can identify events not included in the respective algebra. Every random variable is a partial image of the set of states seen from a given point of view and at a given moment.

The concepts of filtration and of processes adapted to a filtration are fundamental. They ensure that information is revealed without anticipation. Consider the economy and associate at every instant a partition and an algebra generated by the partition. Every random variable defined at that moment assumes a value constant on each set of the partition. The knowledge of the realized values of the random variables does not allow identifying sets of events finer than partitions. One might well ask: Why introduce the complex structure of $\sigma$-algebras as opposed to simply defining random variables?
The point is that, from a logical point of view, the primitive concept is that of states and events. The evolution of time has to be defined on the primitive structure; it cannot simply be imposed on random variables. In practice, filtrations become an important concept when dealing with conditional probabilities in a continuous environment. As the probability that a continuous random variable assumes a specific value is zero, the definition of conditional probabilities requires the machinery of filtration.

4.4 Conditional probability and conditional expectation
Conditional probabilities and conditional averages are fundamental in the stochastic description of financial markets. For instance, one is often interested in the probability distribution of the price of a security at some date given its price at an earlier date. The widely used regression models are an example of conditional expectation models.
The conditional probability of event $A$ given event $B$ was defined in the above paragraphs on probability as

$$P(A/B) = \frac{P(A \cap B)}{P(B)}.$$

This simple definition cannot be used in the context of continuous random variables because the conditioning event (i.e., one variable assuming a given value) has probability zero. To avoid this problem, one conditions on $\sigma$-algebras and not on single zero-probability events. In general, as each instant is characterized by a $\sigma$-algebra $\mathcal{F}_t$, the conditioning elements are the $\mathcal{F}_t$.

The general definition of conditional expectation is the following. Consider a probability space $(\Omega, \mathcal{F}, P)$ and a $\sigma$-algebra $\mathcal{G}$ contained in $\mathcal{F}$, and suppose that $X$ is an integrable random variable on $(\Omega, \mathcal{F}, P)$. We define the conditional expectation of $X$ with respect to $\mathcal{G}$, written as $E[X/\mathcal{G}]$, as a random variable measurable with respect to $\mathcal{G}$ such that

$$\int_G E[X/\mathcal{G}]\,dP = \int_G X\,dP$$

for every set $G \in \mathcal{G}$. In other words, the conditional expectation is a random variable whose average on every event that belongs to $\mathcal{G}$ is equal to the average of $X$ over those same events, but it is $\mathcal{G}$-measurable whereas $X$ need not be. It is possible to demonstrate that such variables exist and are unique up to a set of measure zero.

Econometric models usually condition a random variable given another variable. In the previous framework, conditioning one random variable $X$ with respect to another random variable $Y$ means conditioning $X$ given $\sigma(Y)$, i.e., given the $\sigma$-algebra generated by $Y$. Thus $E[X/Y]$ means $E[X/\sigma(Y)]$.

One can define conditional probabilities starting from the concept of conditional expectations. Consider a probability space $(\Omega, \mathcal{F}, P)$, a sub-$\sigma$-algebra $\mathcal{G}$ of $\mathcal{F}$ and two sets $A, B \in \mathcal{F}$. If $I_A$, $I_B$ are the indicator functions of the sets $A$, $B$ (the indicator function of a set assumes value 1 on the set, 0 elsewhere), we can define conditional probabilities of the event $A$, respectively, given $\mathcal{G}$ or given the event $B$ as:

$$P(A/\mathcal{G}) = E[I_A/\mathcal{G}], \qquad P(A/B) = E[I_A/I_B] \tag{7}$$
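As a consistency check on (7), one can verify that the second definition reduces to the classical ratio on the event $B$. The short derivation below is a sketch that assumes $0 < P(B) < 1$, so that both $B$ and its complement $B^c$ have positive probability; the constants $a$ and $b$ are introduced here for illustration only.

```latex
% Assumption: 0 < P(B) < 1, so that B and its complement B^c both have positive probability.
\begin{align*}
\sigma(I_B) &= \{\emptyset,\; B,\; B^c,\; \Omega\}, \\
E[I_A/I_B] &= a\, I_B + b\, I_{B^c}
  && \text{(measurable with respect to } \sigma(I_B)\text{)}, \\
\int_B E[I_A/I_B]\, dP &= a\, P(B) = \int_B I_A\, dP = P(A \cap B)
  && \Longrightarrow\; a = \frac{P(A \cap B)}{P(B)}, \\
\int_{B^c} E[I_A/I_B]\, dP &= b\, P(B^c) = P(A \cap B^c)
  && \Longrightarrow\; b = \frac{P(A \cap B^c)}{P(B^c)}.
\end{align*}
```

In this reading, $E[I_A/I_B]$ is a random variable that takes the classical value $P(A \cap B)/P(B)$ on $B$ and the corresponding ratio for $B^c$ elsewhere.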
Using these definitions, it is possible to demonstrate that, given two random variables $X$ and $Y$ with joint density $f(x, y)$, the conditional density of $X$ given $Y$ is

$$f(x/y) = \frac{f(x, y)}{f_Y(y)} \tag{8}$$

where $f_Y$ denotes the marginal density of $Y$.

In the discrete case, the conditional expectation is a random variable (i.e., a real-valued function defined over $\Omega$) that takes a constant value over the sets of the finite partition associated to $\mathcal{F}_t$. Its value for each element of $\Omega$ is defined by the classical concept of conditional probability: it is simply the average over the set of the partition that contains that element, computed with the classical conditional probabilities.

An important econometric concept related to conditional expectations is that of a martingale. Given a probability space $(\Omega, \mathcal{F}, P)$ and a filtration $\{\mathcal{F}_i\}$, a sequence of integrable random variables $X_i$, each measurable with respect to $\mathcal{F}_i$, is called a martingale if the following condition holds:

$$E[X_{i+1}/\mathcal{F}_i] = X_i \tag{9}$$
A martingale translates the idea of a fair game, as the conditional expected value of the variable at the next period is the present value of the same variable.
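The discrete-case description of conditional expectation and the martingale condition (9) can be checked numerically on a small example. The sketch below assumes a hypothetical three-step symmetric random walk with equally likely paths; the names `cond_exp` and `is_martingale` are illustrative and not taken from the text. The conditional expectation is computed exactly as described for the discrete case, as a probability-weighted average over each set of the partition.

```python
from fractions import Fraction
from itertools import product

N_STEPS = 3
STATES = list(product((+1, -1), repeat=N_STEPS))      # full histories of up/down moves
PROB = {s: Fraction(1, len(STATES)) for s in STATES}  # assumption: equally likely paths


def X(i, state):
    """Value of the random walk at instant i: the sum of the first i moves."""
    return sum(state[:i])


def cond_exp(f, i):
    """E[f / F_i] as a dict state -> value, constant on each set of the partition at i."""
    blocks = {}
    for s in STATES:
        blocks.setdefault(s[:i], []).append(s)  # states sharing the history up to i
    result = {}
    for block in blocks.values():
        p_block = sum(PROB[s] for s in block)
        average = sum(f(s) * PROB[s] for s in block) / p_block
        for s in block:
            result[s] = average
    return result


def is_martingale(process):
    """Check E[X_{i+1} / F_i] = X_i on every state and for every instant i."""
    for i in range(N_STEPS):
        expected_next = cond_exp(lambda s, i=i: process(i + 1, s), i)
        if any(expected_next[s] != process(i, s) for s in STATES):
            return False
    return True


if __name__ == "__main__":
    print("random walk is a martingale:", is_martingale(X))
    print("walk with drift is a martingale:", is_martingale(lambda i, s: X(i, s) + i))
```

Under these assumptions the symmetric walk satisfies (9), while adding a deterministic drift of one unit per step breaks it: the game is no longer fair.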
In summary
This tutorial has reviewed the following key concepts used in the probabilistic description of financial markets:

Probability and probability spaces: Probability is a set function defined over a class of events, where events are sets of possible outcomes of an experiment. A probability space is a triple formed by a set of outcomes, a $\sigma$-algebra of events and a probability measure.

Random variables and random vectors: A random variable is a real-valued function defined over the set of outcomes such that the inverse image of any interval is an event. $n$-dimensional random vectors are functions from the set of outcomes into the $n$-dimensional Euclidean space with the property that the inverse image of any $n$-dimensional generalized rectangle is an event.

Stochastic processes: Stochastic processes are time-dependent random variables.

Information structures and filtrations: An information structure is a class of partitions of the set of states, one associated to each instant of time, that become progressively finer with the evolution of time. A filtration is an increasing class of $\sigma$-algebras, one associated to each instant of time.

The stochastic representation of financial markets: The states of the economy, intended as full histories of the economy, are represented as a probability space. The revelation of information with time is represented by information structures or filtrations. Prices and other financial quantities are represented by adapted stochastic processes.

Conditional probabilities and conditional expectations: Conditioning means the change in probabilities due to the acquisition of some information. It is possible to condition with respect to an event if the event has non-zero probability. In general terms, conditioning is conditioning with respect to a filtration or an information structure.

Martingales: A martingale is a stochastic process such that the conditional expected value of its next value is always equal to its present value. It embodies the idea of a fair game, where today's wealth is the best forecast of future wealth.
References
[1] Billingsley, Patrick, Probability and Measure, 2nd edition, Wiley and Sons, New York, NY, 1986.
[2] Chow, Yuan Shih and Henry Teicher, Probability Theory, Springer-Verlag, New York, NY, 1988.
[3] Varian, Hal, Microeconomic Theory, W.W. Norton & Company, 1992.
