
Notes On Asset Pricing

Ravi Shukla

First Draft: 1984


This version: October 1997

Contents
1 Notation
2 Definitions
3 Mean-Variance Frontier
4 An Alternative Derivation Of The Frontier
5 Two Fund Separation
6 Properties Of The Funds
7 Asset Pricing Equations
8 Analysis In Presence Of A Risk-free Asset
9 Yet Another Proof Of CAPM
10 Problems With Mean-Variance Analysis
11 The Arbitrage Pricing Theory
12 A Critical Look At APT
13 Pricing Error
14 Estimation and Identification
15 Arbitrage Portfolio Selection

These notes are a compilation of some basic ideas in asset pricing. The motivation
behind these notes is to have the rigorous derivations and related concepts available
in a consistent notation in one place. Many of the concepts in mean-variance analysis
can be found in Merton (JFQA, September 1972) and Roll (JFE, 1977; JF, September
1978; JFQA, December 1980).
The first two sections set up the notation and define some of the common concepts.
Sections 3 through 9 carry out the mean-variance analysis and derive various forms of
the capital asset pricing equations. Section 10 criticizes the mean-variance analysis.
Sections 11 onward are devoted to the development of the arbitrage pricing theory.

1 Notation
Let the market have $n$ risky assets. Then we define the following notation:$^1$

The identity vector: $\ell = [1 \; 1 \; \ldots \; 1]^T$.
The zero vector: $Z = [0 \; 0 \; \ldots \; 0]^T$.
The asset returns: $\tilde{R} = [\tilde{r}_1 \; \tilde{r}_2 \; \ldots \; \tilde{r}_n]^T$.
The portfolio weights: $W = [w_1 \; w_2 \; \ldots \; w_n]^T$.
The mean returns on assets: $M = [\mu_1 \; \mu_2 \; \ldots \; \mu_n]^T$.
The covariance matrix of returns:
\[
\Sigma = \begin{bmatrix}
\sigma_{11} & \sigma_{12} & \cdots & \sigma_{1n} \\
\sigma_{21} & \sigma_{22} & \cdots & \sigma_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
\sigma_{n1} & \sigma_{n2} & \cdots & \sigma_{nn}
\end{bmatrix}.
\]

$\Sigma^T = \Sigma$ and $\operatorname{rank}(\Sigma) = n$. The rank condition guarantees that there are no linear
dependencies among asset returns and that there is no risk-free asset among these $n$
assets.

2 Definitions
A risk-free portfolio is defined to have weights $W_f$ such that:

\[
\begin{aligned}
W_f^T \Sigma W_f &= 0, \\
W_f^T \ell &= 1, \\
W_f^T M &= r_f > 0.
\end{aligned}
\]

Since $\operatorname{rank}(\Sigma) = n$, $\Sigma$ is nonsingular and no risk-free portfolios exist.


1 Boldface notation is used for matrices and vectors. The superscript T denotes transpose.

Orthogonal (zero covariance) portfolios $Z$ and $\bar{Z}$ are defined to have weights
$W_Z$ and $W_{\bar{Z}}$ such that:
\[
\begin{aligned}
W_Z^T \Sigma W_{\bar{Z}} &= 0, \\
W_Z^T \ell &= 1, \\
W_{\bar{Z}}^T \ell &= 1.
\end{aligned}
\]

The no-arbitrage condition for a portfolio with weights $W$ is defined as:

\[
\begin{aligned}
W^T \Sigma W &= 0, \\
W^T \ell &= 0, \\
W^T \tilde{R} = W^T M &= 0.
\end{aligned}
\]

Since $\Sigma$ is nonsingular, only a trivial solution $W = Z_n$ exists.


The pricing objective is to arrive at a functional relationship between M and Σ,
i.e., we need a function f (·) such that:

M = f (Σ)

The investor’s objective is to find an efficient portfolio, i.e., a $W$ such that:

\[
\max_{W} \; E\left[ U(W^T \tilde{R}) \right], \tag{2.1}
\]
\[
\text{s.t.} \quad W^T \ell = 1. \tag{2.2}
\]

A quadratic utility function or joint normality of asset returns transforms the problem
to:
\[
\max_{W} \; F(W^T M, \; W^T \Sigma W), \tag{2.3}
\]
\[
\text{s.t.} \quad W^T \ell = 1, \tag{2.4}
\]

where $F$ is the function obtained after the transformation. For risk-averse investors,
$F_1 > 0$ and $F_2 < 0$ where $F_k$ denotes the partial derivative with respect to the $k$th
argument, $k = 1, 2$.
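The reduction to mean and variance under quadratic utility can be checked numerically. The sketch below is only an illustration under stated assumptions: the utility $U(r) = r - \frac{b}{2} r^2$, the value of `b`, and the two discrete return distributions are hypothetical and are not taken from these notes. The two distributions share the same mean and variance but differ in skewness, and they yield the same expected utility.

```python
import numpy as np

b = 0.4  # hypothetical curvature parameter of the quadratic utility

def utility(r):
    """Quadratic utility U(r) = r - (b/2) r^2 (an illustrative choice)."""
    return r - 0.5 * b * r**2

# Two hypothetical discrete return distributions with mean 0 and variance 1,
# one symmetric and one skewed.
outcomes_1, probs_1 = np.array([-1.0, 1.0]), np.array([0.5, 0.5])
outcomes_2, probs_2 = np.array([-0.5, 2.0]), np.array([0.8, 0.2])

def moments(x, p):
    """Mean and variance of a discrete distribution."""
    mean = p @ x
    var = p @ (x - mean) ** 2
    return mean, var

def expected_utility(x, p):
    """Expected utility computed outcome by outcome."""
    return p @ utility(x)

for x, p in [(outcomes_1, probs_1), (outcomes_2, probs_2)]:
    mean, var = moments(x, p)
    # For quadratic utility: E[U] = mean - (b/2) * (var + mean^2)
    closed_form = mean - 0.5 * b * (var + mean**2)
    print(mean, var, expected_utility(x, p), closed_form)
```

Any utility with nonzero higher-order terms would, in general, rank these two distributions differently, which is why the quadratic (or normality) assumption is needed.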

3 Mean-Variance Frontier
Following the discussion above we will assume that the normality or the quadratic
utility function assumption is satisfied, so that we can limit our analysis to mean and
variance only. We will also assume unanimity of beliefs so that the problem of
aggregation is avoided.$^2$ Realizing the properties of $F$, we can write the problem for
risk-averse investors as:
\[
\min_{W} \; \sigma^2 = W^T \Sigma W, \quad \text{s.t.} \quad W^T M = \mu, \quad W^T \ell = 1,
\]
where $\mu$ is the desired level of the portfolio expected return. The solution to this
problem will result in a parametric equation for the efficient frontier. Writing the
Lagrangian, the problem is transformed to:
\[
\min_{W} \; W^T \Sigma W - \lambda_1 (W^T M - \mu) - \lambda_2 (W^T \ell - 1).
\]

2 This is not very crucial to our results. Several authors have shown that the mean-variance analysis holds
even with divergence of beliefs (Mossin, Sharpe, Fama).
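The first order conditions give us:

\[ 2 W^T \Sigma - \lambda_1 M^T - \lambda_2 \ell^T = 0, \tag{3.1} \]
\[ W^T M = \mu, \tag{3.2} \]
\[ W^T \ell = 1. \tag{3.3} \]

From (3.1),
\[
W^T = \frac{1}{2} \left[ \lambda_1 M^T + \lambda_2 \ell^T \right] \Sigma^{-1},
\]
so that by post-multiplying,
\[
W^T M = \frac{1}{2} \lambda_1 M^T \Sigma^{-1} M + \frac{1}{2} \lambda_2 \ell^T \Sigma^{-1} M = \mu,
\]
\[
W^T \ell = \frac{1}{2} \lambda_1 M^T \Sigma^{-1} \ell + \frac{1}{2} \lambda_2 \ell^T \Sigma^{-1} \ell = 1.
\]
Define

\[ A = \ell^T \Sigma^{-1} M = M^T \Sigma^{-1} \ell, \tag{3.4} \]
\[ B = M^T \Sigma^{-1} M, \tag{3.5} \]
\[ C = \ell^T \Sigma^{-1} \ell, \tag{3.6} \]

to get
\[
\frac{1}{2} \lambda_1 B + \frac{1}{2} \lambda_2 A = \mu,
\]
\[
\frac{1}{2} \lambda_1 A + \frac{1}{2} \lambda_2 C = 1,
\]
which can be solved to give
\[
\lambda_1 = \frac{2(C\mu - A)}{D} \quad \text{and} \quad \lambda_2 = \frac{2(B - A\mu)}{D},
\]
where $D = BC - A^2 > 0$.$^3$ Substituting for $\lambda_1$ and $\lambda_2$ in the expression for $W^T$ obtained from (3.1) we get:
\[
W^T = \left[ \frac{C\mu - A}{D} M^T + \frac{B - A\mu}{D} \ell^T \right] \Sigma^{-1}. \tag{3.7}
\]
This defines the composition of the mean-variance frontier portfolio corresponding to
expected return $\mu$. Now we want to get the equation for the mean-variance frontier. For
that we have to eliminate $W$ from the equations. Post-multiplying equation (3.7) by
$\Sigma W$ we get:
\[
W^T \Sigma W = \frac{C\mu - A}{D} M^T W + \frac{B - A\mu}{D} \ell^T W,
\]
but $M^T W = W^T M = \mu$, $\ell^T W = W^T \ell = 1$ and $W^T \Sigma W = \sigma^2$, so that,
\[
\sigma^2 = \frac{C\mu - A}{D} \mu + \frac{B - A\mu}{D} = \frac{C\mu^2 - 2A\mu + B}{D},
\]
which is a parabola in the $(\mu, \sigma^2)$ plane.$^4$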
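Since
\[
\frac{d\sigma^2}{d\mu} = \frac{2(C\mu - A)}{D} = 0 \quad \Rightarrow \quad \mu = \frac{A}{C},
\]
the minimum variance portfolio in the $(\mu, \sigma^2)$ plane is located at $\left(\frac{A}{C}, \frac{1}{C}\right)$. The composition
of the minimum variance portfolio is given by $W^T = \frac{1}{C} \ell^T \Sigma^{-1}$. The efficient frontier
is the part of the parabola for which a higher variance ($\sigma^2$) is associated with a higher
expected return ($\mu$), i.e., $\frac{d\sigma^2}{d\mu} > 0$. Investors will choose portfolios that lie on this part
of the frontier. Since,
\[
\frac{d\sigma^2}{d\mu} = \frac{2(\mu C - A)}{D} = \frac{2C\left(\mu - \frac{A}{C}\right)}{D} = \frac{2(\mu - \bar{\mu})}{D \bar{\sigma}^2}
\quad \text{where} \quad \bar{\mu} = \frac{A}{C} \quad \text{and} \quad \bar{\sigma}^2 = \frac{1}{C},
\]
we get $\mu > \bar{\mu}$ for the efficient portion of the frontier.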
3 Since $\Sigma^{-1}$ is positive definite, the quadratic form $(AM - B\ell)^T \Sigma^{-1} (AM - B\ell) > 0$, which can
be simplified to:
\[ A^2 M^T \Sigma^{-1} M + B^2 \ell^T \Sigma^{-1} \ell - 2AB M^T \Sigma^{-1} \ell > 0, \]
\[ A^2 B + B^2 C - 2 A^2 B > 0, \]
\[ B (BC - A^2) > 0, \]
and since $B > 0$ (it is a quadratic form of $\Sigma^{-1}$) we get the desired result. Also note that $C > 0$ for the
same reason as $B$.
4 For the purposes of plotting the curve, the following form of the equation is more useful:
\[
\mu = \frac{A}{C} \pm \sqrt{\frac{D}{C} \sigma^2 - \frac{D}{C^2}}.
\]
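The frontier formulas of this section can be checked numerically. The following is a minimal sketch, assuming three hypothetical assets whose mean vector and covariance matrix are made up for illustration; the function names are likewise not part of the notes. It computes $A$, $B$, $C$, $D$, builds the frontier weights from (3.7), and compares the variance implied by the weights with the parabola.

```python
import numpy as np

# Hypothetical inputs (not from the notes): three risky assets.
M = np.array([0.08, 0.12, 0.15])               # mean returns
Sigma = np.array([[0.040, 0.006, 0.004],
                  [0.006, 0.090, 0.012],
                  [0.004, 0.012, 0.160]])      # covariance matrix, full rank
ones = np.ones(3)                              # the vector of ones

Sigma_inv = np.linalg.inv(Sigma)
A = ones @ Sigma_inv @ M                       # (3.4)
B = M @ Sigma_inv @ M                          # (3.5)
C = ones @ Sigma_inv @ ones                    # (3.6)
D = B * C - A**2

def frontier_weights(mu):
    """Weights of the frontier portfolio with expected return mu, eq. (3.7)."""
    return ((C * mu - A) * M + (B - A * mu) * ones) @ Sigma_inv / D

def frontier_variance(mu):
    """Variance along the frontier: (C mu^2 - 2 A mu + B) / D."""
    return (C * mu**2 - 2 * A * mu + B) / D

mu = 0.11
W = frontier_weights(mu)
W_min_var = ones @ Sigma_inv / C               # minimum variance portfolio

print("weights:", W)
print("variance from weights:", W @ Sigma @ W)
print("variance from parabola:", frontier_variance(mu))
print("minimum variance point (mu, sigma^2):", A / C, 1 / C)
```

Because $\Sigma^{-1}$ is symmetric, the row vector $W^T$ in (3.7) can be stored as an ordinary one-dimensional array.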

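Let us examine some of the properties of the frontier in the $(\mu, \sigma)$ plane. The
frontier is a hyperbola in this plane.
\[
\frac{d\sigma}{d\mu} = \frac{d\sigma}{d\sigma^2} \frac{d\sigma^2}{d\mu} = \frac{1}{2\sigma} \frac{d\sigma^2}{d\mu} = \frac{\mu - \bar{\mu}}{D \sigma \bar{\sigma}^2},
\]
which means that the efficient portion of the frontier has a positive slope, and
\[
\begin{aligned}
\frac{d^2\sigma}{d\mu^2}
&= \frac{1}{2\sigma} \frac{d^2\sigma^2}{d\mu^2} + \frac{1}{2} \left( \frac{-1}{\sigma^2} \right) \frac{d\sigma}{d\mu} \frac{d\sigma^2}{d\mu}, \\
&= \frac{1}{2\sigma} \frac{d^2\sigma^2}{d\mu^2} - \frac{1}{2\sigma^2} \left( \frac{1}{2\sigma} \frac{d\sigma^2}{d\mu} \right) \frac{d\sigma^2}{d\mu}, \\
&= \frac{1}{2\sigma} \left[ \frac{d^2\sigma^2}{d\mu^2} - \frac{1}{2\sigma^2} \left( \frac{d\sigma^2}{d\mu} \right)^2 \right], \\
&= \frac{1}{2\sigma} \left[ \frac{2}{D \bar{\sigma}^2} - \frac{1}{2\sigma^2} \left( \frac{2(\mu - \bar{\mu})}{D \bar{\sigma}^2} \right)^2 \right], \\
&= \frac{1}{2\sigma} \left[ \frac{2}{D \bar{\sigma}^2} - \frac{2}{\sigma^2} \frac{(\mu - \bar{\mu})^2}{D^2 \bar{\sigma}^4} \right], \\
&= 0 \quad \text{if} \quad \frac{1}{D \bar{\sigma}^2} = \frac{(\mu - \bar{\mu})^2}{\sigma^2 D^2 \bar{\sigma}^4}, \\
&\qquad \text{or} \quad D \sigma^2 \bar{\sigma}^2 = (\mu - \bar{\mu})^2.
\end{aligned}
\]

This condition is not met by all $\mu$ and $\sigma$. This implies that the efficient frontier in
the $(\mu, \sigma)$ plane is not a straight line. The efficient frontier is a straight line only when $\Sigma$
is not positive definite, which happens when either there is a risk-free asset or the assets
are perfectly (positively or negatively) correlated.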
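As a numerical illustration of the slope formula and of the straight-line condition, the sketch below compares a finite-difference slope of the frontier in the $(\mu, \sigma)$ plane with $(\mu - \bar{\mu}) / (D \sigma \bar{\sigma}^2)$. The asset data and the evaluation point are hypothetical. Substituting the frontier equation into the condition shows that $D \sigma^2 \bar{\sigma}^2 - (\mu - \bar{\mu})^2 = D / C^2$ on the frontier, which the code also checks; this last identity is derived here from the parabola of this section rather than quoted from the notes.

```python
import numpy as np

# Hypothetical inputs (not from the notes).
M = np.array([0.08, 0.12, 0.15])
Sigma = np.array([[0.040, 0.006, 0.004],
                  [0.006, 0.090, 0.012],
                  [0.004, 0.012, 0.160]])
ones = np.ones(3)

Sigma_inv = np.linalg.inv(Sigma)
A = ones @ Sigma_inv @ M
B = M @ Sigma_inv @ M
C = ones @ Sigma_inv @ ones
D = B * C - A**2
mu_bar, var_bar = A / C, 1 / C        # minimum variance point

def sigma_of(mu):
    """Standard deviation on the frontier at expected return mu."""
    return np.sqrt((C * mu**2 - 2 * A * mu + B) / D)

mu = 0.13                              # a point with mu > mu_bar for this data
h = 1e-6
slope_numeric = (sigma_of(mu + h) - sigma_of(mu - h)) / (2 * h)
slope_formula = (mu - mu_bar) / (D * sigma_of(mu) * var_bar)

# Left side minus right side of the straight-line condition.
gap = D * sigma_of(mu) ** 2 * var_bar - (mu - mu_bar) ** 2

print(slope_numeric, slope_formula)
print(gap, D / C**2)
```

Since $D > 0$, the gap is strictly positive at every point of the frontier, so the second derivative never vanishes when $\Sigma$ is positive definite.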

4 An Alternative Derivation Of The Frontier


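In section 3 we minimized the variance for a given level of expected return. In this
section we maximize the expected return for a given level of variance. Let us write the
problem as:
\[
\max_{W} \; \mu = W^T M, \quad \text{s.t.} \quad W^T \Sigma W = \sigma^2, \quad W^T \ell = 1.
\]

Form the Lagrangian to get:
\[
\max_{W} \; W^T M - \lambda_1 (W^T \Sigma W - \sigma^2) - \lambda_2 (W^T \ell - 1).
\]
The first order conditions:
\[ M^T - \lambda_1 (2 W^T \Sigma) - \lambda_2 \ell^T = 0, \tag{4.1} \]
\[ W^T \Sigma W = \sigma^2, \tag{4.2} \]
\[ W^T \ell = 1. \tag{4.3} \]
From (4.1),
\[
W^T = \frac{1}{2\lambda_1} \left[ M^T - \lambda_2 \ell^T \right] \Sigma^{-1},
\]
so that by post-multiplying,
\[
W^T M = \mu = \frac{1}{2\lambda_1} \left[ M^T - \lambda_2 \ell^T \right] \Sigma^{-1} M,
\]
\[
\text{and} \quad W^T \ell = 1 = \frac{1}{2\lambda_1} \left[ M^T - \lambda_2 \ell^T \right] \Sigma^{-1} \ell.
\]
Simplifying using the definitions of $A$, $B$ and $C$ from above:
\[
2 \lambda_1 \mu + \lambda_2 A = B, \quad \text{and} \quad 2 \lambda_1 + \lambda_2 C = A,
\]
\[
\Rightarrow \quad \lambda_1 = \frac{BC - A^2}{2(\mu C - A)} \quad \text{and} \quad \lambda_2 = \frac{\mu A - B}{\mu C - A}.
\]
Substituting these in the expression for $W^T$ obtained from (4.1) and simplifying we get the same form as (3.7).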
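The equivalence of the two derivations can be confirmed numerically. This is a sketch on hypothetical data: it evaluates the multipliers of this section at a chosen $\mu$, builds the weights from the first order condition (4.1), and compares them with the weights from (3.7).

```python
import numpy as np

# Hypothetical inputs (not from the notes).
M = np.array([0.08, 0.12, 0.15])
Sigma = np.array([[0.040, 0.006, 0.004],
                  [0.006, 0.090, 0.012],
                  [0.004, 0.012, 0.160]])
ones = np.ones(3)

Sigma_inv = np.linalg.inv(Sigma)
A = ones @ Sigma_inv @ M
B = M @ Sigma_inv @ M
C = ones @ Sigma_inv @ ones
D = B * C - A**2

mu = 0.13
# Multipliers of the return-maximization problem (section 4).
lam1 = (B * C - A**2) / (2 * (mu * C - A))
lam2 = (mu * A - B) / (mu * C - A)

# Weights from (4.1) and from the variance-minimization result (3.7).
W_max_return = (M - lam2 * ones) @ Sigma_inv / (2 * lam1)
W_min_variance = ((C * mu - A) * M + (B - A * mu) * ones) @ Sigma_inv / D

print(W_max_return)
print(W_min_variance)
```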

5 Two Fund Separation


In this section we want to show that all the portfolios on the mean-variance frontier can
be obtained by linear combinations of two distinct funds (portfolios) P and Q defined
by WP and WQ . The compositions of the funds P and Q should be independent of
the expected value and the variance of the efficient portfolio to be generated so that
all portfolios chosen by the investors can be replicated by combinations of these two
funds. Therefore, the investors just need to look at these two funds instead of all the
assets in the market. Let us define x as the amount invested in fund P and (1 − x) in
Q, and consider the following portfolio:
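\[
W^T = x W_P^T + (1 - x) W_Q^T = x (W_P^T - W_Q^T) + W_Q^T. \tag{5.1}
\]
From (3.7), the equation for the weights of a portfolio on the mean-variance frontier is:
\[
\begin{aligned}
W^T &= \left[ \frac{C\mu - A}{D} M^T + \frac{B - A\mu}{D} \ell^T \right] \Sigma^{-1}, \\
&= \mu \left[ \frac{C M^T - A \ell^T}{D} \right] \Sigma^{-1} + \left[ \frac{B \ell^T - A M^T}{D} \right] \Sigma^{-1}, \\
&= \mu G^T + H^T,
\end{aligned} \tag{5.2}
\]
where $G^T = \frac{C M^T - A \ell^T}{D} \Sigma^{-1}$ and $H^T = \frac{B \ell^T - A M^T}{D} \Sigma^{-1}$.
Comparing (5.2) and (5.1) we get:
\[
x (W_P^T - W_Q^T) = \mu G^T + H^T - W_Q^T.
\]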

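Assuming that $[(W_P^T - W_Q^T)(W_P^T - W_Q^T)^T]$ is nonzero,
\[
\begin{aligned}
x &= \mu G^T (W_P^T - W_Q^T)^T \left[ (W_P^T - W_Q^T)(W_P^T - W_Q^T)^T \right]^{-1} \\
&\quad + (H^T - W_Q^T)(W_P^T - W_Q^T)^T \left[ (W_P^T - W_Q^T)(W_P^T - W_Q^T)^T \right]^{-1}, \\
&= \mu \delta - \alpha, \qquad \delta \neq 0 \text{ because } W_P^T \neq W_Q^T.
\end{aligned}
\]

$\delta$ and $\alpha$ are as defined above. Substituting for $x$ in (5.1) we get:
\[
W^T = \mu \delta (W_P^T - W_Q^T) - \alpha (W_P^T - W_Q^T) + W_Q^T. \tag{5.3}
\]

Comparing (5.3) with (5.2) and realizing that $W_P$ and $W_Q$ must be independent
of $\mu$, we get:
\[
G^T = \delta (W_P^T - W_Q^T), \quad \text{and} \quad H^T = W_Q^T - \alpha (W_P^T - W_Q^T),
\]
which can be solved to give:
\[
W_P^T = \frac{1 + \alpha}{\delta} G^T + H^T = \mu_P G^T + H^T,
\quad \text{and} \quad
W_Q^T = \frac{\alpha}{\delta} G^T + H^T = \mu_Q G^T + H^T,
\]
with
\[
\begin{aligned}
\mu_P = W_P^T M &= \frac{1 + \alpha}{\delta} G^T M + H^T M, \\
&= \frac{1 + \alpha}{\delta} \, \frac{CB - A^2}{D} + \frac{BA - AB}{D}, \\
&= \frac{1 + \alpha}{\delta}.
\end{aligned} \tag{5.4}
\]
Similarly,
\[
\mu_Q = W_Q^T M = \frac{\alpha}{\delta}, \tag{5.5}
\]
from which
\[
\delta = \frac{1}{\mu_P - \mu_Q} \quad \text{and} \quad \alpha = \frac{\mu_Q}{\mu_P - \mu_Q}.
\]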

WP and WQ are not unique. For different values of α and δ different pairs of
funds may be selected. Note that funds P and Q can generate all frontier portfolios, not
just the efficient ones. Also note from the expressions for WP and WQ and comparing
them with (5.2) that funds P and Q lie on the mean-variance frontier but not necessarily
on the efficient part of it.
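Two fund separation lends itself to a direct numerical check. In the sketch below, the asset data and the expected returns chosen for the funds and the target are hypothetical. Two frontier portfolios serve as funds $P$ and $Q$; the mixing weight $x = \mu \delta - \alpha$ is computed from (5.4) and (5.5), and the mixture is compared with the frontier portfolio at the target return.

```python
import numpy as np

# Hypothetical inputs (not from the notes).
M = np.array([0.08, 0.12, 0.15])
Sigma = np.array([[0.040, 0.006, 0.004],
                  [0.006, 0.090, 0.012],
                  [0.004, 0.012, 0.160]])
ones = np.ones(3)

Sigma_inv = np.linalg.inv(Sigma)
A = ones @ Sigma_inv @ M
B = M @ Sigma_inv @ M
C = ones @ Sigma_inv @ ones
D = B * C - A**2

def frontier_weights(mu):
    """Frontier portfolio at expected return mu: W = mu * G + H, eq. (5.2)."""
    G = (C * M - A * ones) @ Sigma_inv / D
    H = (B * ones - A * M) @ Sigma_inv / D
    return mu * G + H

mu_P, mu_Q = 0.14, 0.09                 # two distinct funds on the frontier
W_P, W_Q = frontier_weights(mu_P), frontier_weights(mu_Q)

delta = 1 / (mu_P - mu_Q)
alpha = mu_Q / (mu_P - mu_Q)

mu_target = 0.20
x = mu_target * delta - alpha           # fraction of wealth placed in fund P
W_mix = x * W_P + (1 - x) * W_Q

print("x =", x)
print(W_mix)
print(frontier_weights(mu_target))
```

Here $x$ exceeds one, so the target portfolio is reached by shorting fund $Q$; nothing in the derivation restricts $x$ to the unit interval.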

6 Properties Of The Funds
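\[
\begin{aligned}
W_P^T \Sigma W_P &= \mu_P G^T \Sigma W_P + H^T \Sigma W_P, \\
&= \mu_P \frac{C M^T - A \ell^T}{D} W_P + \frac{B \ell^T - A M^T}{D} W_P, \\
&= \mu_P \frac{C M^T W_P - A \ell^T W_P}{D} + \frac{B \ell^T W_P - A M^T W_P}{D}, \\
&= \frac{1 + \alpha}{\delta} \, \frac{C \left( \frac{1 + \alpha}{\delta} \right) - A}{D} + \frac{B - A \left( \frac{1 + \alpha}{\delta} \right)}{D}, \\
&= \frac{C (1 + \alpha)^2 - A \delta (1 + \alpha)}{D \delta^2} + \frac{B \delta - A (1 + \alpha)}{D \delta},
\end{aligned}
\]
or,
\[
\sigma_P^2 = \frac{C (1 + \alpha)^2 + B \delta^2 - 2 A \delta (1 + \alpha)}{D \delta^2}. \tag{6.1}
\]

\[
\begin{aligned}
W_Q^T \Sigma W_Q &= \mu_Q G^T \Sigma W_Q + H^T \Sigma W_Q, \\
&= \frac{\alpha}{\delta} \, \frac{C \left( \frac{\alpha}{\delta} \right) - A}{D} + \frac{B - A \left( \frac{\alpha}{\delta} \right)}{D},
\end{aligned}
\]
or,
\[
\sigma_Q^2 = \frac{C \alpha^2 - 2 A \alpha \delta + B \delta^2}{D \delta^2}. \tag{6.2}
\]
Combining (6.1) and (6.2) and simplifying we get:
\[
\sigma_P^2 = \sigma_Q^2 + \frac{C + 2 (\alpha C - A \delta)}{D \delta^2}.
\]
Finally,
\[
\begin{aligned}
W_P^T \Sigma W_Q &= \mu_P G^T \Sigma W_Q + H^T \Sigma W_Q, \\
&= \mu_P \frac{C M^T - A \ell^T}{D} W_Q + \frac{B \ell^T - A M^T}{D} W_Q, \\
&= \mu_P \frac{C \mu_Q - A}{D} + \frac{B - A \mu_Q}{D}, \\
&= \frac{C \left( \frac{1 + \alpha}{\delta} \right) \left( \frac{\alpha}{\delta} \right) - A \left( \frac{1 + \alpha}{\delta} \right) + B - A \left( \frac{\alpha}{\delta} \right)}{D}, \\
&= \frac{C \alpha^2 - 2 A \alpha \delta + C \alpha + B \delta^2 - A \delta}{D \delta^2},
\end{aligned} \tag{6.3}
\]
or,
\[
\sigma_{PQ} = \sigma_Q^2 - \frac{A \delta - C \alpha}{D \delta^2}. \tag{6.4}
\]
If we restrict both the funds to be efficient portfolios and without loss of generality
let $\mu_P > \mu_Q \geq \bar{\mu} = \frac{A}{C}$ and $\sigma_P^2 > \sigma_Q^2$, then $\delta > 0$ and $\alpha \geq \frac{A \delta}{C}$. Then from (6.4),
$\sigma_{PQ} = 0$ requires:
\[
\sigma_Q^2 = \frac{A \delta - \alpha C}{D \delta^2} \leq 0,
\]
which is impossible because $\sigma_Q^2 \geq \frac{1}{C} > 0$. This means that there are no two funds that lie on the efficient
portion of the frontier and are orthogonal.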
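The covariance between two frontier portfolios depends only on their expected returns, which makes the orthogonality result easy to verify. The sketch below rests on hypothetical asset data. Setting the third line of (6.3) to zero and solving for $\mu_Q$ gives the expected return of the frontier portfolio orthogonal to $P$, namely $(A \mu_P - B) / (C \mu_P - A)$; this closed form is derived here from (6.3) and is not stated in the notes.

```python
import numpy as np

# Hypothetical inputs (not from the notes).
M = np.array([0.08, 0.12, 0.15])
Sigma = np.array([[0.040, 0.006, 0.004],
                  [0.006, 0.090, 0.012],
                  [0.004, 0.012, 0.160]])
ones = np.ones(3)

Sigma_inv = np.linalg.inv(Sigma)
A = ones @ Sigma_inv @ M
B = M @ Sigma_inv @ M
C = ones @ Sigma_inv @ ones
D = B * C - A**2

def frontier_weights(mu):
    """Frontier portfolio at expected return mu, eq. (3.7)."""
    return ((C * mu - A) * M + (B - A * mu) * ones) @ Sigma_inv / D

def frontier_cov(mu_p, mu_q):
    """Covariance of two frontier portfolios: mu_p (C mu_q - A)/D + (B - A mu_q)/D."""
    return (mu_p * (C * mu_q - A) + (B - A * mu_q)) / D

mu_P = 0.14                                   # efficient for this data (above A/C)
mu_Z = (A * mu_P - B) / (C * mu_P - A)        # return of the orthogonal frontier fund

W_P, W_Z = frontier_weights(mu_P), frontier_weights(mu_Z)
print("A/C =", A / C, " mu_Z =", mu_Z)
print("covariance from weights:", W_P @ Sigma @ W_Z)
```

For this data the orthogonal fund has an expected return below $A/C$, in line with the conclusion that two efficient funds cannot be orthogonal.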

7 Asset Pricing Equations


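Let us write the composition of the fund portfolios as:

\[
W_P^T = \mu_P \frac{C M^T - A \ell^T}{D} \Sigma^{-1} + \frac{B \ell^T - A M^T}{D} \Sigma^{-1}, \tag{7.1}
\]
\[
W_Q^T = \mu_Q \frac{C M^T - A \ell^T}{D} \Sigma^{-1} + \frac{B \ell^T - A M^T}{D} \Sigma^{-1}, \tag{7.2}
\]
where $\mu_P = \frac{1 + \alpha}{\delta}$ and $\mu_Q = \frac{\alpha}{\delta}$, and $\alpha$, $\delta$, $A$, $B$, $C$, and $D$ are as defined in sections 3 and 5.

Also from the properties of the funds:
\[
\operatorname{Var}(\tilde{r}_P) = W_P^T \Sigma W_P = \frac{\mu_P (C \mu_P - A) + (B - A \mu_P)}{D},
\]
\[
\operatorname{Cov}(\tilde{r}_P, \tilde{r}_Q) = W_P^T \Sigma W_Q = \frac{\mu_P (C \mu_Q - A) + (B - A \mu_Q)}{D}.
\]
For an arbitrary portfolio $X$ with composition $W_X$ we can write:
\[
\operatorname{Cov}(\tilde{r}_P, \tilde{r}_X) = W_P^T \Sigma W_X = \frac{\mu_P (C \mu_X - A) + (B - A \mu_X)}{D}.
\]
Combining these three equations we get:
\[
\frac{\operatorname{Cov}(\tilde{r}_P, \tilde{r}_X) - \operatorname{Cov}(\tilde{r}_P, \tilde{r}_Q)}{\operatorname{Var}(\tilde{r}_P) - \operatorname{Cov}(\tilde{r}_P, \tilde{r}_Q)} = \frac{\mu_X - \mu_Q}{\mu_P - \mu_Q},
\]
or
\[
\mu_X = \mu_Q + (\mu_P - \mu_Q) \frac{\operatorname{Cov}(\tilde{r}_P, \tilde{r}_X) - \operatorname{Cov}(\tilde{r}_P, \tilde{r}_Q)}{\operatorname{Var}(\tilde{r}_P) - \operatorname{Cov}(\tilde{r}_P, \tilde{r}_Q)}.
\]
If the two funds are orthogonal, i.e. $\operatorname{Cov}(\tilde{r}_P, \tilde{r}_Q) = 0$, we get the popular zero-beta
asset pricing model, i.e.,
\[
\mu_X = \mu_Q + (\mu_P - \mu_Q) \beta_X, \tag{7.3}
\]
where $\beta_X = \operatorname{Cov}(\tilde{r}_P, \tilde{r}_X) / \operatorname{Var}(\tilde{r}_P)$. Note that in the zero-beta equation, one of the
funds is on the inefficient portion of the frontier.
Alternatively, subtracting (7.2) from (7.1) we get:
\[
W_P^T - W_Q^T = (\mu_P - \mu_Q) \frac{C M^T - A \ell^T}{D} \Sigma^{-1},
\]
which leads to:
\[
M^T = \frac{A}{C} \ell^T + \frac{D}{C} \left[ \frac{W_P^T - W_Q^T}{\mu_P - \mu_Q} \right] \Sigma. \tag{7.4}
\]
This is the answer to our pricing objective alluded to in section 2. Post-multiplying
(7.4) by the weight vector for an arbitrary portfolio $X$, we get:
\[
M^T W_X = \frac{A}{C} \ell^T W_X + \frac{D}{C} \, \frac{W_P^T \Sigma W_X - W_Q^T \Sigma W_X}{\mu_P - \mu_Q},
\]
which leads to the following form of the asset pricing equation:
\[
\mu_X = \frac{A}{C} + \frac{D}{C} \, \frac{\operatorname{Cov}(\tilde{r}_P, \tilde{r}_X) - \operatorname{Cov}(\tilde{r}_Q, \tilde{r}_X)}{\mu_P - \mu_Q}. \tag{7.5}
\]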
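Both pricing equations can be tested on an arbitrary portfolio. The sketch below is built on hypothetical data: the assets, the fund returns, and the portfolio `W_X` are made up. It checks (7.5) with two arbitrary frontier funds, and then the zero-beta form (7.3) after replacing $Q$ with the frontier fund orthogonal to $P$, whose expected return is obtained by setting the covariance formula of this section to zero.

```python
import numpy as np

# Hypothetical inputs (not from the notes).
M = np.array([0.08, 0.12, 0.15])
Sigma = np.array([[0.040, 0.006, 0.004],
                  [0.006, 0.090, 0.012],
                  [0.004, 0.012, 0.160]])
ones = np.ones(3)

Sigma_inv = np.linalg.inv(Sigma)
A = ones @ Sigma_inv @ M
B = M @ Sigma_inv @ M
C = ones @ Sigma_inv @ ones
D = B * C - A**2

def frontier_weights(mu):
    """Frontier portfolio at expected return mu, eqs. (7.1) and (7.2)."""
    return ((C * mu - A) * M + (B - A * mu) * ones) @ Sigma_inv / D

W_X = np.array([0.5, 0.2, 0.3])          # arbitrary portfolio, weights sum to one
mu_X = W_X @ M

# Equation (7.5) with two arbitrary frontier funds.
mu_P, mu_Q = 0.14, 0.09
W_P, W_Q = frontier_weights(mu_P), frontier_weights(mu_Q)
mu_X_75 = A / C + (D / C) * (W_P @ Sigma @ W_X - W_Q @ Sigma @ W_X) / (mu_P - mu_Q)

# Equation (7.3) with the fund orthogonal to P playing the role of Q.
mu_Z = (A * mu_P - B) / (C * mu_P - A)
beta_X = (W_P @ Sigma @ W_X) / (W_P @ Sigma @ W_P)
mu_X_73 = mu_Z + (mu_P - mu_Z) * beta_X

print(mu_X, mu_X_75, mu_X_73)
```

Choosing `W_X` as a unit vector prices a single asset, since an individual asset is a special case of an arbitrary portfolio.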

8 Analysis In Presence Of A Risk-free Asset


Now we introduce a risk-free asset in our economy. In this situation the objective of
the investor can be written as:
\[
\min_{W} \; \sigma^2 = W^T \Sigma W, \quad \text{s.t.} \quad W^T M + (1 - W^T \ell) r_f = \mu,
\]
where $r_f$ is the return on the risk-free asset. Writing the Lagrangian:
\[
\min_{W} \; \left[ W^T \Sigma W + \lambda \left( \mu - W^T M - (1 - W^T \ell) r_f \right) \right].
\]
The first order conditions give us:

\[ 2 W^T \Sigma - \lambda (M^T - \ell^T r_f) = 0, \tag{8.1} \]
\[ W^T M + (1 - W^T \ell) r_f = \mu. \tag{8.2} \]

From (8.1),
\[
W^T = \frac{\lambda}{2} (M^T - \ell^T r_f) \Sigma^{-1}. \tag{8.3}
\]
Post-multiplying (8.3) by $M$, we get:
\[
W^T M = \frac{\lambda}{2} (M^T - \ell^T r_f) \Sigma^{-1} M = \frac{\lambda}{2} (B - A r_f), \tag{8.4}
\]
where $A$ and $B$ are as defined in section 3. Combining (8.2) and (8.4) and simplifying
we get:
\[
\lambda = \frac{2 (\mu - r_f + W^T \ell \, r_f)}{B - A r_f}. \tag{8.5}
\]
Post-multiplying (8.3) by $\ell$ we get:
\[
W^T \ell = \frac{\lambda}{2} (M^T - \ell^T r_f) \Sigma^{-1} \ell = \frac{\lambda}{2} (A - C r_f). \tag{8.6}
\]
Substitute for $W^T \ell$ from (8.6) in (8.5) and simplify to get:
\[
\lambda = \frac{2 (\mu - r_f)}{B - 2 A r_f + C r_f^2}. \tag{8.7}
\]
Substitute this in (8.3) to get:
\[
W^T = \frac{\mu - r_f}{B - 2 A r_f + C r_f^2} \left[ M^T - \ell^T r_f \right] \Sigma^{-1}. \tag{8.8}
\]

$W^T$ gives us the portions of wealth invested in the risky assets. The portion $(1 - W^T \ell)$
of the wealth is invested in the risk-free asset. Now,
\[
\sigma^2 = W^T \Sigma W = \frac{\mu - r_f}{B - 2 A r_f + C r_f^2} \left[ M^T - \ell^T r_f \right] W = \frac{(\mu - r_f)^2}{B - 2 A r_f + C r_f^2}.
\]

Since $\sigma^2 > 0$, $(B - 2 A r_f + C r_f^2) > 0$. Now we get the equation of the frontier as:
\[
\mu - r_f = \pm \sigma \sqrt{B - 2 A r_f + C r_f^2},
\]
and the efficient frontier, the region for which $\frac{d\mu}{d\sigma} > 0$, is given by:
\[
\mu - r_f = \sigma \sqrt{B - 2 A r_f + C r_f^2}.
\]

For $\sigma = 0$, $\mu = r_f$. $\sigma = 0$ is obtained by $\ell^T W = 0$ and $W^T M = 0$, or
$W = 0$, i.e., investing all the wealth in the risk-free asset.
Now let us do two fund separation in this economy. We assume that the two funds
are $P$ and $Q$ as before. Also define $\overline{W}^T = [W^T \mid w_f]$, where $w_f$ is the portion invested
in the risk-free asset. If $x$ is the portion invested in the fund $P$ then:
\[
\overline{W}^T = x (\overline{W}_P^T - \overline{W}_Q^T) + \overline{W}_Q^T. \tag{8.9}
\]

From the composition of the efficient portfolio, equation (8.8), we get:
\[
W^T = (\mu - r_f) U^T \quad \text{where} \quad U^T = \frac{[M^T - \ell^T r_f]}{B - 2 A r_f + C r_f^2} \Sigma^{-1}
\]
and,
\[
w_f = 1 - (\mu - r_f) U^T \ell.
\]

Following the steps in section 5, we can write:
\[
x = \delta (\mu - r_f) + 1 - \alpha, \qquad \delta \neq 0,
\]
so that equation (8.9) becomes:
\[
\overline{W}^T = \delta (\mu - r_f)(\overline{W}_P^T - \overline{W}_Q^T) + (1 - \alpha)(\overline{W}_P^T - \overline{W}_Q^T) + \overline{W}_Q^T.
\]

Using the same arguments as in section 5, we get for the risky assets:
\[
\delta (W_P^T - W_Q^T) = U^T, \quad \text{and} \quad (1 - \alpha)(W_P^T - W_Q^T) + W_Q^T = 0,
\]
which give us
\[
W_P^T = \frac{\alpha U^T}{\delta} \quad \text{and} \quad W_Q^T = \frac{(\alpha - 1) U^T}{\delta},
\]
and for the risk-free asset:
\[
w_{fP} = 1 - W_P^T \ell = 1 - \frac{\alpha (A - r_f C)}{\delta (B - 2 A r_f + C r_f^2)},
\]
\[
w_{fQ} = 1 - W_Q^T \ell = 1 - \frac{(\alpha - 1)(A - r_f C)}{\delta (B - 2 A r_f + C r_f^2)}.
\]

If we want the fund $Q$ to be the risk-free asset and fund $P$ to be made up of risky
assets only, we get from the condition for the fund $Q$: $w_{fQ} = 1$ and $W_Q = 0$, which
imply $\alpha = 1$. From the condition on fund $P$ we get $w_{fP} = 0$ which, together with $\alpha = 1$,
gives us:
\[
\delta = \frac{A - r_f C}{B - 2 A r_f + C r_f^2},
\]
since $\delta \neq 0$ we get $r_f \neq \frac{A}{C}$. Under these conditions we get:
\[
W_P^T = \frac{[M^T - \ell^T r_f]}{(A - r_f C)} \Sigma^{-1}. \tag{8.10}
\]

It can be easily verified that $\mu_Q = r_f$. For fund $P$:
\[
\mu_P = W_P^T M = \frac{[M^T - \ell^T r_f]}{(A - r_f C)} \Sigma^{-1} M = \frac{B - r_f A}{A - r_f C}, \tag{8.11}
\]
and
\[
\operatorname{Var}(\tilde{r}_P) = W_P^T \Sigma W_P = \frac{\mu_P - r_f}{A - r_f C}. \tag{8.12}
\]
For $P$ to be an efficient portfolio, $\mu_P > r_f$. Then from nonnegativity of $\operatorname{Var}(\tilde{r}_P)$
we get $r_f < \frac{A}{C}$.

Now for an arbitrary portfolio $X$ with weights $W_X$,
\[
\operatorname{Cov}(\tilde{r}_X, \tilde{r}_P) = W_P^T \Sigma W_X = \frac{[M^T W_X - \ell^T W_X r_f]}{(A - r_f C)} = \frac{\mu_X - r_f}{A - r_f C},
\]
which can be combined with (8.12) to give the security market line of the capital asset
pricing model:
\[
\mu_X = r_f + (\mu_P - r_f) \beta_X, \tag{8.13}
\]
where $\beta_X = \frac{\operatorname{Cov}(\tilde{r}_X, \tilde{r}_P)}{\operatorname{Var}(\tilde{r}_P)}$. $P$ is now recognized to be the market portfolio.
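The results (8.10) through (8.13) can be reproduced on a small example. In this sketch the asset data, the risk-free rate, and the test portfolio are hypothetical, with the risk-free rate chosen below $A/C$ as the derivation requires. The code forms the all-risky fund $P$, verifies its mean and variance, and prices an arbitrary portfolio with the security market line.

```python
import numpy as np

# Hypothetical inputs (not from the notes).
M = np.array([0.08, 0.12, 0.15])
Sigma = np.array([[0.040, 0.006, 0.004],
                  [0.006, 0.090, 0.012],
                  [0.004, 0.012, 0.160]])
ones = np.ones(3)
rf = 0.03                                     # assumed risk-free rate, below A/C here

Sigma_inv = np.linalg.inv(Sigma)
A = ones @ Sigma_inv @ M
B = M @ Sigma_inv @ M
C = ones @ Sigma_inv @ ones

# Fund P made up of risky assets only, eq. (8.10).
W_P = (M - rf * ones) @ Sigma_inv / (A - rf * C)
mu_P = W_P @ M
var_P = W_P @ Sigma @ W_P

# Security market line (8.13) for an arbitrary fully invested portfolio X.
W_X = np.array([0.5, 0.2, 0.3])
beta_X = (W_X @ Sigma @ W_P) / var_P
mu_X_sml = rf + (mu_P - rf) * beta_X

print("W_P:", W_P, "sum:", W_P.sum())
print("mu_P:", mu_P, "formula:", (B - rf * A) / (A - rf * C))
print("mu_X:", W_X @ M, "SML:", mu_X_sml)
```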

9 Yet Another Proof Of CAPM


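In many finance textbooks, CAPM is derived graphically by deriving the capital market
line that maximizes the slope of the line from the risk-free rate to a portfolio of risky
assets. The following proof is based on this idea.
The investor’s objective is to choose a portfolio $M$ of risky assets $W_M$ such that the
slope $\frac{\mu_M - r_f}{\sigma_M}$ is maximized, subject to the condition $W_M^T \ell = 1$. Here $\mu_M = W_M^T M$
and $\sigma_M^2 = W_M^T \Sigma W_M$ are the expected return and the variance on the portfolio $M$,
respectively. The form of the budget constraint is not very important because once
the $M$ portfolio is defined as a solution to this problem, the investor can take any
combination of the risk-free asset and this particular portfolio. Form the Lagrangian to
get:
\[
\max_{W_M} \; \frac{\mu_M - r_f}{\sigma_M} - \lambda (W_M^T \ell - 1).
\]
The first order conditions give us:
\[
\frac{1}{\sigma_M} \frac{\partial \mu_M}{\partial W_M} + (\mu_M - r_f) \frac{\partial (1/\sigma_M)}{\partial W_M} - \lambda \ell = 0, \tag{9.1}
\]
and
\[
W_M^T \ell = 1. \tag{9.2}
\]

Simplify (9.1):
\[
\begin{aligned}
\frac{1}{\sigma_M} M + (\mu_M - r_f) \frac{\partial (1/\sigma_M)}{\partial \sigma_M} \frac{\partial \sigma_M}{\partial \sigma_M^2} \frac{\partial \sigma_M^2}{\partial W_M} &= \lambda \ell, \\
\frac{1}{\sigma_M} M + (\mu_M - r_f) \left( \frac{-1}{\sigma_M^2} \right) \left( \frac{1}{2 \sigma_M} \right) 2 \Sigma W_M &= \lambda \ell,
\end{aligned}
\]
\[
\frac{1}{\sigma_M} M - \frac{(\mu_M - r_f)}{\sigma_M^3} \Sigma W_M = \lambda \ell. \tag{9.3}
\]
Pre-multiply by $W_M^T$ to get:
\[
\begin{aligned}
\frac{W_M^T M}{\sigma_M} - \frac{(\mu_M - r_f)}{\sigma_M^3} W_M^T \Sigma W_M &= \lambda W_M^T \ell, \\
\frac{\mu_M}{\sigma_M} - \frac{(\mu_M - r_f)}{\sigma_M^3} \sigma_M^2 &= \lambda, \\
\Rightarrow \quad \lambda &= \frac{r_f}{\sigma_M}.
\end{aligned}
\]
Substitute this value of $\lambda$ in (9.3) to get:
\[
\frac{1}{\sigma_M} M - \frac{(\mu_M - r_f)}{\sigma_M^3} \Sigma W_M = \frac{r_f}{\sigma_M} \ell,
\]
\[
\Rightarrow \quad M - \frac{(\mu_M - r_f)}{\sigma_M^2} \Sigma W_M = r_f \ell. \tag{9.4}
\]
Pre-multiply (9.4) by $W_X^T$, the weights of an arbitrary portfolio $X$, to get:
\[
\begin{aligned}
W_X^T M - \frac{\mu_M - r_f}{\sigma_M^2} W_X^T \Sigma W_M &= r_f W_X^T \ell, \\
\Rightarrow \quad \mu_X - \frac{\mu_M - r_f}{\sigma_M^2} \sigma_{XM} &= r_f,
\end{aligned}
\]
which gives us the capital asset pricing model:
\[
\mu_X = r_f + (\mu_M - r_f) \frac{\sigma_{XM}}{\sigma_M^2}.
\]
Now we can solve for the composition of the market portfolio $W_M$. Rewrite (9.4)
as:
\[
\frac{\mu_M - r_f}{\sigma_M^2} W_M = \Sigma^{-1} (M - r_f \ell).
\]

Pre-multiplying by $\ell^T$ and recognizing the budget constraint $\ell^T W_M = 1$ we get:
\[
\frac{\mu_M - r_f}{\sigma_M^2} = \ell^T \Sigma^{-1} (M - r_f \ell) = A - C r_f, \tag{9.5}
\]
where $A$ and $C$ are as defined in section 3. Substitute (9.5) in the rewritten form of (9.4) to get:
\[
W_M = \frac{\Sigma^{-1} (M - r_f \ell)}{A - C r_f}. \tag{9.6}
\]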
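The slope-maximizing property of (9.6) can be probed numerically. The sketch below uses hypothetical asset data and a hypothetical risk-free rate below $A/C$; the random fully invested portfolios are only a crude comparison set, not a proof. It computes $W_M$ from (9.6), checks (9.5), and confirms that none of the random candidates attains a higher slope.

```python
import numpy as np

# Hypothetical inputs (not from the notes).
M = np.array([0.08, 0.12, 0.15])
Sigma = np.array([[0.040, 0.006, 0.004],
                  [0.006, 0.090, 0.012],
                  [0.004, 0.012, 0.160]])
ones = np.ones(3)
rf = 0.03                                     # assumed risk-free rate, below A/C here

Sigma_inv = np.linalg.inv(Sigma)
A = ones @ Sigma_inv @ M
B = M @ Sigma_inv @ M
C = ones @ Sigma_inv @ ones

def slope(W):
    """Slope (mu - rf) / sigma of the line from the risk-free rate to portfolio W."""
    return (W @ M - rf) / np.sqrt(W @ Sigma @ W)

W_M = Sigma_inv @ (M - rf * ones) / (A - C * rf)     # eq. (9.6)
mu_M, var_M = W_M @ M, W_M @ Sigma @ W_M

# Random fully invested portfolios (short sales allowed) for comparison.
rng = np.random.default_rng(0)
candidates = rng.normal(size=(1000, 3))
candidates /= candidates.sum(axis=1, keepdims=True)
best_random = max(slope(W) for W in candidates)

print("slope of W_M:", slope(W_M))
print("best random slope:", best_random)
print("(mu_M - rf) / var_M:", (mu_M - rf) / var_M, " A - C rf:", A - C * rf)
```

The maximal slope equals $\sqrt{B - 2 A r_f + C r_f^2}$, the slope of the efficient frontier found in section 8.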

10 Problems With Mean-Variance Analysis


The final result of the mean-variance analysis is very attractive due to its seemingly
simple linear form. All the asset-pricing equations derived in the previous sections,
viz. (7.3), (7.5), (8.13) and the capital asset pricing model of section 9, simply state
that there is a linear tradeoff between the expected return and a measure of risk. The
risk is measured by some combination of the covariances of the asset’s returns with
those of the mutual funds. The verifiability of the entire mean-variance analysis,
therefore, depends on the empirical measurability of the funds. Theoretically, we must
consider all the assets in the economy while obtaining the funds. Empirically, this is an
impossible task. The true verification of the theory is, therefore, not possible. This is
the main concern of Roll’s critique of the mean-variance capital asset pricing models.
Another problem with the theory is the justifiability of the assumptions that lead to
the suitability of analysis in the mean-variance framework. Imposing quadratic utility
functions on the investors is very difficult to justify. There may be some justification
for the distributional assumption. Empirical analysis by Fama and many others provides
some hope that asset returns may belong to the stable Paretian family. Recently,
Chamberlain has done some theoretical work outlining the kind of distributions that
permit mean-variance analysis. But one cannot say with confidence that the
distributional assumptions fit the data.
There is also the question of justification of expected utility maximization. The
assumption of rationality on the part of the investors, so that they fulfill all the axioms
for measurable utility functions, has been under fire from time to time. Apparently it is
much easier to maintain these assumptions while considering a narrow range of wealth
than a wide one.
Finally, every test of the asset-pricing result involves a joint test of the assumption
of informational efficiency of financial markets.
The major discontent with the mean-variance analysis on the theoretical front is
as discussed in the first two paragraphs. On the empirical side, there have been
questions about the degree of systematic noise not incorporated in the model. The
commonly known arguments on this front surfaced in the form of the size effect.
There has been a general awareness for a long time that a better model is needed.
Multiperiod models of Merton, Breeden etc. were efforts in this direction but they
never gained as much popularity. Ross (1976, 1977) capitalized on the popular
multifactor asset returns model and combined that with a simple implication of market
efficiency to arrive at another linear, approximate asset pricing model called the
arbitrage pricing model. The biggest plus with the arbitrage pricing model is that it can
be tested without looking at the entire population of assets.

11 The Arbitrage Pricing Theory


The arbitrage pricing theory (APT) is based on the basic underlying assumption of
commonality of factors governing the returns of the assets in the economy. Multifactor
models for asset returns generation have been popular and in use for a long time. The
multifactor model of returns generation can be expressed as:
\[
\tilde{r}_i = \mu_i + \beta_{i1} \tilde{\delta}_1 + \beta_{i2} \tilde{\delta}_2 + \ldots + \beta_{ik} \tilde{\delta}_k + \tilde{\epsilon}_i \tag{11.1}
\]

Here the $\tilde{\delta}$s are the economy-wide common independent factors that affect the returns
of all the assets in the economy to different degrees. $\beta_{ik}$ is the sensitivity of the $i$th
asset’s returns to the $k$th common factor. $\tilde{\epsilon}_i$ is the idiosyncratic or the unique component
of the asset return that is not related to any of the common factors. In the matrix form
we can write:
\[
\tilde{R} = M + B \tilde{\Delta} + \tilde{\boldsymbol{\epsilon}} \tag{11.2}
\]

$\tilde{R}$ and $M$ are the $n \times 1$ realized and expected return vectors for the assets,
respectively, as defined in section 1.
\[
B = \begin{bmatrix}
\beta_{11} & \beta_{12} & \cdots & \beta_{1k} \\
\beta_{21} & \beta_{22} & \cdots & \beta_{2k} \\
\vdots & \vdots & \ddots & \vdots \\
\beta_{n1} & \beta_{n2} & \cdots & \beta_{nk}
\end{bmatrix}
\]
is the $n \times k$ matrix of sensitivities or the factor loadings,
\[
\tilde{\Delta} = [\tilde{\delta}_1 \; \tilde{\delta}_2 \; \ldots \; \tilde{\delta}_k]^T
\]
is the $k \times 1$ vector of realized factor scores, and
\[
\tilde{\boldsymbol{\epsilon}} = [\tilde{\epsilon}_1 \; \tilde{\epsilon}_2 \; \ldots \; \tilde{\epsilon}_n]^T
\]
is the $n \times 1$ vector of idiosyncratic components of returns.

By definition of the expected returns, both the factor scores and the idiosyncratic
returns are mean zero, i.e. $E(\tilde{\delta}_k) = 0 \;\; \forall \; k$ and $E(\tilde{\epsilon}_i) = 0 \;\; \forall \; i$. This sets up the
basic groundwork for the development of APT. No other distributional assumptions
are required at this stage.
The next major concept comes from market efficiency. First, we define an
arbitrage portfolio $\eta$. The components of this portfolio are the dollar amounts invested
in various assets (as opposed to the percentage of wealth in $W$ in the previous sections).
The unique property of the portfolio is that the total dollar amount invested is zero, i.e.:
\[
\eta^T \ell = 0 \tag{11.3}
\]

The cashflow as a result of this investment is given by:
\[
\eta^T \tilde{R} = \eta^T M + \eta^T B \tilde{\Delta} + \eta^T \tilde{\boldsymbol{\epsilon}} \tag{11.4}
\]

The portfolio is called an arbitrage portfolio for another reason. It is created in such a way
that it has no factor related or common risk, i.e.:
\[
\eta^T B = Z_k^T \tag{11.5}
\]
where $Z_k = [0, 0, \ldots, 0]^T$ is the $k \times 1$ vector of zeros.

Under this definition of (constraint on) the arbitrage portfolio, the dollar cash flow
is given by:
\[
\eta^T \tilde{R} = \eta^T M + \eta^T \tilde{\boldsymbol{\epsilon}} \tag{11.6}
\]

Through equations (11.3) and (11.5) we have imposed $k + 1$ constraints on our
arbitrage portfolio. The portfolio still has $n - k - 1$ degrees of freedom. The intuitive
idea behind APT is that the investors will choose arbitrage portfolios such that the
idiosyncratic component of the random return (and, therefore, the risk) is minimized.
In the limit, if the number of assets is very large (tending to infinity), this component
will be (tending to) zero. Therefore, we can approximate the cashflow as:
\[
\eta^T \tilde{R} \simeq \eta^T M \tag{11.7}
\]

The above equation is saying that on the arbitrage portfolio constructed above the
future cashflow is not random. Instead, it is a determinate amount equal to $\eta^T M$. In
frictionless efficient markets, a zero net position with a zero factor risk and insignificant
(approximately zero) idiosyncratic risk should provide insignificant (approximately
zero) cashflows or else there would be arbitrage opportunities in the economy. Ross
motivates this transition using the law of large numbers. Others have used limit or
sequence economies for the same purpose. As a result equation (11.7) can be written
as:
\[
\eta^T M \simeq 0 \tag{11.8}
\]

Now combining equations (11.3), (11.5) and (11.8) one sees that vector $\eta$ is
exactly orthogonal to the $k$ column vectors of $B$ and the unit vector $\ell$ and is approximately
orthogonal to vector $M$. If we had exact orthogonality with all the vectors, we would
know that the vectors $\ell$, $M$ and the $k$ column vectors of $B$ lie in a $k + 1$ dimensional
space. As a result any of the vectors could be expressed as an exact linear combination
of the remaining $k + 1$ vectors. Since we do not have exact orthogonalities, the linear
combination relationship would only be an approximate one. As a result we write:
\[
M \simeq \gamma_0 \ell + B \Gamma \tag{11.9}
\]

Here $\gamma_0$ and $\Gamma = [\gamma_1 \; \gamma_2 \; \ldots \; \gamma_k]^T$ are constants that satisfy (11.9). In expanded form,
we can write for an asset $i$:
\[
\mu_i \simeq \gamma_0 + \beta_{i1} \gamma_1 + \beta_{i2} \gamma_2 + \ldots + \beta_{ik} \gamma_k \tag{11.10}
\]

In exact form, we can write it as:
\[
\mu_i = \gamma_0 + \beta_{i1} \gamma_1 + \beta_{i2} \gamma_2 + \ldots + \beta_{ik} \gamma_k + u_i \tag{11.11}
\]
or,
\[
M = \gamma_0 \ell + B \Gamma + u \tag{11.12}
\]

where $u_i$ is the pricing error in the $i$th asset. Several interpretations of the $\gamma$s have
been offered by different authors (Ross; Ingersoll; Admati & Pfleiderer). The simplest
of them all is by Ross. Each $\gamma_k \; \forall \; k$ is a risk premium demanded by a ‘factor fund’
portfolio. If we were to construct $k + 1$ funds such that the first fund would be totally
independent of (orthogonal to) the other $k$ and each of the remaining $k$ funds would
exactly mimic the $k$ factors, then $\gamma_0$ would be the expected return on the first fund
(subscripted zero) while $\gamma_k \; \forall \; k > 0$ would be the risk premia associated with such
fund portfolios. The risk premium is defined with respect to $\gamma_0$, i.e., $\gamma_k = \mu_k - \mu_0$
where $\mu_k$ is the expected return on the $k$th fund and $\mu_0 = \gamma_0$. If there is a risk-free
asset in the market then $\mu_0$ and $\gamma_0$ equal the return on the risk-free asset. In this notation
the pricing equation (11.10) can be written as:
\[
\mu_i \simeq \mu_0 + \beta_{i1} (\mu_1 - \mu_0) + \beta_{i2} (\mu_2 - \mu_0) + \ldots + \beta_{ik} (\mu_k - \mu_0) \tag{11.13}
\]

For testing purposes, the mean-variance models require an expectation model. APT
circumvents that problem. Substitute (11.11) in (11.1) to get:
\[
\tilde{r}_i = \gamma_0 + \beta_{i1} (\tilde{\delta}_1 + \gamma_1) + \beta_{i2} (\tilde{\delta}_2 + \gamma_2) + \ldots + \beta_{ik} (\tilde{\delta}_k + \gamma_k) + (\tilde{\epsilon}_i + u_i) \tag{11.14}
\]
or in the matrix form:
\[
\tilde{R} = \gamma_0 \ell + B (\tilde{\Delta} + \Gamma) + (\tilde{\boldsymbol{\epsilon}} + u) \tag{11.15}
\]
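The orthogonality argument behind (11.9) can be made concrete with a small example. The sketch below is purely illustrative: the loadings, premia, and pricing errors are randomly generated, hypothetical numbers. An arbitrage portfolio is taken from the null space of the constraints (11.3) and (11.5) using a singular value decomposition; its expected cashflow is zero when pricing is exact and equals $\eta^T u$ when pricing errors are present.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 6, 2                                   # hypothetical numbers of assets and factors
B = rng.normal(size=(n, k))                   # factor loadings (made up)
ones = np.ones(n)
gamma0, Gamma = 0.03, np.array([0.05, 0.02])  # hypothetical premia

# Constraints (11.3) and (11.5): eta' ones = 0 and eta' B = 0.
constraints = np.column_stack([ones, B]).T    # shape (k + 1, n)
_, singular_values, Vt = np.linalg.svd(constraints)

# With full row rank, the last n - k - 1 rows of Vt span the null space,
# i.e. the set of admissible arbitrage portfolios.
null_basis = Vt[k + 1:]
eta = null_basis[0]

M_exact = gamma0 * ones + B @ Gamma           # (11.9) holding with equality
u = rng.normal(scale=0.001, size=n)           # hypothetical pricing errors
M_with_error = M_exact + u                    # (11.12)

print("degrees of freedom:", null_basis.shape[0])
print("eta' M (exact pricing):", eta @ M_exact)
print("eta' M (with errors):", eta @ M_with_error, " eta' u:", eta @ u)
```

The number of basis vectors returned, $n - k - 1$, matches the degrees of freedom counted in the text.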

12 A Critical Look At APT


The pricing equation above has some very strong features in its favor. It is not based
on any assumptions about the utility functions of the investors or the probability distri-
bution of returns. It also does not need all the assets in the economy to be included to
conduct a test of the theory. Therefore, it does, at least on the face of it, seem to over-
come the objections to the mean-variance analysis. Before we embark on a detailed
analysis of APT, let us quickly look at the major weaknesses of APT.

(a) The asset pricing relationship (11.10) is an approximate one. The approximation
error is not due to estimation or such other empirical issues. The pricing error is
the recognition of the fact that assets do have non-diversifiable idiosyncratic risk
and investors do expect to be compensated for assuming that risk. How small
this pricing error should be will, of course, depend on the characteristics of the
investor and of the economy. Several authors have provided bounds for this
pricing error (Dybvig; Grinblatt & Titman).
(b) The number of factors is not exactly known. The issue cannot be resolved
empirically. Any textbook on factor analysis will tell you that in a factor extraction
exercise, the number of factors extracted could be a very personal matter. Also,
the number of factors (based on any test of statistical significance) will increase as
the number of variables being analyzed is increased. There is a significant amount
of empirical evidence on this issue relating to APT also (Dhrymes, Friend &
Gultekin; Trzcinka).
(c) There is also the related issue of the effect of inability to extract all the factors on
the properties of the error component of the return generation process. Essentially,
the errors will indeed have cross-sectional dependencies. It is not clear if the pric-
ing error bound will be as small as it is in the case of perfectly uncorrelated error
terms. Some authors have done work on this problem and have shown APT to be
rigorous under approximate factor structure (Ingersoll; Connor; Chamberlain &
Rothschild).

(d) Finally, there is the issue of identification of these ‘pervasive economic factors’.
No major published work has appeared in this area. This is simply because, draw-
ing from the factor analytic research in other academic areas, the task is almost
impossible.

13 Pricing Error
We know that the approximation in (11.8) will be a good one for some investors while
it may not be as good for others depending on their utility functions and wealth levels.
This section provides a unique definition of no-arbitrage and derives an expression for
the pricing error.
Let us denote the utility function of the investor by U (W ). We define the no-
arbitrage condition as certainty equivalent of the random cashflow being equal to zero.
If this condition were to be violated, it would be possible to trade the arbitrage portfolio
for a nonzero value. Therefore,

CE(η T R̃) = 0 (13.1)

But from the definition of certainty equivalence:

U (W0 + CE) = E(U (W0 + η T R̃)) (13.2)

where W0 is the initial total wealth of the investor. Substituting for η R̃ from (11.6)
and using the condition (13.1) we get:

U (W0 ) = E(U (W0 + η T M + ηε̃ε)) (13.3)

realizing that E(η Tε̃ε) = 0 and using Jensen’s inequality and concavity of the utility
function we get:

U (W0 ) ≤ U (W0 + η T M) (13.4)

from the nonsatiation property of the utility function

W0 ≤ W0 + η T M (13.5)

or,

ηTM ≥ 0 (13.6)

This implies that the required (expected) return on the arbitrage portfolio will be pos-
itive, rather than zero. As yet we have no implication for the individual pricing errors
but the equations above can be used to determine a zero error in pricing. An exact
expression for the zero error can be obtained if we assume an exact functional form for
the utility function. Let us take U (W ) = −e(−AW ) then equation (13.3) can be written
as:
T T
−e(−AW0 ) = −e[−A(W0 +η M)]
E(e(−Aη ε)
ε̃
) (13.7)

20
or,
T T
e(Aη M)
= E(e(−Aη ε)
ε̃
) (13.8)
Now, if we assume a specific density function for ε̃ε, we can arrive at an explicit ex-
pression. Let us assume it to be a multivariate normal with mean vector zero and the
variance-covariance matrix Φ, then by the use of the moment generating function for
the normal distribution we get:
T 2
1
η T Φη)
e(Aη M)
= e( 2 A (13.9)
or
1 T
ηTM = Aη Φη (13.10)
2
This can be written as:
1
η T (M − AΦη) = 0 (13.11)
2
Equation (13.11), therefore, should be used in the derivation of the pricing equation
instead of equation (11.8). If we use this equation, we get the following exact pricing
equation:
1
M = γ0 ` + BΓ + AΦη (13.12)
2
or,
1
µi = γ0 + βi1 γ1 + βi2 γ2 + . . . + βik γk + Aη i Var(˜
i ) (13.13)
2
This gives us the pricing error relative to the usual approximate APT. If the investor is
risk-averse ($A > 0$), then, since $\Phi$ is diagonal (consistent with the original factor model
with independent error terms), the pricing error for asset $i$ has the same sign
as $\eta_i$, the amount of wealth invested in that asset to form the arbitrage portfolio! As we saw
in section 11, $n - k - 1$ different arbitrage portfolios can be formed by the investors,
and as a result, different investors will have different signs and magnitudes of $\eta_i$. This
seems to indicate that the pricing error (or the premium demanded for bearing the
idiosyncratic risk) will be different for different investors. Uniformity, and therefore
equilibrium, can be imposed by aggregation or some other assumption. We attempt a
mean-variance approach in section 15.
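As a small illustration of the sign property, the sketch below evaluates the last term of (13.13) asset by asset. The numbers and the function name `pricing_errors` are hypothetical, and the weights are only required to sum to zero here; the factor-neutrality constraint is ignored for brevity.

```python
def pricing_errors(risk_aversion, eta, idio_var):
    """Per-asset term (1/2) * A * eta_i * Var(eps_i) from equation (13.13)."""
    return [0.5 * risk_aversion * h * v for h, v in zip(eta, idio_var)]

# Hypothetical inputs (not taken from the text).
A = 2.0
eta = [0.5, -0.3, -0.2]          # arbitrage portfolio weights
idio_var = [0.04, 0.09, 0.01]    # diagonal of Phi

errors = pricing_errors(A, eta, idio_var)

# For a risk-averse investor (A > 0) each error shares the sign of eta_i.
same_sign = all((e > 0) == (h > 0) for e, h in zip(errors, eta))

# Weighting the errors by eta recovers (1/2) A eta' Phi eta, as in (13.10).
portfolio_premium = sum(h * e for h, e in zip(eta, errors))
```

A second investor holding the opposite position would obtain errors of the opposite sign, which is the non-uniformity noted above.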
Alternately, we can derive an expression by using the Taylor expansion.
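\[
\begin{aligned}
U(W_0 + \eta^T M + \eta^T\tilde{\varepsilon}) = {} & U(W_0) + (\eta^T M + \eta^T\tilde{\varepsilon})\,U'(W_0) \\
& + \frac{1}{2}(\eta^T M + \eta^T\tilde{\varepsilon})^2\, U''(\tilde{W}_0)
\end{aligned}
\]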
where $\tilde{W}_0$ is in $(W_0,\ W_0 + \eta^T M + \eta^T\tilde{\varepsilon})$. Substituting this in (13.3) and taking
expectations we get:
\[
\eta^T M = -\frac{1}{2}\,\eta^T\Phi\eta\,\frac{U''(\tilde{W}_0)}{U'(W_0)}
\]

If we define $-\dfrac{U''(\tilde{W}_0)}{U'(W_0)} = \tilde{A}$ we get the same form as (13.10), and the rest of the
derivation is the same as above, with the difference that the pricing error is now random and the
best we can do is find a bound.
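The step from the Taylor expansion to the expression for $\eta^T M$ can be sketched as follows. Two simplifications are assumptions of this sketch rather than statements from the text: $U''(\tilde{W}_0)$ is treated as nonrandom when taking expectations, and the term $(\eta^T M)^2$ is neglected as being of smaller order. The sketch also uses $E(\eta^T\tilde{\varepsilon}) = 0$ and $E[(\eta^T\tilde{\varepsilon})^2] = \eta^T\Phi\eta$.

```latex
\begin{align*}
U(W_0) &= E\left[U(W_0 + \eta^T M + \eta^T\tilde{\varepsilon})\right]
  && \text{(13.3)} \\
 &\approx U(W_0) + \eta^T M\, U'(W_0)
   + \tfrac{1}{2}\left[(\eta^T M)^2 + \eta^T\Phi\eta\right] U''(\tilde{W}_0) \\
 &\approx U(W_0) + \eta^T M\, U'(W_0)
   + \tfrac{1}{2}\,\eta^T\Phi\eta\; U''(\tilde{W}_0) \\
\Longrightarrow\quad
\eta^T M &\approx -\tfrac{1}{2}\,\eta^T\Phi\eta\,
   \frac{U''(\tilde{W}_0)}{U'(W_0)}
\end{align*}
```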
Both Dybvig and Grinblatt & Titman arrive at pricing error bounds under different
market conditions and assumptions. The above pricing error expression bears a direct
resemblance to these bounds.

14 Estimation and Identification


Let us write (11.2) as:

\[
\tilde{R} - M = \mathbf{B}\tilde{\Delta} + \tilde{\varepsilon} \tag{14.1}
\]

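so that the variance-covariance matrix is:

\[
\begin{aligned}
\Sigma &= E\left[(\tilde{R} - M)(\tilde{R} - M)^T\right] \\
&= E\left[(\mathbf{B}\tilde{\Delta} + \tilde{\varepsilon})(\mathbf{B}\tilde{\Delta} + \tilde{\varepsilon})^T\right] \\
&= \mathbf{B}E(\tilde{\Delta}\tilde{\Delta}^T)\mathbf{B}^T + \mathbf{B}E(\tilde{\Delta}\tilde{\varepsilon}^T) + E(\tilde{\varepsilon}\tilde{\Delta}^T)\mathbf{B}^T + E(\tilde{\varepsilon}\tilde{\varepsilon}^T) \\
&= \mathbf{B}\Omega\mathbf{B}^T + \Phi
\end{aligned}
\tag{14.2}
\]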

where $E(\tilde{\Delta}\tilde{\varepsilon}^T) = E(\tilde{\varepsilon}\tilde{\Delta}^T) = 0$ by the basic assumptions of the factor model.


The factor analysis process decomposes the matrix Σ to get the solution [B, Ω, Φ].
As discussed before, there is no constraint on the number of factors (the number of
columns in B). According to the model Φ is diagonal i.e., all pervasive factors are to
be accounted for by Ω.
There are multiple solutions to this decomposition process. Statistical estimation
procedures find only one of these. There are at least two reasons for this identification
problem. They have been discussed by Shanken and Dhrymes, Friend & Gultekin. Let
us study them one at a time:
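A solution $\{\mathbf{B}_1, \Omega_1, \Phi_1\}$ is equivalent to $\{\mathbf{B}_2, \Omega_2, \Phi_2\}$ if the two solutions are
related as:

\[
\begin{aligned}
\mathbf{B}_2 &= \mathbf{B}_1\Theta \\
\tilde{\varepsilon}_2 &= \tilde{\varepsilon}_1 \\
\Phi_2 &= \Phi_1
\end{aligned}
\]

where $\Theta$ is any $k \times k$ nonsingular symmetric matrix. Under this condition
$\tilde{\Delta}_2 = \Theta^{-1}\tilde{\Delta}_1$, so that equations (14.1) and (14.2) are satisfied by both sets of
solutions. First we show this for (14.1):

\[
\begin{aligned}
\tilde{R}_2 - M_2 &= \mathbf{B}_2\tilde{\Delta}_2 + \tilde{\varepsilon}_2 \\
&= \mathbf{B}_1\Theta\Theta^{-1}\tilde{\Delta}_1 + \tilde{\varepsilon}_1 \\
&= \mathbf{B}_1\tilde{\Delta}_1 + \tilde{\varepsilon}_1 \\
&= \tilde{R}_1 - M_1
\end{aligned}
\]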

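And now equation (14.2), using the symmetry of $\Theta$:

\[
\begin{aligned}
\Sigma_2 &= \mathbf{B}_2 E(\tilde{\Delta}_2\tilde{\Delta}_2^T)\mathbf{B}_2^T + \Phi_2 \\
&= \mathbf{B}_1\Theta\, E(\Theta^{-1}\tilde{\Delta}_1\tilde{\Delta}_1^T\Theta^{-1})\,\Theta\mathbf{B}_1^T + \Phi_1 \\
&= \mathbf{B}_1 E(\tilde{\Delta}_1\tilde{\Delta}_1^T)\mathbf{B}_1^T + \Phi_1 \\
&= \Sigma_1
\end{aligned}
\]

Since the $\tilde{\Delta}$s are unobservable, one solution is indistinguishable from the other. This
issue of identification is essentially a scaling problem.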
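The scaling indeterminacy can be checked numerically. The sketch below is a minimal pure-Python illustration; the three-asset, two-factor numbers and the helper names (`matmul`, `implied_covariance`, `inverse_2x2`) are hypothetical choices for this example, not values from the text.

```python
def matmul(x, y):
    """Product of two matrices stored as lists of rows."""
    cols = list(zip(*y))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in x]

def transpose(x):
    return [list(col) for col in zip(*x)]

def add(x, y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]

def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def implied_covariance(loadings, omega, phi):
    """Sigma = B Omega B^T + Phi, equation (14.2)."""
    return add(matmul(matmul(loadings, omega), transpose(loadings)), phi)

# Hypothetical three-asset, two-factor example.
b1 = [[0.9, 0.2], [1.1, -0.4], [0.7, 0.5]]
omega1 = [[1.0, 0.3], [0.3, 2.0]]
phi = [[0.04, 0.0, 0.0], [0.0, 0.09, 0.0], [0.0, 0.0, 0.02]]

theta = [[2.0, 0.5], [0.5, 1.0]]       # symmetric and nonsingular
theta_inv = inverse_2x2(theta)

b2 = matmul(b1, theta)                 # B2 = B1 Theta
# Delta2 = Theta^{-1} Delta1 implies Omega2 = Theta^{-1} Omega1 Theta^{-1}.
omega2 = matmul(matmul(theta_inv, omega1), theta_inv)

sigma1 = implied_covariance(b1, omega1, phi)
sigma2 = implied_covariance(b2, omega2, phi)
max_gap = max(abs(u - v) for r1, r2 in zip(sigma1, sigma2) for u, v in zip(r1, r2))
```

Although `b2` and `omega2` differ from `b1` and `omega1`, `max_gap` should be zero up to rounding, so the two solutions cannot be told apart from $\Sigma$ alone.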
Some order can be instated by arbitrarily specifying some property of $\tilde{\Delta}$. It is
customary to set its variance-covariance matrix equal to the identity matrix, i.e.

\[
\Omega = E(\tilde{\Delta}\tilde{\Delta}^T) = I \tag{14.3}
\]
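Now this identification problem is resolved because we cannot find $\Theta \neq I$ such that:

\[
\Omega_2 = E(\tilde{\Delta}_2\tilde{\Delta}_2^T) = \Theta^{-1}E(\tilde{\Delta}_1\tilde{\Delta}_1^T)\Theta^{-1} = \Theta^{-1} I\, \Theta^{-1} = I
\]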

Under this constraint equation (14.2) can be written as:


\[
\Sigma = \mathbf{B}\mathbf{B}^T + \Phi \tag{14.4}
\]
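One way to impose the normalization (14.3) on a given solution is to rescale by the symmetric square root of $\Omega$, that is, to take $\Theta = \Omega^{1/2}$. This particular choice is an assumption of the sketch below, not a procedure stated in the text, and the numbers and helper names (`sqrt_spd_2x2`, `b_norm`) are hypothetical. The closed form used for the square root holds only for 2 x 2 symmetric positive definite matrices.

```python
import math

def matmul(x, y):
    """Product of two matrices stored as lists of rows."""
    cols = list(zip(*y))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in x]

def transpose(x):
    return [list(col) for col in zip(*x)]

def sqrt_spd_2x2(m):
    """Symmetric square root of a 2x2 symmetric positive definite matrix.

    Uses R = (M + sqrt(det M) I) / sqrt(trace M + 2 sqrt(det M)).
    """
    s = math.sqrt(m[0][0] * m[1][1] - m[0][1] * m[1][0])
    t = math.sqrt(m[0][0] + m[1][1] + 2.0 * s)
    return [[(m[0][0] + s) / t, m[0][1] / t],
            [m[1][0] / t, (m[1][1] + s) / t]]

# Hypothetical loadings and factor covariance matrix.
b1 = [[0.9, 0.2], [1.1, -0.4], [0.7, 0.5]]
omega1 = [[1.0, 0.3], [0.3, 2.0]]

theta = sqrt_spd_2x2(omega1)          # Theta = Omega1^{1/2}, symmetric
b_norm = matmul(b1, theta)            # loadings once Omega is normalized to I

common_before = matmul(matmul(b1, omega1), transpose(b1))   # B1 Omega1 B1^T
common_after = matmul(b_norm, transpose(b_norm))            # B B^T, as in (14.4)

max_gap = max(abs(u - v)
              for r1, r2 in zip(common_before, common_after)
              for u, v in zip(r1, r2))
```

The common-factor part of $\Sigma$ is unchanged by the rescaling, so the normalized loadings `b_norm` reproduce the same covariance matrix with $\Omega = I$.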
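The problem of identification is not yet totally resolved. A solution $[\mathbf{B}_1, \Omega_1, \Phi_1]$
cannot be distinguished from another set $[\mathbf{B}_2, \Omega_2, \Phi_2]$ if there is a $k \times k$ matrix $\mathbf{T}$
such that:

\[
\begin{aligned}
\mathbf{B}_2 &= \mathbf{B}_1\mathbf{T} \\
\tilde{\varepsilon}_2 &= \tilde{\varepsilon}_1 \\
\Phi_2 &= \Phi_1
\end{aligned}
\]

and,

\[
\mathbf{T}\mathbf{T}^T = I
\]

Under this condition $\tilde{\Delta}_2 = \mathbf{T}^T\tilde{\Delta}_1$. To demonstrate this identification problem, let us
show that both solutions satisfy equations (14.1) and (14.4) and the constraint on
$\Omega$, (14.3).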
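We first show the equivalence for (14.1):

\[
\begin{aligned}
\tilde{R}_2 - M_2 &= \mathbf{B}_2\tilde{\Delta}_2 + \tilde{\varepsilon}_2 \\
&= \mathbf{B}_1\mathbf{T}\mathbf{T}^T\tilde{\Delta}_1 + \tilde{\varepsilon}_1 \\
&= \mathbf{B}_1\tilde{\Delta}_1 + \tilde{\varepsilon}_1 \\
&= \tilde{R}_1 - M_1
\end{aligned}
\]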
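Next, (14.3). Since $\tilde{\Delta}_2 = \mathbf{T}^T\tilde{\Delta}_1$, and a square $\mathbf{T}$ with $\mathbf{T}\mathbf{T}^T = I$ also satisfies $\mathbf{T}^T\mathbf{T} = I$:

\[
\begin{aligned}
\Omega_2 &= E(\tilde{\Delta}_2\tilde{\Delta}_2^T) \\
&= \mathbf{T}^T E(\tilde{\Delta}_1\tilde{\Delta}_1^T)\mathbf{T} \\
&= \mathbf{T}^T I\, \mathbf{T} \\
&= I
\end{aligned}
\]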

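And finally, (14.4):

\[
\begin{aligned}
\Sigma_2 &= \mathbf{B}_2\mathbf{B}_2^T + \Phi_2 \\
&= \mathbf{B}_1\mathbf{T}\mathbf{T}^T\mathbf{B}_1^T + \Phi_1 \\
&= \mathbf{B}_1\mathbf{B}_1^T + \Phi_1 \\
&= \Sigma_1
\end{aligned}
\]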

This problem is in the nature of a rotation of the factors. It is not clear if a property
of $\mathbf{B}$ or another property of $\tilde{\Delta}$ can be specified to control for this problem.
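The rotation problem is easy to reproduce numerically. In the sketch below, which uses hypothetical loadings and an arbitrarily chosen angle, $\mathbf{T}$ is a 2 x 2 plane rotation; the rotated loadings differ from the original ones, yet the common part $\mathbf{B}\mathbf{B}^T$ of (14.4) is unchanged.

```python
import math

def matmul(x, y):
    """Product of two matrices stored as lists of rows."""
    cols = list(zip(*y))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in x]

def transpose(x):
    return [list(col) for col in zip(*x)]

def max_abs_gap(x, y):
    return max(abs(u - v) for rx, ry in zip(x, y) for u, v in zip(rx, ry))

# Hypothetical three-asset, two-factor loadings (Omega already set to I).
b1 = [[0.9, 0.2], [1.1, -0.4], [0.7, 0.5]]

angle = 0.6                                    # arbitrary rotation angle in radians
rotation = [[math.cos(angle), -math.sin(angle)],
            [math.sin(angle), math.cos(angle)]]

b2 = matmul(b1, rotation)                      # B2 = B1 T

# T T^T = I, so the factor covariance stays equal to the identity.
orthogonality_gap = max_abs_gap(matmul(rotation, transpose(rotation)),
                                [[1.0, 0.0], [0.0, 1.0]])

# B2 B2^T = B1 T T^T B1^T = B1 B1^T, so Sigma in (14.4) is unchanged.
common_gap = max_abs_gap(matmul(b1, transpose(b1)), matmul(b2, transpose(b2)))
loading_gap = max_abs_gap(b1, b2)
```

Every angle gives a different `b2` with the same fit, which is why an estimation routine can report only one of infinitely many equivalent loading matrices.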

15 Arbitrage Portfolio Selection


In section 11 it was mentioned that while selecting the arbitrage portfolio only
$k + 1$ degrees of freedom of the portfolio were used. The rest would be used by the
investor to form the arbitrage portfolio such that the idiosyncratic component of the
risk is minimized. In this section we will attempt to address this issue because this may
help shed some more light on the issue of pricing error.
While deriving the expression for the pricing error, we assumed that the probability
distribution of the idiosyncratic component is normal. Under this assumption,
mean-variance analysis is sufficient. Therefore, we will assume that while choosing the
arbitrage portfolio, the investor minimizes the idiosyncratic variance subject to the other
constraints on the arbitrage portfolio, i.e.:

\[
\begin{aligned}
\min_{\eta}\quad & \eta^T\Phi\eta \\
\text{s.t.}\quad & \eta^T\mathbf{B} = Z_k^T \\
& \eta^T\ell = 0
\end{aligned}
\]

If we solve the problem as stated here, it can easily be verified that the only solution
is the trivial one, $\eta = Z_n$. The way to prevent this is to force the
norm of $\eta$ to be nonzero, i.e. to add one more constraint, $\eta^T\eta = 1$, where the norm
has been arbitrarily scaled to unity. This quadratic problem, however, is extremely
difficult to solve. Therefore, we devise another, simpler, linear constraint. As long
as we can force one element of $\eta$ to be nonzero, we have a non-trivial solution
at hand. However, it should be realized that by doing this we may be obtaining a
minimum under an artificial constraint. It is quite likely that forcing another element
to be non-zero would have led the arbitrarily set element to be zero and to a 'better'
solution. Therefore, the minimization process will have to be done in two steps. First,
find the minimum variance by forcing one element of $\eta$ to be non-zero, repeating this
for each element in turn. In the second step,
choose the global minimum from among all these solutions. Therefore, we impose the
additional constraint:

\[
\eta^T e_i = 1
\]

where $e_i$ is an $n \times 1$ unit vector with 0 in all positions but the $i$th, which has a 1. The
non-zero element of $\eta$ has been arbitrarily scaled to be 1. The Lagrangian for the problem
is:

\[
\min_{\eta}\quad \eta^T\Phi\eta - (\eta^T\mathbf{B} - Z_k^T)\lambda_1 - (\eta^T\ell)\lambda_2 - (\eta^T e_i - 1)\lambda_3
\]
λ1 is a k × 1 vector while λ2 and λ3 are scalars. The first order conditions are:
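\begin{align}
2\eta^T\Phi - \lambda_1^T\mathbf{B}^T - \lambda_2\ell^T - \lambda_3 e_i^T &= Z_n^T \tag{15.1} \\
\eta^T\mathbf{B} &= Z_k^T \tag{15.2} \\
\eta^T\ell &= 0 \tag{15.3} \\
\eta^T e_i &= 1 \tag{15.4}
\end{align}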

Equation (15.1) gives us:


\[
\eta^T = \frac{1}{2}\left(\lambda_1^T\mathbf{B}^T + \lambda_2\ell^T + \lambda_3 e_i^T\right)\Phi^{-1} \tag{15.5}
\]
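Post-multiply (15.5) by $\mathbf{B}$, $\ell$ and $e_i$ respectively, multiply through by 2, and use (15.2) through (15.4) to get:

\begin{align}
2\eta^T\mathbf{B} = \lambda_1^T\mathbf{B}^T\Phi^{-1}\mathbf{B} + \lambda_2\ell^T\Phi^{-1}\mathbf{B} + \lambda_3 e_i^T\Phi^{-1}\mathbf{B} &= Z_k^T \tag{15.6} \\
2\eta^T\ell = \lambda_1^T\mathbf{B}^T\Phi^{-1}\ell + \lambda_2\ell^T\Phi^{-1}\ell + \lambda_3 e_i^T\Phi^{-1}\ell &= 0 \tag{15.7} \\
2\eta^T e_i = \lambda_1^T\mathbf{B}^T\Phi^{-1}e_i + \lambda_2\ell^T\Phi^{-1}e_i + \lambda_3 e_i^T\Phi^{-1}e_i &= 2 \tag{15.8}
\end{align}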

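Now, as in section 3, define (the italic $B$ below is a $k \times k$ matrix, distinct from the loading matrix $\mathbf{B}$):

\[
\begin{aligned}
A &= \ell^T\Phi^{-1}\mathbf{B} && [1 \times k] \\
A^T &= \mathbf{B}^T\Phi^{-1}\ell && [k \times 1] \\
B &= \mathbf{B}^T\Phi^{-1}\mathbf{B} && [k \times k] \\
C &= \ell^T\Phi^{-1}\ell && [1 \times 1]
\end{aligned}
\]

and also,

\[
\begin{aligned}
E_i &= e_i^T\Phi^{-1}\mathbf{B} && [1 \times k] \\
E_i^T &= \mathbf{B}^T\Phi^{-1}e_i && [k \times 1] \\
F_i &= e_i^T\Phi^{-1}\ell = \ell^T\Phi^{-1}e_i && [1 \times 1] \\
G_i &= e_i^T\Phi^{-1}e_i && [1 \times 1]
\end{aligned}
\]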

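where $[\cdots]$ denotes the dimensionality of the variables. Now equations (15.6) through
(15.8) can be written as:

\begin{align}
\lambda_1^T B + \lambda_2 A + \lambda_3 E_i &= Z_k^T \tag{15.9a} \\
\lambda_1^T A^T + \lambda_2 C + \lambda_3 F_i &= 0 \tag{15.9b} \\
\lambda_1^T E_i^T + \lambda_2 F_i + \lambda_3 G_i &= 2 \tag{15.9c}
\end{align}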

To solve these equations, it will be convenient to write them as:

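\begin{align}
B^T\lambda_1 + A^T\lambda_2 + E_i^T\lambda_3 &= Z_k \tag{15.10a} \\
A\lambda_1 + C\lambda_2 + F_i\lambda_3 &= 0 \tag{15.10b} \\
E_i\lambda_1 + F_i\lambda_2 + G_i\lambda_3 &= 2 \tag{15.10c}
\end{align}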

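or in the matrix form:

\[
\begin{bmatrix}
B^T & A^T & E_i^T \\
A & C & F_i \\
E_i & F_i & G_i
\end{bmatrix}
\cdot
\begin{bmatrix}
\lambda_1 \\ \lambda_2 \\ \lambda_3
\end{bmatrix}
=
\begin{bmatrix}
Z_k \\ 0 \\ 2
\end{bmatrix}
\]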

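The solution to these equations is obtained using Cramer's rule as:

\[
\lambda_2 = \frac{\begin{vmatrix}
B^T & Z_k & E_i^T \\
A & 0 & F_i \\
E_i & 2 & G_i
\end{vmatrix}}{D} \tag{15.11}
\]

\[
\lambda_3 = \frac{\begin{vmatrix}
B^T & A^T & Z_k \\
A & C & 0 \\
E_i & F_i & 2
\end{vmatrix}}{D} \tag{15.12}
\]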
where $D$ is the determinant:

\[
D = \begin{vmatrix}
B^T & A^T & E_i^T \\
A & C & F_i \\
E_i & F_i & G_i
\end{vmatrix}
\]

and $\lambda_1$ is determined from (15.10a), using the symmetry of $B$, as:

\[
\lambda_1 = B^{-1}\left(Z_k - A^T\lambda_2 - E_i^T\lambda_3\right) \tag{15.13}
\]

These then give us the particular solution $\eta_i$ on substitution in (15.5). Here the
subscript $i$ denotes that the solution is for the particular, arbitrarily chosen $e_i$. The global solution
would be the one with the lowest variance among all the solutions, i.e.

\[
\eta = \eta_i \quad \text{s.t.} \quad \eta_i^T\Phi\eta_i \text{ is minimized.} \tag{15.14}
\]
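The two-step procedure can be sketched in a few lines of pure Python. This is an illustration under stated assumptions, not the author's implementation: $\Phi$ is taken to be diagonal, the system (15.10) is solved by Gaussian elimination rather than Cramer's rule, and the one-factor, five-asset numbers and the function names (`solve_linear`, `arbitrage_weights`, `idio_variance`) are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for j in range(c, n + 1):
                m[r][j] -= f * m[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][j] * x[j] for j in range(r + 1, n))) / m[r][r]
    return x

def arbitrage_weights(loadings, phi_diag, i):
    """Solve (15.10) for the multipliers, then recover eta from (15.5)."""
    n, k = len(loadings), len(loadings[0])
    # Row r of the n x (k+2) constraint matrix [B, ell, e_i].
    cons = [loadings[r] + [1.0, 1.0 if r == i else 0.0] for r in range(n)]
    size = k + 2
    # Coefficient matrix of (15.10): [B, ell, e_i]^T Phi^{-1} [B, ell, e_i].
    gram = [[sum(cons[r][p] * cons[r][q] / phi_diag[r] for r in range(n))
             for q in range(size)] for p in range(size)]
    rhs = [0.0] * (k + 1) + [2.0]
    lam = solve_linear(gram, rhs)
    # (15.5): eta = (1/2) Phi^{-1} (B lambda_1 + ell lambda_2 + e_i lambda_3).
    return [0.5 * sum(cons[r][q] * lam[q] for q in range(size)) / phi_diag[r]
            for r in range(n)]

def idio_variance(eta, phi_diag):
    """eta^T Phi eta for a diagonal Phi."""
    return sum(h * h * v for h, v in zip(eta, phi_diag))

# Hypothetical one-factor, five-asset example.
loadings = [[0.8], [1.2], [1.0], [0.6], [1.4]]
phi_diag = [0.04, 0.09, 0.02, 0.05, 0.03]

# Step 1: one constrained minimum for each choice of e_i.
candidates = [arbitrage_weights(loadings, phi_diag, i) for i in range(len(loadings))]
variances = [idio_variance(eta, phi_diag) for eta in candidates]

# Step 2: the global solution (15.14) is the candidate with the lowest variance.
best_i = min(range(len(candidates)), key=variances.__getitem__)
eta_star = candidates[best_i]
```

Each candidate satisfies $\eta^T\mathbf{B} = Z_k^T$, $\eta^T\ell = 0$ and $\eta^T e_i = 1$ by construction, so the comparison in the second step is between portfolios that differ only in which element was normalized to 1.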
