Y Hilpisch Python For Finance 06072012 EuroPython NoSo

Python for Finance
Dr. Yves J. Hilpisch
06 July 2012
Training at EuroPython 2012 Conference
Finance, Derivatives Analytics & Python Programming
Visixion GmbH
Y. Hilpisch (Visixion GmbH)
Python for Finance
EuroPython, Florence, 2012
1 / 96
1 2
Useful
Python
Libraries
Array operations with NumPy Basic Array Operations Random Numbers 2d Plotting Exercise: NumPy in Action Selected Financial Topics Approximation Optimization Numerical Integration Case Study: Numerical Option Pricing Binomial Option Pricing Model Python Implementations Monte Carlo Approach Time Series Analysis with pandas Series class DataFrame class Plotting with pandas Exercise: pandas in Action Case Study: Analyzing Stock Quotes with Fast Data Mining with PyTables Introductory PyTables Example Exercise: PyTables with pandas Case Study: Simulation Research Project The Financial Model Python and PyTables Implementation Speeding-up Code with Cython Fundamentals about Cython Example Code for Cython Use Conclusion
6 7 8 9 10
pandas
Python for Finance
2 / 96
Guidelines for the Training
1 2 3
The training addresses a number of important
However, it cannot be exhaustive with regard to We intend to provide an overview which allows
Python issues and libraries for Finance Python's capabilities in this eld
to get an impression of Python's advantages for Finance further self-study
4 5
The majority of the content is included in the slides However, we will strive to provide hands-on experience through interactive parts
Python for Finance
3 / 96
Throughout the training: Results matter more than Style

Bruce LeeThe Tao of Jeet Kune Do: There is no mystery about my style. My movements are simple, direct and non-classical. The extraordinary part of it lies in its simplicity. Every movement in Jeet Kune Do is being so of itself. There is nothing articial about it. I always believe that the easy way is the right way.
The Tao of
My
Python:
There is no mystery about my style. My lines of code are simple, direct and non-classical. The extraordinary part of it lies in its simplicity. Every line of code in my Python is being so of itself. There is nothing articial about it. I always believe that the easy way is the right way.
Python for Finance
4 / 96
Some of the most important Python libraries for Finance projects

NumPy: array operations SciPy: scientic library matplotlib: 2d and 3d plotting pandas: time series and panel data PyTables: database optimized for fast I/O Cython: C extensions for Python IPython: popular interactive Python shell
Useful Python Libraries
operations
Python for Finance
5 / 96
Some recommended Python resources

www.python.org: the Python home www.scipy.org: SciPy and NumPy home http://matplotlib.sourceforge.org: home of the matplotlib plotting http://pandas.paydata.org: home of pandas www.pytables.org: home of PyTables www.cython.org: home of Cython http://ipython.org/: home of the IPython interactive shell http://scipy-lectures.github.com: tutorial notes with scientic focus www.scipy.org/Topical_Software: Python software/libs by topic
library
Python for Finance
6 / 96
Major benets of Python for Finance applications

multi-purpose:
in contrast to special purpose packages,
Python
is general or
multi-purposeyou can use it for everything (synergies)
high productivity:
domain of
Python;
prototyping and rapid application development are an original with
Python,
you can re-use rst code versions when migrating
to standard development cycle
easy-to-maintain:
understand
Python
due to its compactness and readability, team members can easily code from others
low cost:
open source, easy-to-learn, easy-to-code, easy-to-maintain, only a fraction
of the code of other languages needed, multi-platform
future-proof :
huge development eorts around the world, growing number of
successful scientic and corporate applications, growing number of
Python
experts
good performance:
compiled languages
using the right libraries ensures execution speed comparable to
Python for Finance
7 / 96
Visixion's experience with Python

DEXISION:
delivered On Demand (since 2006,
full-edged Derivatives Analytics suite implemented in
www.dexision.com)
Python
and
research: Python
www.visixion.com) trainings: Python trainings

services industry
used to implement a number of numerical research projects (see
with focus on Finance for clients from the nancial
client projects: Python used to implement client specic nancial applications teaching: Python used to implement and illustrate nancial models in derivatives
course at Saarland University (see Course Web Site)
talks:
we have given a number of talks at Python conferences about the use of Python for Finance book: Python used to illustrate nancial models in our recent bookDerivatives Analytics with PythonMarket-Based Valuation of European and American Stock Index Options
Python for Finance
8 / 96
Arrays (I)
NumPy
the convenience of
Array operations with NumPy
Basic Array Operations
is a powerful library that allows array manipulations (linear algebra) in a compact
form and at high speed. The speed comes from the implementation in C. So you have
Python
combined with the speed of C when doing array operations.
>>> from numpy import * >>> a=arange(0.0,20.0,1.0) >>> a array([ 0., 1., 2., 3., 4., 5., 6., 11., 12., 13., 14., 15., 16., 17., >>> a.resize(4,5) >>> a array([[ 0., 1., 2., 3., 4.], [ 5., 6., 7., 8., 9.], [ 10., 11., 12., 13., 14.], [ 15., 16., 17., 18., 19.]]) >>> a[0] array([ 0., 1., 2., 3., 4.]) >>> a[3] array([ 15., 16., 17., 18., 19.]) >>> a[1,4] 9.0 >>> a[1,2:4] array([ 7., 8.]) >>>
7., 8., 18., 19.])
9., 10.,
Care is to be taken with the conventions regarding array indices. The best way to learn these is to play with arrays.
Python for Finance
9 / 96
Arrays (II)
With
NumPy,
array operations are as easy as operations on integers or oats.
>>> a*0.5 array([[ 0. , [ 2.5, [ 5. , [ 7.5, >>> a**2 array([[ 0., [ 25., [ 100., [ 225., >>> a+a array([[ 0., [ 10., [ 20., [ 30., >>>
0.5, 3. , 5.5, 8. ,
1. , 3.5, 6. , 8.5,
1.5, 4. , 6.5, 9. ,
2. ], 4.5], 7. ], 9.5]]) 16.], 81.], 196.], 361.]])
1., 4., 9., 36., 49., 64., 121., 144., 169., 256., 289., 324., 2., 4., 12., 14., 22., 24., 32., 34.,
6., 8.], 16., 18.], 26., 28.], 36., 38.]])
Python for Finance
10 / 96
Looping over arrays

follows:
Sometimes you need to loop over arrays to check something. Looping is easily done as
>>> b=arange(0.0,25.1,0.5) >>> b array([ 0. , 0.5, 1. , 4.5, 5. , 5.5, 9. , 9.5, 10. , 13.5, 14. , 14.5, 18. , 18.5, 19. , 22.5, 23. , 23.5, >>> for i in range(50): if b[i]==15.0: print "15.0 at
1.5, 6. , 10.5, 15. , 19.5, 24. ,
2. , 6.5, 11. , 15.5, 20. , 24.5,
2.5, 3. , 3.5, 4. , 7. , 7.5, 8. , 8.5, 11.5, 12. , 12.5, 13. , 16. , 16.5, 17. , 17.5, 20.5, 21. , 21.5, 22. , 25. ])
index no.", i
15.0 at index no. 30 >>> for i in enumerate(b[0:6]): print i, (0, 0.0) (1, 0.5) (2, 1.0) (3, 1.5) (4, 2.0) (5, 2.5) >>>
The use of
arange
and
range
should be obvious. The rst can produce arrays of oat
type while the latter can only generate integers; and indices of arrays are always integers that is why we loop over integers and not over oats or something else.
Python for Finance
11 / 96
Random Numbers
Random Numbers
Sciences and Finance cannot live without random numbers, be them either pseudo-random or quasi-random.
NumPy
has built in convenient functions for the
generation of pseudo-random numbers in the
NumPy
sub-module
random.
>>> from numpy.random import * >>> a=random(20) >>> a array([ 0.66064392, 0.4315458 , 0.70880114, 0.00276342, 0.83383503, 0.24952601, 0.04636591, 0.10729739, 0.19072693, 0.82089409, 0.29784537, 0.35496562, 0.546188 , 0.52711541, 0.07060185, 0.60602829, 0.91907393, 0.52241082, 0.07597062, 0.27253169]) >>> b=standard_normal((4,5)) >>> b array([[-0.59317286, 0.27533818, -0.46122351, -0.05138033, -1.8371135 ], [-1.15520074, 1.04980946, 0.31082909, 0.32662006, -0.36752163], [ 0.66452767, -0.88077193, 1.18253972, 0.16836824, -1.40541028], [ 0.01481426, -0.88137549, 0.74594197, -0.97360666, -0.77270426]]) >>> c=random((2,3,4)) >>> shape(c) (2, 3, 4) >>> c array([[[ 0.09864194, 0.76069475, 0.54398641, 0.73081207], [ 0.81036431, 0.24343805, 0.38178278, 0.9414989 ], [ 0.0533329 , 0.0346994 , 0.67048989, 0.99188034]], [[ 0.27786962, 0.87359556, 0.14993006, 0.20461863], [ 0.59543661, 0.24566182, 0.47176266, 0.3328179 ], [ 0.8340118 , 0.96561975, 0.17854239, 0.81699292]]])
>>>
Python for Finance
12 / 96
2d plotting (I)
module
2d Plotting
More often than not, one wants to visualze results from calculations or simulations. The
matplotlib
is quite powerful when it comes to 2d visualizations of any kind.
The most important types of graphics in general are lines, dots and bars.
>>> b=standard_normal((4,5)) >>> b array([[-0.57180547, -1.32783183, -0.27474264, 0.6301795 , 0.71101905], [ 0.29724602, 0.289595 , 0.1056877 , 0.06424294, -0.35708164], [ 0.25890926, 0.79000265, -0.47457278, 0.11719325, 0.39121246], [-0.24544426, 1.59194504, -1.6703606 , -0.00169267, -0.63803156]]) >>> from matplotlib.pyplot import * >>> plot(b) [<matplotlib.lines.Line2D object at 0x2b9e790>] >>> grid(True) >>> axis('tight') (0.0, 50.0, 0.0, 25.0) >>> show()
Python for Finance
13 / 96
2d plotting (II)
2d Plotting
The gure below shows the output. Notice that
dierent values eachwhich is due to the array size of
matplotlib produces 4 5.
ve lines with four
1.5 1.0 0.5 0.0 0.5 1.0 1.5 0.0
0.5
1.0
1.5
2.0
2.5
3.0
Figure: Example of gure with matplotlibhere: lines

Python for Finance
14 / 96
2d plotting (III)
2d Plotting
The next example combines a dot sub-plot with a bar sub-plot the result of which is shown in the next gure. Here, due to resizing of the array we have only a one-dimensional set of numbers.
>>> d=standard_normal((4,5)) >>> d=resize(d,20) >>> d array([ 0.12709036, -1.19800928, 0.22527268, 0.39149983, 0.19080228, 0.57113933, -1.07355946, 0.8428513 , -2.22197056, 1.58069866, 0.6992034 , -1.45520777, 0.42116251, -0.26856476, 1.09870092, 0.83489701, -2.34729449, -0.58642723, 0.34725616, -0.56177434]) >>> subplot(211) <matplotlib.axes.AxesSubplot object at 0x3585590> >>> plot(d,'ro') [<matplotlib.lines.Line2D object at 0x3565490>] >>> subplot(212) <matplotlib.axes.AxesSubplot object at 0x3585250> >>> bar(range(20),d) [<matplotlib.patches.Rectangle object at ...] >>> grid(True) >>> show()
Python for Finance
15 / 96
2d plotting (IV)
2.0 1.5 1.0 0.5 0.0 0.5 1.0 1.5 2.0 2.50 2.0 1.5 1.0 0.5 0.0 0.5 1.0 1.5 2.0 2.50
2d Plotting
10
15
20
10
15
20
Figure: Example of gure with matplotlibhere: dots & bars

Python for Finance
16 / 96
Let's implement some array manipulations with NumPy

1
Exercise: NumPy in Action
We generate two series with 50 standard normal (pseudo-)random numbers (array of size
(50, 2),
call it
rn)
Calculate the sum of the two 50 number vectorsonce vector-wise and once using the
sum
functionand store it in another array; call it
Calculate the cumulative sum (index by index) of called
rn_cum;
once using the
cumsum
rn_sum rn_sum and store
it in an array
function, once by iterating over all items in
the vector
4 5
Plot the
rn_cum
vector
Plot a histogram for each of the two 50 number vectors of subplots)
rn
into a single gure (as
Python for Finance
17 / 96
Regression (I)
conclusions.
Selected Financial Topics
Approximation
It is often the case in Finance that one has to approximate something to draw
Two important approximation techniques are regression and interpolation. The type of regression we consider rst is called ordinary least squares regression (OLS). In its most simple form, ordinary monomials desired function
x, x2 , x3 , ...
are used to approximate a
y = f (x) given (y1 , x1 ), (y2 , x2 ), ..., (yN , xN ).
a number
of obervations
Say we want to approximate
f (x)
with a polynomial of order 2
g (x) = a1 + a2 x + a3 x2
where the
ai
are regression parameters.
The task is then to

N a1 ,a2 ,a3
min
(yn g (xn ; a1 , a2 , a3 ))2

n
Python for Finance
18 / 96
Regression (II)
Approximation
As an example, we want to approximate the cosine function over the interval given 20 observations.
[0, /2]
The code (see next slide) is straightforward since NumPy has built-in functions polyfit and polyval. From polyfit you get the minimizing regression parameters back, while you can them with polyval to generate values based on these parameters. The result is shown in the next gure for three dierent regression functions.
use
Python for Finance
19 / 96
Regression (III)
Approximation
# # Ordinary Least Squares Regression # a_REG . py # from numpy import * from matplotlib . pyplot import * # Regression x = linspace (0.0 , pi /2 , 20 ) y = cos (x ) g1 = polyfit (x ,y ,0 ) g2 = polyfit (x ,y ,1 ) g3 = polyfit (x ,y ,2 ) g1y = polyval ( g1 , x) g2y = polyval ( g2 , x) g3y = polyval ( g3 , x) # Graphical Analysis plot (x ,y , ' y ' ) plot (x , g1y , ' rx ' ) plot (x , g2y , ' bo ' ) plot (x , g3y , ' g > ' )
Python for Finance
20 / 96
Regression (IV)
1.2 1.0 0.8 0.6 0.4 0.2 0.0 0.2 0.0
Approximation
0.2
0.4
0.6
0.8
1.0
1.2
1.4
1.6
Figure: Approximation of cosine function (line) with constant regression (red crosses), linear regression (blue dots) and quadratic regression (green triangles)
Python for Finance
21 / 96
Splines (I)
Approximation
The concept of interpolation is much more involved but nevertheless almost as straightforward in applications. The most common type of interpolation is with cubic splines for which you nd functions in the sub-module
scipy.interpolate.
The example remains the same and the code (see next slide) is as compact as before while the resultsee the respective gureseems almost perfect. Roughly speaking, cubic splines interpolation is (intelligent) regression between every two observation points with a polynomial of order 3. This is of course much more exible than a single regression with a polynomial of order 2. Two drawbacks in algorithmic terms are, however, that the observations have been ordered in the x-dimension. Furthermore, cubic splines are of limited or no use for higher dimensional problems where OLS regression is applicable as easy as in the two-dimensional world.
Python for Finance
22 / 96
Splines (II)
Approximation
# # Cubic Spline Interpolation # b_SPLINE . py # from numpy import * from scipy . interpolate import * from matplotlib . pyplot import * # Interpolation x = linspace (0.0 , pi /2 , 20 ) y = cos (x ) gp = splrep (x ,y ,k= 3) gy = splev (x ,gp , der =0 ) # Graphical Analysis plot (x ,y , ' b ' ) plot (x ,gy , ' ro ' )
Python for Finance
23 / 96
Splines (III)
1.2 1.0 0.8 0.6 0.4 0.2 0.0 0.0
Approximation
0.2
0.4
0.6
0.8
1.0
1.2
1.4
1.6
Figure: Approximation of cosine function (line) with cubic splines interpolation (red dots)
Python for Finance
24 / 96
Optimization Methods (I)
Optimization
Strictly speaking, regression and interpolation are two special forms of optimization (some kind of minimization). However, optimization techniques are needed much more often in science and nance. An important area is, for example, the calibration of derivatives model parameters to a given set of market-observed option prices or implied volatilities. The two major approaches are global and local optimization. While the rst looks for a global minimum or maximum of a function (which does not have to exist at all), the second looks for a local minimum or maximum.
Python for Finance
25 / 96
Optimization Methods (II)

function value of
Optimization
As an example, we take the sine function over the intervall
[, 0]
with a minimum
at
/2.
delivers respective functions via the sub-module
Again, the module
scipy
optimize.
# # Finding a Minimum # c_OPT . py # from numpy import * from scipy . optimize import * # Finding a Minimum def y(x ): if x <- pi or x > 0: return 0 .0 return sin (x ) gmin = brute (y ,(( - pi ,0 ,0. 01 ) ,) , finish = None ) lmin = fmin (y ,- 0.5 ) # Output print " Global Minimum is " , gmin print " Local Minimum is " , lmin
Python for Finance
26 / 96
Optimization Methods (III)

Both functions
Optimization
brute
(global brute force algorithm) and
fmin
(local convex
optimization algorithm) also work in multi-dimensional settings. In general, the solution of the local optimization is strongly dependent on the initialization. Here the
0.5
did quite well in reaching
/2
as the solution.
>>> ================================ RESTART ================================ >>> Optimization terminated successfully. Current function value: -1.000000 Iterations: 18 Function evaluations: 36 Global Minimum is -1.57159265359 Local Minimum is [-1.57080078] >>>
Python for Finance
27 / 96
Numerical Integration
Numerical Integration
It is not always possible to analytically integrate a given function. Then numerical integration often comes into play. We want to check numerical integration where we can do it analytically as well: 1 x e dx 0 1 0 The value of the integral is e e 1.7182818284590451. For numerical integration, again the function
scipy helps out with the sub-module integrate quad, implementing a numerical quadrature scheme:
which contains
# # Numerically Integrate a Function # d_INT . py # from numpy import * from scipy . integrate import * # Numerical Integration def f(x ): return exp (x ) Int = quad ( lambda u: f(u ),0 ,1 )[ 0] # Output print " Value of the Integral is " , Int
Python for Finance
28 / 96
The CRR model (I)
Case Study: Numerical Option Pricing
Binomial Option Pricing Model
To better understand how to implement the binomial option pricing model of Cox, Ross and Rubinstein (1979, henceforth: CRR), a little more background seems helpful. There are two securities traded in the model: a risky stock index and a risk-less zero-coupon bond. The time horizon
[0, T ]
is divided in equidistant time intervals
so that one gets
T /t + 1
points in time
t {0, t, 2 t, ..., T }. r,
the stock
The zero-coupon bond grows p.a. in value with the risk-less short rate
Bt = B0 ert
where
B0 > 0. S0
at
Starting from a strictly positive, xed stock index level of index evolves according to the law
t = 0,
St+t St m
where Here,
m 0
is selected randomly from {u, d}. < d < ert < u e t as well as
1 as a simplication which leads to d
a recombining tree.
Python for Finance
29 / 96
The CRR model (II)
Assuming risk-neutral valuation holds, the following relationship can be derived
St
= =
ert EQ t [St+t ] ert (quSt + (1 q )dSt )
Against this background, the risk-neutral (or martingale) probability is
q=
The value of a European call option payos
ert d ud
discounting the nal
C0 is then obtained by CT (ST , K ) max[ST K, 0] at t = T to t = 0: C0 = erT EQ 0 [C T ]
The discounting can be done step-by-step and node-by-node backwards starting at
t = T t .
Python for Finance
30 / 96
The CRR model (III)
From an algorithmical point of view, one has to rst generate the index level values, determines then the nal payos of the call option and nally discounts them back. This is what we now will do, starting with a somewhat `naive' implementation. But before we do it, we generate a
Python
module which contains all parameters
that we will need for dierent implementations afterwards. All parameters can be imported by using the module name is
import
lename without the sux `.py' (i.e. the lename is
a_Parameters).
a_Parameters.py
command and the respective and the
Python for Finance
31 / 96
The CRR model (IV)
# # Model Parameters for European Call Option and Binomial Models # a_Parameters . py # from math import exp , sqrt # Option Parameters s0 = 105 . 00 # Initial Index Level K = 100 . 00 # Strike Level T = 1. # Call Option Maturity r = 0 . 05 # Constant Short Rate vola = 0 . 25 # Constant Volatility of Diffusion # Time Parameters t = 3 # Time Intervals delta = T/ t # Length of Time Interval df = exp (-r * delta ) # Discount Factor # u d q Binomial Parameters = exp ( vola * sqrt ( delta )) = 1 /u = ( exp ( r* delta )-d )/( u -d ) # Up - Movement # Down - Movement # Martingale Probability
The next slide presents the rst version of the binomial model which uses Excel-like cell iterations extensively. We will see that there are ways to a more compact and faster implementation.
Python for Finance
32 / 96
Python Implementations
# # Valuation of European Call Option in CRR1979 Model # Naive Version (= Excel - like Iterations ) # b_CRR1979_Naive . py # from numpy import * from a_Parameters import * # Array Initialization for Index Levels s = zeros (( t+1 ,t+1) , ' float ' ) s[0 ,0] = s0 z = 0 for j in range (1 ,t+1 , 1 ): z = z+1 for i in range (z +1 ): s[i , j] = s[0 , 0 ]*( u ** j )*( d **( i*2 )) # Array Initialization for Inner Values iv = zeros (( t+1 ,t +1), ' float ' ) z = 0 for j in range (0 ,t+1 , 1 ): for i in range (z +1 ): iv [i ,j] = round ( max (s[i ,j]-K , 0),8) z = z+1 # Valuation pv = zeros (( t+1 ,t +1), ' float ' ) # Present Value Array pv [: , t] = iv [: , t] # Last Time Step Initial Values z = t+1 for j in range (t -1 ,-1 ,-1 ): z = z -1 for i in range (z ): pv [i ,j] = ( q* pv [i ,j+1 ]+( 1 -q )* pv [i+1 ,j+1 ])* df # Output print " Value of European call option is " , pv [0 ,0]
Python for Finance
33 / 96
Ouput of rst implementation

>>> Value of >>> s array([[ [ [ [ >>> iv array([[ [ [ [ >>> pv array([[ [ [ [ >>>
A run of the module gives the following output and arrays where one can follow the three steps easily (index levels, inner values, discounting):
European call option is 16.2929324488 105. 0. 0. 0. 5. 0. 0. 0. , 121.30377267, 140.13909775, , 90.88752771, 105. , , 0. , 78.67183517, , 0. , 0. , 161.89905958], 121.30377267], 90.88752771], 68.09798666]])
, 21.30377267, 40.13909775, 61.89905958], , 0. , 5. , 21.30377267], , 0. , 0. , 0. ], , 0. , 0. , 0. ]])
16.29293245, 26.59599847, 41.79195237, 61.89905958], 0. , 5.61452766, 10.93666406, 21.30377267], 0. , 0. , 0. , 0. ], 0. , 0. , 0. , 0. ]])
Python for Finance
34 / 96
Somewhat better implementation (I)

# # Valuation of European Call Option in CRR1979 Model # Advanced Version (= NumPy Iterations ) # c_CRR1979_Advanced . py # from numpy import * from a_Parameters import * # Array Initialization for Index Levels mu = arange (t+1) mu = resize (mu ,( t +1 ,t+1 )) md = transpose ( mu ) mu = u **( mu - md ) md = d ** md s = s0 * mu * md
Our alternative version makes more use of the capabilities of
NumPy
the consequence is
more compact code even if it is not so easy to read in a rst instance.
# Valuation pv = maximum (s -K , 0) Qu = zeros (( t+1 ,t +1), ' float ' ) Qu [: ,:] = q Qd = 1 - Qu z = 0 for i in range (t -1 ,-1 ,-1 ): pv [0:t -z ,i] = ( Qu [ 0:t -z , i ]* pv [ 0:t -z , i+1 ]+ Qd [0:t -z ,i ]* pv [1:t -z+1 ,i +1 ])* df z = z+1 # Output print " Value of European call option is " , pv [0 ,0]
Python for Finance
35 / 96
Somewhat better implementation (I)
The valuation result is, as expected, the same for the parameter denitions from before. However, three time intervals are of course not enough to come close to the Black-Scholes benchmark of 15.6547197268. With 1,000 time intervals, however, the algorithms come quite close to it:
>>> ================================ RESTART ================================ >>> Value of European call option is 15.6537846075 >>>
Python for Finance
36 / 96
Comparison
The major dierence between the two algorithms is execution time. The second implementation which avoids about 30 times faster than the rst one. You should make this a principle for your own coding eorts: whenever possible avoid necessary iterations in
Python
iterations as much as possible is
Python
and delegate them to
NumPy.
Apart from time savings, you generally also get more compact and readable code. A direct comparison illustrates this point:
# Naive Version --- Iterations in Python # # Array Initialization for Inner Values iv = zeros((t+1,t+1),'float') z = 0 for j in range(0,t+1,1): for i in range(z+1): iv[i,j] = max(s[i,j]-K,0) z = z+1 # Advanced Version --- Iterations with NumPy/C # pv = maximum(s-K,0)
Python for Finance
37 / 96
FFT version of the CRR model (I)

the binomial model.
To conclude this section, we apply the Fast Fourier Transform (FFT) algorithm to
Nowadays this numerical routine plays a central role in Derivatives Analytics and other areas of science. It is used regularly for plain vanilla option pricing in productive environments in investment banks or hedge funds. In general, however, it is not applied to a binomial model but the application in this case is straightforward and therefore a quick win for us. In this module (see next slide),
Python
iterations are all avoidedthis is possible
since for European options only the nal payos are relevant. The speed advantage of this algorithm is again considerable: it is 30 times faster than our advanced algorithm from before and 900 times faster than the naive version.
Python for Finance
38 / 96
FFT version of the CRR model (II)
# # Valuation of European Call Option in CRR1979 Model # FFT Version # d_CRR1979_FFT . py # from numpy import * from numpy . fft import fft , ifft from a_Parameters import * # Array Generation for Index Levels md = arange (t+ 1) mu = resize ( md [-1] ,t+ 1) mu = u **( mu - md ) md = d ** md s = s0 * mu * md # Valuation by FFT C_T = maximum (s -K ,0) Q = zeros (t+1 , ' float ' ) Q[ 0] = q Q[ 1] = 1 - q l = sqrt ( t+1 ) v1 = ifft ( C_T )* l v2 = ( sqrt ( t+1 )* fft (Q )/( l *( 1 +r* delta )))** t C_0 = fft ( v1 * v2 )/ l # Output print " Value of European call option is " , real ( C_0 [0 ])
Python for Finance
39 / 96
Monte Carlo Simulation (I)

option.
Monte Carlo Approach
Finally, we apply Monte Carlo simulation (MCS) to value the same European call
Here it is where pseudo-random numbers come into play. Similarly to the FFT algorithm we only care about the nal index level at simulate it with pseudo-random numbers. We get the following simple simulation algorithm:
and
consider the date of maturity T and write start iterating i = 1, 2, ..., I

T
ST = S0 e(r 2
)T +
T wT
(1)
T T
iterate until i = I sum up all inner values at T , take the average and discount back to t = 0:
T
draw a standard normally distributed pseudo-random number w (i) determine at T the index level S (i) by applying the number w (i) to equation (1) determine the inner value of the call at T as max[S (i) K, 0]
C0 (K, T ) erT 1 I max[ST (i) K, 0]
I
this is the MCS estimator for the European call option value
Python for Finance
40 / 96
Monte Carlo Simulation (II)

The
Although the word `iterating' sounds like looping over arrays we can again avoid array loops completely on the
Python/NumPy
Python
level.
implementation is really compactonly 5 lines of code for the
core algorithm With another 5 lines we can produce a histogram of the index levels at displayed in the respective gure.
as
# # Valuation of European Call Option via Monte Carlo Simulation # g_MCS . py # from numpy . random import * from matplotlib . pyplot import * from a_Parameters import * from numpy import * # Valuation via MCS paths = 500000 rand = standard_normal ( paths ) sT = s0 * exp (( r -0.5 * vola ** 2 )* T+ sqrt ( T )* vola * rand ) pv = sum ( maximum (sT -K ,0 )* exp (-r *T ))/ paths print " Value of European call option is " , pv # Graphical Analysis figure () hist ( sT , 100 ) xlabel ( ' index level at T ' ) ylabel ( ' frequency ' ) show ()
Python for Finance
41 / 96
Monte Carlo Simulation (III)

20000
15000
frequency
10000
5000
0 50
50
100 150 index level at T
200
250
Figure: Histogram of simulated stock index levels at T

Python for Finance
42 / 96
Monte Carlo Simulation (IV)
The algorithm produces quite an accurate estimate for the European call option value although the implementation is rather simplistic (i.e. there are, for example, no variance reduction techniques involved):
>>> ================================ RESTART ================================ >>> Value of European call option is 15.6306695905 >>>
Python for Finance
43 / 96
A fundamental class in pandas is the Series class (I)

The If A
Time Series Analysis with pandas
Series class
Series class is explicitly designed to handle indexed (time) series1 s is a Series object, s.index gives its index simple example is s=Series([1,2,3,4,5],index=['a','b','c','d','e'])
In [16]: s=Series([1,2,3,4,5],index=['a','b','c','d','e']) In [17]: s Out[17]: a 1 b 2 c 3 d 4 e 5 In [18]: s.index Out[18]: Index([a, b, c, d, e], dtype=object) In [19]: s.mean() Out[19]: 3.0 In [20]:
There are lots of useful methods in the
Series
class ...
The major pandas source is http://pandas.pydata.org

Python for Finance
44 / 96
A fundamental class in pandas is the Series class (II)

A major strength of dates and times An simple example using the management
Series class
pandas
is the handling of time series data, i.e. data indexed by
DateRange
function shall illustrate the time series
In [3]: x=standard_normal(250) In [4]: index=DateRange('01/01/2012',periods=len(x)) In [5]: s=Series(x,index=index) In [6]: s Out[6]: 2012-01-02 2012-01-03 2012-01-04 2012-01-05 ...
1.06959238875 0.794515407245 -1.01590534404 -0.751618588824
Python for Finance
45 / 96
Another fundamental class in pandas is the DataFrame class

This class's intellectual father is the language/package The
DataFrame class
data.frame
class from the statistical
indexed
DataFrame
class is explicitly designed to handle
multiple,
maybe
hierarchically
class,
(time) series
The following example illustrates some convenient features of the i.e. data alignment and handling of missing data
DataFrame
In [35]: s=Series(standard_normal(4),index=['1','2','3','5']) In [36]: t=Series(standard_normal(4),index=['1','2','3','4']) In [37]: df=DataFrame({'s':s,'t':t}) In [38]: df['SUM']=df['s']+df['t'] In [39]: print df.to_string() s t SUM 1 -0.125697 0.016357 -0.109340 2 0.135457 -0.907421 -0.771964 3 1.549149 -0.599659 0.949491 4 NaN 0.734753 NaN 5 -1.236310 NaN NaN In [40]: df['SUM'].mean() Out[40]: 0.022728863312009556
Python for Finance
46 / 96
The two main pandas classes have methods for easy plotting
The
Plotting with pandas
Series
and
DataFrame
classes have methods to easily generate plots
The two major methods are
plot
and
hist
Again, an example shall illustrate the usage of the methods
In [54]: index=DateRange(start='1/1/2013',periods=250) In [55]: x=standard_normal(250) In [56]: y=standard_normal(250) In [57]: df=DataFrame({'x':x,'y':y},index=index) In [58]: df.cumsum().plot() Out[58]: <matplotlib.axes.AxesSubplot at 0x3082c10> In [59]: df['x'].hist() Out[59]: <matplotlib.axes.AxesSubplot at 0x3468190> In [60]:
Python for Finance
47 / 96
The results of which can then be saved for further use
Plotting with pandas
Figure: Example plots with pandas
Python for Finance
48 / 96
To begin with, we want to use pandas with dummy data

1
Exercise: pandas in Action
We generate three series with 100 standard normal (pseudo-)random numbers (array of size
(100, 3)).
We use the array to ll a
DataFrame
object, generate an index starting on 01 Jan
2013 with 30-day steps and give the three series the names
3
['A','B','C']
We then generate a 4-th series with name sum of the three other series.
CUMSUM
which contains the cumulative
4 5 6
We plot the 4th series with the built-in method
We also generate a histogram for the 3rd series with We save the histogram as PDF le.
plot. matplotlib
and set
bins=20.
Python for Finance
49 / 96
Case Study: Analyzing Stock Quotes with pandas
and store it in a
pandas DataFrame
object
Series
data analysis:
calculate the daily log returns (use the
shift
method of the
object) and generate a new column with the log returns in the
pandas DataFrame object;
calculate 252 day rolling means and standard deviations of the log returns as well as their rolling correlations and generate respective columns
plotting:
plot the log returns together with the daily DAX quotes into a single gure;
plot in another gure the rolling means and the standard deviations of the log returns as well as their correlation
data storage:
HDFStore
G
save the
pandas DataFrame
to a
PyTables/HDF5
database (use the
function) read historical quotes of the DAX index (ticker
data gathering:
DAXI )beginningwith01January 2000f romfinance.yahoo.com
Python for Finance
50 / 96
1. Data Gathering
# # Analysis of Historical Stock Data # with pandas # RFE_Data . py # # (c) Visixion GmbH # Script for Illustration Purposes Only . # from pylab import * # 1. Data Gathering from pandas . io . data import * DAX = DataReader ( ' ^ GDAXI ' , ' yahoo ' , start = ' 01 / 01 / 2000 ' )
Python for Finance
51 / 96
2. Data Analysis (I)
# 2. Data Analysis from pandas import * DAX [ ' Ret ' ]= log ( DAX [ ' Close ' ]/ DAX [ ' Close ' ]. shift (1 )) DAX [ ' rMe ' ]= rolling_mean ( DAX [ ' Ret ' ], 252 )* 252 DAX [ ' rSD ' ]= rolling_std ( DAX [ ' Ret ' ], 252 )* sqrt ( 252 ) DAX [ ' Cor ' ]= rolling_corr ( DAX [ ' rMe ' ], DAX [ ' rSD ' ], 252 )
Python for Finance
52 / 96
2. Data Analysis (II)2
print DAX.ix[-20:].to_string() Open High Low Date 2012-05-31 6297.68 6322.69 6208.09 2012-06-01 6259.76 6259.76 6008.47 2012-06-04 5976.46 6030.81 5942.38 2012-06-05 5999.86 6011.56 5914.43 2012-06-06 6028.36 6102.42 5996.41 2012-06-07 6117.76 6230.22 6099.08 2012-06-08 6082.63 6144.76 6053.95 2012-06-11 6255.65 6287.54 6130.28 2012-06-12 6141.92 6211.14 6083.81 2012-06-13 6183.80 6221.36 6093.61 2012-06-14 6146.92 6167.49 6078.22 2012-06-15 6164.56 6251.59 6158.78 2012-06-18 6304.77 6316.14 6221.87 2012-06-19 6254.77 6375.27 6233.25 2012-06-20 6364.06 6402.21 6333.97 2012-06-21 6357.25 6427.49 6331.79 2012-06-22 6273.10 6318.06 6256.34 2012-06-25 6229.43 6229.43 6118.72 2012-06-26 6157.84 6165.28 6109.93 2012-06-27 6155.91 6230.51 6131.30
Close 6264.38 6050.29 5978.23 5969.40 6093.99 6144.22 6130.82 6141.05 6161.24 6152.49 6138.61 6229.41 6248.20 6363.36 6392.13 6343.13 6263.25 6132.39 6136.69 6228.99
Volume Adj Close 33014600 42856100 23699300 22355900 32200300 28859800 22742300 29749700 28227200 28021500 29461700 70434200 28946700 25250900 22461300 30737700 25903100 25886800 25550800 25213500 6264.38 6050.29 5978.23 5969.40 6093.99 6144.22 6130.82 6141.05 6161.24 6152.49 6138.61 6229.41 6248.20 6363.36 6392.13 6343.13 6263.25 6132.39 6136.69 6228.99
Ret -0.002618 -0.034773 -0.011982 -0.001478 0.020657 0.008209 -0.002183 0.001667 0.003282 -0.001421 -0.002259 0.014683 0.003012 0.018263 0.004511 -0.007695 -0.012673 -0.021115 0.000701 0.014929
rMe -0.125673 -0.154371 -0.180338 -0.169200 -0.150697 -0.159234 -0.148888 -0.146535 -0.150797 -0.150285 -0.171289 -0.155601 -0.134741 -0.112545 -0.106139 -0.122593 -0.152372 -0.184679 -0.189818 -0.178054
rSD 0.302519 0.304405 0.304262 0.304029 0.304764 0.304395 0.304165 0.304173 0.304089 0.304087 0.303470 0.303859 0.303387 0.303949 0.303985 0.303932 0.303660 0.304118 0.304050 0.304430
Cor -0.698396 -0.695765 -0.692978 -0.690525 -0.687612 -0.684112 -0.680271 -0.676090 -0.671883 -0.666613 -0.661152 -0.654909 -0.648260 -0.640783 -0.633014 -0.625235 -0.617441 -0.609470 -0.600965 -0.592066
Quelle: http://finance.yahoo.com, 24. June 2012

Python for Finance
53 / 96
3. Plotting (I)
# 3. Plotting figure () subplot ( 211 ) DAX [ ' Close ' ]. plot () ylabel ( ' Index Level ' ) subplot ( 212 ) DAX [ ' Ret ' ]. plot () ylabel ( ' Log Returns ' ) DAX [[ ' rMe ' , ' rSD ' , ' Cor ' ]]. plot ()
Python for Finance
54 / 96
3. Plotting (II)3
Index Level
3
9000 8000 7000 6000 5000 4000 3000 2000 001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 0.15 2 0.10 0.05 0.00 0.05 0.10 2 2 1 1 5 7 3 9 8 0 6 4 200 200 200 200 200 200 200 200 200 201 201 201

Log Returns
Python for Finance
55 / 96
3. Plotting (III)4
1.0 0.5
Cor rMe rSD
0.0
0.5
1.0 2 2 1 1 5 7 3 9 8 0 6 4 200 200 200 200 200 200 200 200 200 201 201 201

Python for Finance
56 / 96
4. Data Storage (in HDF5 format)
# 4. Data Storage h5file = HDFStore ( ' DAX . h5 ' ) h5file [ ' DAX ' ]= DAX h5file . close ()
Python for Finance
57 / 96
The whole Python script
# # Analysis of Historical Stock Data with pandas # RFE_Data . py # # (c) Visixion GmbH # Script for Illustration Purposes Only . # from pylab import * # 1. Data Gathering from pandas . io . data import * DAX = DataReader ( ' ^ GDAXI ' , ' yahoo ' , start = ' 01 / 01 / 2000 ' ) # 2. Data Analysis from pandas import * DAX [ ' Ret ' ]= log ( DAX [ ' Close ' ]/ DAX [ ' Close ' ]. shift ( 1 )) DAX [ ' rMe ' ]= rolling_mean ( DAX [ ' Ret ' ], 252 )* 252 DAX [ ' rSD ' ]= rolling_std ( DAX [ ' Ret ' ], 252 )* sqrt ( 252 ) DAX [ ' Cor ' ]= rolling_corr ( DAX [ ' rMe ' ], DAX [ ' rSD ' ], 252 ) # 3. Plotting figure () subplot ( 211 ); DAX [ ' Close ' ]. plot (); ylabel ( ' Stock Price ' ) subplot ( 212 ); DAX [ ' Ret ' ]. plot (); ylabel ( ' Log Returns ' ) DAX [[ ' rMe ' , ' rSD ' , ' Cor ' ]]. plot () # 4. Data Storage h5file = HDFStore ( ' DAX . h5 ' ) h5file [ ' DAX ' ]= DAX h5file . close ()
Python for Finance
58 / 96
PyTables
A Pythonic database5
Fast Data Mining with PyTables
PyTables is a package for managing hierarchical datasets and designed to eciently and easily cope with extremely large amounts of data. PyTables is built on top of the HDF5 library, using the Python language and the NumPy package. It features an object-oriented interface that, combined with C extensions for the performance-critical parts of the code ..., makes it a fast, yet extremely easy to use tool for interactively browse, process and search very large amounts of data. One important feature of PyTables is that it optimizes memory and disk resources so that data takes much less space (specially if on-ight compression is used) than other solutions such as relational or object oriented databases. One characteristic that sets PyTables apart from similar tools is its capability to perform extremely fast queries on your tables in order to facilitate as much as possible your main goal: get important information *out* of your datasets.
All quotes from www.pytables.org

Python for Finance
59 / 96
Some of the most important PyTables functions/methods
openFile: create new le or open existing le, like in h5=openFile('data.h5','w'); 'r'=read only, 'a'=read/write .close(): close database, like in h5.close() h5.createGroup: create a new group, as in group=h5.createGroup(root,'Name') IsDescription: class for column descriptions of tables, used as in: class Row(IsDescription): name = StringCol(20,pos=1) data = FloatCol(pos=2) h5.createTable: create new table, as in tab=h5.createTable(group,'Name',Row) tab.iterrows(): iterate over table rows tab.where('condition'): SQL-like queries with exible conditions tab.row: return current/last row of table, used as in r=tab.row row.append(): append row to table, as in r.append() tab.flush(): ush table buer to disk/le h5.createArray: create an array, as in arr=h5.createArray(group,'Name',zeros((10,5))
Python for Finance
60 / 96
Let's start with a simple example (I)

In [59]: from tables import * In [60]: h5=openFile('Test_Data.h5','w') In [61]: class Row(IsDescription): ....: number = FloatCol(pos=1) ....: sqrt = FloatCol(pos=2) ....:
Introductory PyTables Example
In [62]: tab=h5.createTable(h5.root,'Numbers',Row) In [63]: tab Out[63]: /Numbers (Table(0,)) '' description := { "number": Float64Col(shape=(), dflt=0.0, pos=0), "sqrt": Float64Col(shape=(), dflt=0.0, pos=1)} byteorder := 'little' chunkshape := (512,) In [64]: r=tab.row In [65]: for x in range(1000): ....: r['number']=x ....: r['sqrt']=sqrt(x) ....: r.append() ....:
Python for Finance
61 / 96
Let's start with a simple example (II)
In [66]: tab Out[66]: /Numbers (Table(0,)) '' description := { "number": Float64Col(shape=(), dflt=0.0, pos=0), "sqrt": Float64Col(shape=(), dflt=0.0, pos=1)} byteorder := 'little' chunkshape := (512,) In [67]: tab.flush() In [68]: tab Out[68]: /Numbers (Table(1000,)) '' description := { "number": Float64Col(shape=(), dflt=0.0, pos=0), "sqrt": Float64Col(shape=(), dflt=0.0, pos=1)} byteorder := 'little' chunkshape := (512,) In [69]: tab[:5] Out[69]: array([(0.0, 0.0), (1.0, 1.0), (2.0, 1.4142135623730951), (3.0, 1.7320508075688772), (4.0, 2.0)], dtype=[('number', '<f8'), ('sqrt', '<f8')]) In [70]:
Python for Finance
62 / 96
Let's start with a simple example (III)

In [7]: h5=openFile('Test_Data.h5','a')
In [8]: h5 Out[8]: File(filename=Test_Data.h5, title='', mode='a', rootUEP='/', filters=Filters(complevel=0, shuffle=False, fletcher32=False)) / (RootGroup) '' /Numbers (Table(1000,)) '' description := { "number": Float64Col(shape=(), dflt=0.0, pos=0), "sqrt": Float64Col(shape=(), dflt=0.0, pos=1)} byteorder := 'little' chunkshape := (512,) In [9]: tab=h5.root.Numbers In [10]: tab[:5]['sqrt'] Out[10]: array([ 0. , 1. , 1.41421356, 1.73205081, 2. ])
In [11]: from pylab import * In [12]: plot(tab[:]['sqrt']) Out[12]: [<matplotlib.lines.Line2D at 0x7fe65cf12d10>] In [13]: show()
Python for Finance
63 / 96
We now want to implement another PyTables database

1 2
Exercise: PyTables with pandas
create a le: rst create a PyTables database le row description: by sub-classing from IsDescription
Row
containing a
name,
number
and the
3 4
create group: create a group with name create table: in the group Tables create
sub-class
square Tables
dene a table row with name
of that number
Row
a table with name
Numbers
using the
populate table:
square
by iteration generate 100,000 rows in the table
in the i-th row should be
i-th number,
the
number
Numbers;
the name
is a random number and the
is the square of the random number determine the mean, meadian and standard deviation of both the
data analysis:
number column and the square colum; regress the square column against the number column using polyfit and polyval visualization: generate a histogram of the square column and plot the cumulative sum of the number column array: create a group named Arrays, create an array of size (1000,1000) in it,
polulate the array with random numbers, double each random number, save the array and close the le
Python for Finance
64 / 96
And want to work with the data stored in that database

open the le: rst open the previous PyTables database le open the table: open the table with name Numbers write a function: write a function that takes as input the starting
ending row and print each row in between in the form
4 5 6
Exercise: PyTables with pandas
1 2 3
transfer data: transfer the data stored in the table plot the data: plot the data in the DataFrame object close the le: after completion, close the le
name | number | square into a pandas DataFrame object

in separate subplots
row and the
Python for Finance
65 / 96
Background of this case study

1
Case Study: Simulation Research Project
Python
such as
lends itself pretty well to implement numerical research projects in elds
physics engineering nance ... storing results automatically to a PyTables database or automatically generating Latex table output
Frequently, the results of such a project are to be documented and published in the form of a Latex document This case study is about potential approaches to automize numerical research with Python, e.g. by
The example project is taken from nance: we compare valuation results for European call options from Monte Carlo simulations with their analytical values
Python for Finance
66 / 96
Model EconomyBlack-Scholes-Merton continuous time

we consider a model economy with nal date
The Financial Model
T, 0 < T < {, F , F, P }
uncertainty is represented by a ltered probability space for
0tT
the risk-neutral index dynamics are given by the SDE
dSt = rdt + dZt St St

index level at date
(2)
with
0 < S0 < , r F,
i.e.
constant risk-less short rate,
constant
volatility of the index and the process
Zt
standard Brownian motion
generates the ltration
Ft F (S0st )
a risk-less zero-coupon bond satises the DE
dBt = rdt Bt
the time
(3)
t 0t<T
value of a zero-coupon bond paying one unit of currency at r (T t) is Bt (T ) = e
with
Python for Finance
67 / 96
Model EconomyBlack-Scholes-Merton discrete time

to simulate the nancial model, i.e. to generate numerical values for (2) has to be discretized to this end, divide the given time interval that now
The Financial Model
St ,
the SDE
t {0, t, 2t, ..., T },
i.e. there are
[0, T ] in equidistant sub-intervals t such M + 1 points in time with M T /t
a discrete version of the continuous time market model (2)(3) is
St Stt Bt Btt
for
= =
r 2
t+
tzt
(4)
ert zt
(5)
t {t, ..., T }
and standard normally distributed
this scheme is an Euler discretization which is known to be exact for the geometric Brownian motion (2)
Python for Finance
68 / 96
European Optionsrisk-neutral valuation

attainable/redundant and
The Financial Model
by the Fundamental Theorem of Asset Pricing, the time
value of an
FT measurable
contingent claim
VT hT (ST ) 0
(satisfying suitable integrability conditions) is given by arbitrage as
Vt = EQ t (Bt (T )VT )
with
V0 = EQ 0 (B0 (T )VT )
as the important special case for valuation purposes

6
denotes the expectation operator
and
the unique risk-neutral probability
measure equivalent to the real world measure
P Q
follows
the Black-Scholes-Merton (BSM) model (2)(3) is known to be complete from which uniqueness of the risk-neutral measure the dening characteristic of martingale in the BSM model, the present value of a call option is given by
is that it makes the discounted index level process a
V0 = erT EQ 0 (max[ST K, 0])
6 E () t
is short for the conditional expectation E(|F )

t
Python for Finance
69 / 96
European OptionsMonte Carlo valuation

assume risk-neutrality holds simulate
The Financial Model
I index level paths with M + 1 St,i , t {0, ..., T }, i {1, ..., I } t=T t=0
the option value is calculate the MCS estimator
points in time leading to index level values
for for
VT,i = hT (ST,i )
by arbitrage
0M CS = erT 1 V I
VT,i
i=1
(6)
Python for Finance
70 / 96
European OptionsAnalytical value

The time
The Financial Model
analytical value of a European call option in the BSM model is
C (S, t = 0) = S0 N(d1 ) KerT N(d2 )

with
(7)
N(d) d1 d2
= = =
1 2 log log
S0 K
e 2 x dx
2 )T 2 2 )T 2
+ (r + T + (r T
S0 K
Python for Finance
71 / 96
A Monte Carlo simulation study

on a stock index
The Financial Model
We set up a Monte Carlo simulation study for the valuation of European call options
We want to evaluate the impact of dierent simulation congurations on the accuracy of the MCS estimator (6) As benchmark we have available the analytical option value from formula (7) As model parameters we chose:
S0 = 100, = 0.25, r = 0.05, K {90, 100, 110}, T {1/12, 1/2, 1}

for a total of 9 option values As variance reduction techniques, we introduce moment matching and antithetic variates We assume for the number of time intervals
M {25, 50}
and the number of paths
I {25000, 50000}
All in all, we get 16 dierent congurations for the simulation set-up We say that a valuation is accurate if the valuation error is smaller than 1 percent or smaller than 1 cent
Python for Finance
72 / 96
Monte Carlo SimulationSimulation parameters
Python and PyTables Implementation
# General Simulation Parameters write = True cL =[( False , False ) ,( False , True ) ,( True , False ) ,( True , True )] # 1st = moMatch -- Random Number Correction ( std + mean + drift ) # 2nd = antiPaths -- Antithetic Paths for Variance Reduction mL =[ 25 , 50 ] # Time Steps iL =[ 25000 , 50000 ] # Number of Paths per Valuation SEED = 100000 # Seed Value R= 10 # Number of Simulation Runs PY1 = 0. 010 # Performance Yardstick 1 : Abs . Error in Currency Units PY2 = 0. 010 # Performance Yardstick 2 : Rel . Error in Decimals tL =[ 1 .0/ 12 , 1.0/2 ,1 .0] # Maturity List kL =[ 90 , 100 , 110 ] # Strike List
for c in cL : # Variance Reduction Techniques moMatch , antiPaths = c for M in mL : # Number of Time Steps for I in iL : # Number of Paths ... # Name of the Simulation Setup name =( ' Call_ ' + str ( R )+ ' _ ' + str (M )+ ' _ ' + str (I / 1000 )+ ' _ ' + str ( moMatch )[ 0 ]+ str ( antiPaths )[ 0 ]+ ' _ ' + str ( PY1 * 100 )+ ' _ ' + str ( PY2 * 100 )) seed ( SEED ) # RNG seed value for i in range (R ): # Simulation Runs ... for T in tL : # Times -to - Maturity ... for K in kL : # Strikes
Python for Finance
73 / 96
Monte Carlo SimulationVariance reduction

Matching of 1. and 2. moment of random numbers:
# Function for Random Numbers def RNG (M ,I ): if antiPaths == True : randh = standard_normal (( M +1 , I/2 )) rand = concatenate (( randh ,- randh ), 1) else : rand = standard_normal (( M+1 ,I )) if moMatch == True : rand = rand / std ( rand ) rand = rand - mean ( rand ) return rand
Matching of 1. moment for index level dynamics:
# Function for BSM Index Process def eulerSLog ( S0 , vol , r ): ran = RNG (M ,I ) sdt = sqrt ( dt ) S= zeros (( M+1 ,I), ' d ' ) S[0 ,:]= log ( S0 ) for t in range (1 , M+1 ,1 ): S[t ,:]+= S [t -1 ,:] S[t ,:]+=( r - vol ** 2/2 )* dt S[t ,:]+= vol * ran [t ]* sdt if moMatch == True : S[t ,:] -= mean ( vol * ran [t ]* sdt ) return exp (S )
Python for Finance
74 / 96
Monte Carlo SimulationEvaluating accuracy
for c in cL : # Variance Reduction Techniques moMatch , antiPaths = c for M in mL : # Number of Time Steps for I in iL : # Number of Paths ... seed ( SEED ) # RNG seed value for i in range (R ): # Simulation Runs ... for T in tL : # Times -to - Maturity ... for K in kL : # Strikes h= maximum (S[ -1] -K ,0 ) # Inner Value Vector ## MCS Estimator V0_MCS = exp (- r*T )* sum (h )/ I ## BSM Analytical Value V0 = BSM_Call (S0 ,K ,T ,r , vol , 0) ## Errors diff = V0_MCS - V0 rdiff = diff / V0 absError . append ( diff ) relError . append ( rdiff * 100 ) ... if abs ( diff )< PY1 or abs ( diff )/ V0 < PY2 : print " Accuracy ok !\ n"+ br CORR = True else : print " Accuracy NOT ok !\ n" + br CORR = False ; errors = errors +1
Python for Finance
75 / 96
PyTables
Table records by sub-classing
# # Creating a Database for Simulation Results # (c) Visixion GmbH - Y. Hilpisch # Script for illustration purposes only # from tables import * from numpy import * # Record to store set of simulation results class SimResult ( IsDescription ): id_number = Int32Col ( pos =1 ) sim_name = StringCol ( 32 , pos =2) seed = Int32Col ( pos =3 ) runs = Int32Col ( pos =4 ) time_steps = Int32Col ( pos =5 ) paths = Int32Col ( pos = 6) ... # Record to store single simulation result class ValResult ( IsDescription ): id_number = Int32Col ( pos = 1) sim_name = StringCol (32 , pos =2) opt_T = Float32Col ( pos = 3) opt_K = Float32Col ( pos = 4) euro_ana = Float32Col ( pos = 5) euro_mcs = Float32Col ( pos = 6) correct = StringCol (8 , pos =7 ) val_err_abs = Float32Col ( pos =8 ) ...
Python for Finance
76 / 96
PyTables
Generating hierarchical database
# Generate new hdf5 file for results storage filename = " MCS_Results_Comp . h5 " def CreateFile ( filename ): h5file = openFile ( filename , mode = "w" , title = " BSM_MCS_Results " ) ## Open / Generate hdf5 file in " write " mode group = h5file . createGroup ( "/" , ' results ' , ' Results ' ) ## Create a group called " Results " h5file . createTable ( group , ' Sim_Results ' , SimResult , " Simulation Results " ) ## In the group " Results ": ## Create a table called " Simulation Results " with Record " SimResult " h5file . createTable ( group , ' Val_Results ' , ValResult , " Valuation Results ") ## Create a table called " Valuation Results " with Record " ValResult " h5file . close ()
Python for Finance
77 / 96
PyTables
Write results to table
# Fill the table with simulation results def ResWrite ( name , SEED ,R ,M ,I , moMatch , antiPaths ,l , atol , rtol , errors , absError , relError ,t1 , t2 , d1 , d2 ): h5file = openFile ( filename , mode = "a" ) table = h5file . root . results . Sim_Results simres = table . row idn = 1 if len ( table )> 0: for x in table . iterrows (): idn = max ( idn ,x[ ' id_number ' ]); idn = idn + 1 simres [ ' id_number ' ] = idn simres [ ' sim_name ' ] = name simres [ ' seed ' ] = SEED simres [ ' runs ' ] = R simres [ ' time_steps ' ] = M simres [ ' paths ' ] = I simres [ ' mo_match ' ] = moMatch simres [ ' anti_paths ' ] = antiPaths simres [ ' opt_prices ' ] = l simres [ ' abs_tol ' ] = atol simres [ ' rel_tol ' ] = rtol simres [ ' errors ' ] = errors simres [ ' error_ratio ' ]= float ( errors )/ l simres [ ' av_val_err ' ] = sum ( array ( absError ))/ l ... simres [ ' time_opt ' ] = (t2 - t1 )/ l simres [ ' start_date ' ] = str ( d1 ) simres [ ' end_date ' ] = str ( d2 ) simres . append () table . flush () h5file . close ()
Python for Finance
78 / 96
PyTables
Writing results to database during script execution
... from MCS_Results_PyTables import * ... write = True ... for c in cL : # Variance Reduction Techniques moMatch , antiPaths = c for M in mL : # Number of Time Steps for I in iL : # Number of Paths if write == True : h5file = openFile ( filename , mode = ' a ' ) ... if write == True : ValWrite ( h5file , name ,T ,K ,V0 , V0_MCS , str ( CORR ) , M ,I , str ( moMatch ), str ( antiPaths ), datetime . now ()) ... if write == True : h5file . close () ... if write == True : ResWrite ( name , SEED ,R ,M ,I , str ( moMatch ) , str ( antiPaths ),l , PY1 , PY2 , errors , absError , relError ,t1 ,t2 ,d1 , d2 )
Python for Finance
79 / 96
PyTables
Print results in Latex table form
# Print simulation results ( Latex table output ) def PrintTex ( filename = filename , idl =0 , idh = 50 ): h5file = openFile ( filename , mode ="r ") table = h5file . root . results . Sim_Results for simres in table . where ( ''' idl <= id_number <= idh ''' ): print ( str ( simres [ ' runs ' ])+ ' & ' + str ( simres [ ' time_steps ' ])+ ' & ' + str ( simres [ ' paths ' ])+ ' & ' + str ( simres [ ' mo_match ' ])+ ' & ' + str ( simres [ ' anti_paths ' ])+ ' & ' + ' %. 3f ' % simres [ ' abs_tol ' ]+ ' & ' + ' %. 3f ' % simres [ ' rel_tol ' ]+ ' & ' + str ( simres [ ' opt_prices ' ])+ ' & ' + str ( simres [ ' errors ' ])+ ' & ' + ' %. 3f ' % simres [ ' av_val_err ' ]+ ' & ' + ' %. 3f ' % simres [ ' time_opt ' ]+ " \\ tn ") h5file . close ()
Python for Finance
80 / 96
Putting it all togetherPress F5 and type PrintTex()

>>> PrintTex() 10 & 25 & 25000 10 & 25 & 50000 10 & 50 & 25000 10 & 50 & 50000 10 & 25 & 25000 10 & 25 & 50000 10 & 50 & 25000 10 & 50 & 50000 10 & 25 & 25000 10 & 25 & 50000 10 & 50 & 25000 10 & 50 & 50000 10 & 25 & 25000 10 & 25 & 50000 10 & 50 & 25000 10 & 50 & 50000 & & & & & & & & & & & & & & & &
False & False & 0.010 & 0.010 & 90 & 22 & 0.013 & 0.059 \tn False & False & 0.010 & 0.010 & 90 & 9 & -0.010 & 0.093 \tn False & False & 0.010 & 0.010 & 90 & 8 & -0.005 & 0.088 \tn False & False & 0.010 & 0.010 & 90 & 6 & -0.017 & 0.152 \tn False & True & 0.010 & 0.010 & 90 & 10 & 0.002 & 0.051 \tn False & True & 0.010 & 0.010 & 90 & 5 & 0.008 & 0.081 \tn False & True & 0.010 & 0.010 & 90 & 12 & -0.005 & 0.078 \tn False & True & 0.010 & 0.010 & 90 & 1 & -0.008 & 0.143 \tn True & False & 0.010 & 0.010 & 90 & 5 & 0.010 & 0.066 \tn True & False & 0.010 & 0.010 & 90 & 2 & -0.006 & 0.108 \tn True & False & 0.010 & 0.010 & 90 & 3 & -0.008 & 0.104 \tn True & False & 0.010 & 0.010 & 90 & 5 & -0.007 & 0.188 \tn True & True & 0.010 & 0.010 & 90 & 13 & 0.001 & 0.061 \tn True & True & 0.010 & 0.010 & 90 & 3 & 0.009 & 0.101 \tn True & True & 0.010 & 0.010 & 90 & 11 & -0.004 & 0.100 \tn True & True & 0.010 & 0.010 & 90 & 1 & -0.006 & 0.197 \tn
Python for Finance
81 / 96
ResultsCopy & paste it to your Latex document
\newcommand{\tn}{\tabularnewline} \def\TMP{\begin{center}\begin{tabular}{c c c c c r r r r r r} \hline\hline $R$ & $M$ & $I$ & \textbf{MM} & \textbf{AP}& \textbf{ATol}& \textbf{RTol}& \textbf{\#Op} & \textbf{Err} & \textbf{AvEr} & \textbf{Sec/O.} \tn [0.5ex] % inserts table heading \hline % inserts single horizontal line 10 & 25 & 25000 & False & False & 0.010 & 0.010 & 90 & 22 & 0.013 & 0.059 \tn 10 & 25 & 50000 & False & False & 0.010 & 0.010 & 90 & 9 & -0.010 & 0.093 \tn 10 & 50 & 25000 & False & False & 0.010 & 0.010 & 90 & 8 & -0.005 & 0.088 \tn 10 & 50 & 50000 & False & False & 0.010 & 0.010 & 90 & 6 & -0.017 & 0.152 \tn 10 & 25 & 25000 & False & True & 0.010 & 0.010 & 90 & 10 & 0.002 & 0.051 \tn 10 & 25 & 50000 & False & True & 0.010 & 0.010 & 90 & 5 & 0.008 & 0.081 \tn 10 & 50 & 25000 & False & True & 0.010 & 0.010 & 90 & 12 & -0.005 & 0.078 \tn 10 & 50 & 50000 & False & True & 0.010 & 0.010 & 90 & 1 & -0.008 & 0.143 \tn 10 & 25 & 25000 & True & False & 0.010 & 0.010 & 90 & 5 & 0.010 & 0.066 \tn 10 & 25 & 50000 & True & False & 0.010 & 0.010 & 90 & 2 & -0.006 & 0.108 \tn 10 & 50 & 25000 & True & False & 0.010 & 0.010 & 90 & 3 & -0.008 & 0.104 \tn 10 & 50 & 50000 & True & False & 0.010 & 0.010 & 90 & 5 & -0.007 & 0.188 \tn 10 & 25 & 25000 & True & True & 0.010 & 0.010 & 90 & 13 & 0.001 & 0.061 \tn 10 & 25 & 50000 & True & True & 0.010 & 0.010 & 90 & 3 & 0.009 & 0.101 \tn 10 & 50 & 25000 & True & True & 0.010 & 0.010 & 90 & 11 & -0.004 & 0.100 \tn 10 & 50 & 50000 & True & True & 0.010 & 0.010 & 90 & 1 & -0.006 & 0.197 \tn \hline \hline %inserts double line \end{tabular}\end{center}} \newdimen\TMPsize\settowidth{\TMPsize}{\TMP} \begin{table}[h]\begin{center}\begin{minipage}{\TMPsize} \footnotesize{\caption[Results]{\label{tab:RESULTS_1}Simulation results ...}}} \vspace{0.3ex} % title of Table \TMP \end{minipage}\end{center}\end{table} $R$ = number of runs, $M$ = number of time intervals, ...
Python for Finance
82 / 96
Resultsthe nal product: Latex table

1 2
Table: Simulation results for dierent congurations of the MCS algorithm and an accuracy level of P Y = 0.01 and P Y = 0.01.
R M I MM AP ATol RTol #Op Err AvEr Sec/O. 10 25 25000 False False 0.010 0.010 90 22 0.013 0.059 10 25 50000 False False 0.010 0.010 90 9 -0.010 0.093 10 50 25000 False False 0.010 0.010 90 8 -0.005 0.088 10 50 50000 False False 0.010 0.010 90 6 -0.017 0.152 10 25 25000 False True 0.010 0.010 90 10 0.002 0.051 10 25 50000 False True 0.010 0.010 90 5 0.008 0.081 10 50 25000 False True 0.010 0.010 90 12 -0.005 0.078 10 50 50000 False True 0.010 0.010 90 1 -0.008 0.143 10 25 25000 True False 0.010 0.010 90 5 0.010 0.066 10 25 50000 True False 0.010 0.010 90 2 -0.006 0.108 10 50 25000 True False 0.010 0.010 90 3 -0.008 0.104 10 50 50000 True False 0.010 0.010 90 5 -0.007 0.188 10 25 25000 True True 0.010 0.010 90 13 0.001 0.061 10 25 50000 True True 0.010 0.010 90 3 0.009 0.101 10 50 25000 True True 0.010 0.010 90 11 -0.004 0.100 10 50 50000 True True 0.010 0.010 90 1 -0.006 0.197 R = number of runs, M = number of time intervals, I = number of simulation paths, CV = control variates, MM = moment matching, AP = antithetic paths, ATol = absolute performance yardstick, RTol = relative performance yardstick, #Op = number of options, Err = number of errors, AvEr = average error in currency units, Sec/O. = seconds per option valuation.
Python for Finance
83 / 96
Summary of the case study's insights

1 2 3
Python
is powerful when it comes to setting up numerical research projects
frequently, results of such projects are to be reported in the form of Latex documents using an example from mathematical nance, we show how to automate report generation through some simple methods:
iterating over several lists writing simulation results to a PyTables database reading results from database and printing strings in Latex format
once you have set up the numerical study, you only have three steps to generate a Latex table with the results
start your script by pressing F5 type PrintTex() copy the output to your Latex document
Python for Finance
84 / 96
Cython
: introducing static typing for speed-ups
Speeding-up Code with Cython
Fundamentals about Cython
As any other programming language with dynamic typing,
Python
suers from a
lack of speed in comparison with static typing languages as C or C++. Fortunately, the primary
Python
execution environment is written in C, so it is
possible to access external modules written in C. But it is not trivial to write the necessary C glue code. This problem is solved by
Python
CythonCython Cython
is a programming language based on
with extra syntax allowing for optional static type declarations.
The source code gets translated into optimized C/C++ code and compiled as
Python
extension modules, so
combines the best best of two worlds: the
very fast program execution of C and the high-level, object-oriented and fast programming of
Python.
Python for Finance
85 / 96
How to work with Cython
Fundamentals about Cython
Write your
Python
and/or
Translate the code by easiest way).
Cython code and using Distutils (the Python
save it with le extension most exible way) or
.pyx. pyximport
(the
Import the module just like a
module.
Python for Finance
86 / 96
Cython
Examples (I)
Example Code for Cython Use
Consider the integral
2 f (x)dx with f (x) x2 x. To approximate its value by its 0 lower sum we can use the following Python functions, saved in a module called a_Integrate_Norm.py, say.
# # Integration with Pure Python # a_Integrate_Norm . py # import time def f(x ): return x **2 - x def integrate (a ,b , N ): t0 = time . time () s= 0 dx =( b -a )/ N for i in range (N ): s += f(a +i* dx ) t1 = time . time () print " Approximate value is %s " %( s* dx ) print " Computation made in : %. 2f seconds " %( t1 - t0 )
Python for Finance
87 / 96
Cython
Examples (II)
Now, we apply this code in the shell.
In [76]: from a_Integrate_Norm import * In [77]: integrate(0,2.0,10000) Approximate value is 0.66646668 Computation made in : 0.01 seconds In [78]: integrate(0,2.0,100000) Approximate value is 0.6666466668 Computation made in : 0.05 seconds In [79]: integrate(0,2.0,1000000) Approximate value is 0.666664666668 Computation made in : 0.40 seconds In [80]: integrate(0,2.0,10000000) Approximate value is 0.666666466667 Computation made in : 3.83 seconds
Python for Finance
88 / 96
Cython
Examples (III)
Rename the module from
to the shell. To import and compile
a_Integrate_Norm.py to b_Integrate_Norm.pyx Cython modules just type
and switch
In [81]: import pyximport In [82]: pyximport.install()
After that, you can import and execute the functions as usual.
In [83]: from b_Integrate_Norm import * In [84]: integrate(0,2.0,1000) Approx. value is 0.664668 Computation made in : 0.00 seconds In [85]: integrate(0,2.0,10000000) Approx. value is 0.666666466667 Computation made in : 1.17 seconds In [86]:
Only the precompiling of the pure
Python
code delivers a speed-up of nearly 70%.
Python for Finance
89 / 96
Cython
Examples (IV)
Next, we apply static typing to the code.
# # Integration with Static Typing in Cython # c_Integrate_with_static_typing . pyx # import time def f( double x ): return x **2 - x def integrate ( double a , double b , int N ): t0 = time . time () cdef int i cdef double s , dx s= 0 dx =( b -a )/ N for i in range (N ): s += f(a +i* dx ) t1 = time . time () print " Approximate value is %s " %( s* dx ) print " Computation made in : %. 2f seconds " %( t1 - t0 )
Python for Finance
90 / 96
Cython
Examples (V)
In the shell, import and execute the code as above.
In [93]: from c_Integrate_with_static_typing import * In [94]: integrate(0,2.0,10000) Approximate value is 0.66646668 Computation made in : 0.00 seconds In [95]: integrate(0,2.0,10000000) Approximate value is 0.666666466667 Computation made in : 0.98 seconds In [96]:
The static typing results in a further 20% reduction in time.
Python for Finance
91 / 96
Cython
Examples (VI)
The return value of functions also has a type, so it could be a good idea to dene this type for the often called function f.
# # Integration with Static Typing in Cython # d_Integrate_with_static_typing_2 . pyx # import time cdef double f( double x ): return x **2 - x def integrate ( double a , double b , int N ): t0 = time . time () cdef int i cdef double s , dx s= 0 dx =( b -a )/ N for i in range (N ): s += f(a +i* dx ) t1 = time . time () print " Approximate value is %s " %( s* dx ) print " Computation made in : %. 2f seconds " %( t1 - t0 )
Python for Finance
92 / 96
Cython
Examples (VII)
Using this version, reduces calculation time to 0.05 secondsa speed-up factor of 75 times compared to pure
Python.
In [98]: import pyximport In [99]: pyximport.install() In [100]: from d_Integrate_with_static_typing_2 import * In [101]: integrate(0,2.0,10000000) Approximate value is 0.666666466667 Computation made in : 0.05 seconds In [102]:
Caution: Functions with a static type are no longer callable from
Python
contexts.
Python for Finance
93 / 96
Typical Cython developement cycle
Typically, one applies a development process described roughly as follows: write your
Python
code
x it and optimize it (e.g. with respect to
NumPy
vectorization)
prole your program to identify the most time consuming parts re-write these parts with
Cython
extensions to speed them up
Python for Finance
94 / 96
Conclusions
1 2 3 4 5 6 7 8
Conclusion
Python
is powerful and multi-purpose
For Finance, it oers numerous really helpful libraries There are a number of good development tools available It allows high productivity levelsfor lone warriors as well as for teams It is easy-to-maintaincompact, readable code It is compact and nevertheless quite fast (when done right) It is low/no cost and future-proof It is fun to work withbe it in Finance or any other area
Python for Finance
95 / 96
Contact
Dr. Yves J. Hilpisch Visixion GmbH Rathausstrasse 75-79 66333 Voelklingen Germany
www.visixion.com www.dxevo.com www.dexision.com
Conclusion
E contact@visixion.com T +49 6898 932350 F +49 6898 932352
Python for Finance
96 / 96

Y Hilpisch Python For Finance 06072012 EuroPython NoSo

Transféré par

Informations du document

Titre original

Copyright

Formats disponibles

Partager ce document

Partager ou intégrer le document

Options de partage

Avez-vous trouvé ce document utile ?

Ce contenu est-il inapproprié ?

Droits d'auteur :

Formats disponibles

Y Hilpisch Python For Finance 06072012 EuroPython NoSo

Transféré par

Droits d'auteur :

Formats disponibles

Python for Finance

Dr. Yves J. Hilpisch

Training at EuroPython 2012 Conference

Finance, Derivatives Analytics & Python Programming

Y. Hilpisch (Visixion GmbH)

Python for Finance

EuroPython, Florence, 2012

Python for Finance

EuroPython, Florence, 2012

Guidelines for the Training

The training addresses a number of important

to get an impression of Python's advantages for Finance further self-study

Y. Hilpisch (Visixion GmbH)

Python for Finance

EuroPython, Florence, 2012

Throughout the training: Results matter more than Style

Y. Hilpisch (Visixion GmbH)

Python for Finance

EuroPython, Florence, 2012

Some of the most important Python libraries for Finance projects

Useful Python Libraries

Y. Hilpisch (Visixion GmbH)

Python for Finance

EuroPython, Florence, 2012

Some recommended Python resources

Useful Python Libraries

Y. Hilpisch (Visixion GmbH)

Python for Finance

EuroPython, Florence, 2012

Major benets of Python for Finance applications

Useful Python Libraries

multi-purposeyou can use it for everything (synergies)

prototyping and rapid application development are an original with

you can re-use rst code versions when migrating

to standard development cycle

open source, easy-to-learn, easy-to-code, easy-to-maintain, only a fraction

of the code of other languages needed, multi-platform

huge development eorts around the world, growing number of

successful scientic and corporate applications, growing number of

using the right libraries ensures execution speed comparable to

Y. Hilpisch (Visixion GmbH)

Python for Finance

EuroPython, Florence, 2012

Visixion's experience with Python

Useful Python Libraries

full-edged Derivatives Analytics suite implemented in

www.visixion.com) trainings: Python trainings

used to implement a number of numerical research projects (see

with focus on Finance for clients from the nancial

Y. Hilpisch (Visixion GmbH)

Python for Finance

EuroPython, Florence, 2012

Array operations with NumPy

Basic Array Operations

is a powerful library that allows array manipulations (linear algebra) in a compact

combined with the speed of C when doing array operations.

7., 8., 18., 19.])

Y. Hilpisch (Visixion GmbH)

EuroPython, Florence, 2012

Array operations with NumPy

Basic Array Operations

array operations are as easy as operations on integers or oats.

2. ], 4.5], 7. ], 9.5]]) 16.], 81.], 196.], 361.]])

Major benets of Python for Finance applications

multi-purposeyou can use it for everything (synergies)

you can re-use rst code versions when migrating

huge development eorts around the world, growing number of

successful scientic and corporate applications, growing number of

full-edged Derivatives Analytics suite implemented in

with focus on Finance for clients from the nancial

array operations are as easy as operations on integers or oats.

should be obvious. The rst can produce arrays of oat

The gure below shows the output. Notice that

dierent values eachwhich is due to the array size of

ve lines with four

Figure: Example of gure with matplotlibhere: lines

Figure: Example of gure with matplotlibhere: dots & bars

functionand store it in another array; call it

into a single gure (as