Introduction to numerical processing in Python using Spyder


Effective use of programming in scientific research
20 June 2011
School of Civil Engineering and Geosciences
Newcastle University
http://conferences.ncl.ac.uk/sciprog
http://www.vitae.ac.uk/programmingforscientists

Overview
This tutorial is part of the hands-on introductory session at the Effective use of programming in
scientific research workshop, Newcastle University, 20th June 2011.

Tutorial Aim:
Introduce the user to numerical processing and plotting in Python using the Spyder interface

Learning Objectives:
Gain a basic understanding of the Python interpreter
Read numerical data from a file on disk using Numpy
Develop custom functions
Create a Python script containing a simple model to generate a simulated data-set
Use Scipy to compare observed and simulated results using a statistical method
Use the Matplotlib module in Spyder to produce data plots

Contents
1 Introduction
2 Pre-requisites
3 The Spyder interface
4 Numerical Arrays
5 Plotting
6 Writing Scripts
7 Going further

Document Licensing and Copyright


Introduction to numerical processing in Python using Spyder is copyright Tomas Holderness
and licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported
License.

Version 1.0.2 (16-06-2011)

1 Introduction
Python is a powerful and efficient programming language with a clear and readable syntax
that is easy to learn and fast to programme in. Python is platform independent, runs on all major
operating systems and is released under an open source license, meaning that it is freely usable
and distributable. Furthermore, Python is a complete object-oriented language
which can be used to develop anything from simple scripts to complete programmes with
graphical user interfaces. For more information on Python visit http://python.org/about/.
One of the key strengths of Python is the range of libraries (called modules) available
to the user, covering a wide variety of application domains. Two such libraries are Scientific
Python (SciPy: http://www.scipy.org/) and Numerical Python (Numpy: http://numpy.scipy.org/)
which provide tools for mathematical, scientific and engineering computing. These modules
allow scientists and engineers to quickly create fast and efficient software for numerical
processing, making Python an ideal platform for many research scientists. For more examples of
scientific computing in Python visit: http://www.python.org/about/success/#scientific. For examples
of scientific Python software visit: http://wiki.python.org/moin/NumericAndScientific.
This practical uses the Spyder (Scientific PYthon Development EnviRonment:
http://code.google.com/p/spyderlib) package, which provides a MATLAB-like environment for
the development of scientific software and numerical processing tools using Numpy, SciPy
and the graphical plotting package Matplotlib (http://www.matplotlib.sourceforge.net).

2 Pre-requisites
Users in the practical session will be able to access the software and data on the cluster PCs
and guide themselves through the tutorial. Staff will be on-hand to answer any queries.
Users outside the practical session will require the following:
Python 2.5 or later (http://python.org)
Spyder 2.0.11 or later (http://code.google.com/p/spyderlib/)
Microsoft Windows users can install the Python(xy) package (http://www.pythonxy.com/)
which provides all the necessary software in one installation.
Sample data files (http://conferences.ncl.ac.uk/sciprog)

3 The Spyder Interface


3.1 Introduction
Spyder provides a development environment for Python specifically targeted at developers of
scientific and engineering software. Spyder provides a number of different windows for programme development.
1. Open Spyder (Start -> All Programs -> Python(x,y) -> Spyder -> Spyder)
2. The Spyder interface is split into three windows:
a) The Editor for creating Python scripts
b) The Console for command line programming
c) The Object inspector which gives information about objects in the Editor or Console
windows
Figure 1 shows a screenshot of the Spyder interface with the three windows highlighted.
Version 1.0.1 (06-06-2011)

Figure 1: Screenshot showing the Spyder Interface

3.2 Simple Commands


This section of the practical will familiarise you with the Spyder Console interface and introduce
the Interactive Python (IPython) and Pylab environments. The Spyder Console provides a
Python command line interface. The console uses Interactive Python (IPython) and the Pylab
environment to provide a series of pre-loaded modules to make numerical data handling and
plotting easier.
1. Drag the left-hand edge of the Console towards the left to expand it and give yourself
more room to work with.
2. In the Console type the following command:

print "hello world"


This will display hello world in the Console. The print command is a simple Python
function and can be used to display text and numerical data in the command line. The
quotation marks tell Python to display the information as text (a string).
3. Python can evaluate calculations directly on the command line. Now in the Console type
the following command:

4 + 5
The result will be shown on the next line.
4. This time we will pass the result of our calculation to a variable. A variable is a space in
the computer's memory that holds a piece of data.

a = 4 + 5
Nothing will be displayed as we have passed the output of our calculation to a. To see
the result, type:

print a


5. It's worth noting that for convenience we can avoid typing print each time we wish to see
the contents of a variable and just type the variable name instead. For example:

a
6. So far we've been entering data on the command line and letting Python interpret our
data type. This can cause unexpected problems with numerical data. For example now
do:

b = 8 / 3
b
Note that the result is 2. This is because Python interprets our numbers as integers
(whole numbers) and uses integer division for the calculation, which discards the
fractional part of the result. Next, do:

d = 5.0 / 4.0
d
This time we get the correct answer, because the decimal point tells Python we're using
floating point numbers (numbers with decimal points or fractional values).
7. If you need to know an object's data type you can use the type command. For example
to show the types of variables b and d do:

type(b), type(d)


8. As we've seen above Python supports dynamic variable allocation, so we can overwrite
an existing variable by passing it some new data. In this example we overwrite the value
of b to contain the value of π:

b = pi
b
9. Numpy includes a number of mathematical functions that we can use on the command
line, for example cos, sin and tan. For example, to find the length (c) of the third side of a
triangle using the lengths of the other two sides a and b and the angle between
them C we could use the law of cosines:
$c^2 = a^2 + b^2 - 2ab\cos(C)$
In Python this would look like:

c2 = pow(a, 2) + pow(b, 2) - 2*a*b*cos(C)

Note that we use the pow() function to raise a number to a power. The Numpy
cos() function requires the input angle C to be in radians, so we may need to convert
our angle from degrees to radians. In Python this would look like:

C = radians(C)
Note that we have overwritten the value of C in degrees with its equivalent value in
radians.
10. Now try implementing the complete process:

a, b, C = 10.1, 6.2, 102.5
c2 = pow(a, 2) + pow(b, 2) - 2*a*b*cos(radians(C))
c = sqrt(c2)
print("%.2f" % round(c, 2))
Here we use the print command to format our output. The %.2f tells print to
display a number with two decimal places and the round(c, 2) function rounds the output
to two decimal places before passing it to the print function.
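The Console commands above work because Pylab has already loaded cos, radians and sqrt. In a plain Python session those names are not available, so the same calculation can be sketched with the standard math module instead. The function name third_side is our own choice for illustration and is not part of Numpy or Pylab.

```python
import math

def third_side(a, b, angle_degrees):
    """Return the side opposite the angle between sides a and b."""
    # math.cos expects radians, so convert from degrees first
    angle = math.radians(angle_degrees)
    c2 = a**2 + b**2 - 2 * a * b * math.cos(angle)
    return math.sqrt(c2)

# Same inputs as step 10 above
c = third_side(10.1, 6.2, 102.5)
print("%.2f" % c)
```

Wrapping the calculation in a function means the degree-to-radian conversion cannot be forgotten when the formula is reused with other values.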

4 Numerical Arrays
4.1 Introduction
1. When dealing with numerical processing we often need to deal with arrays or matrices of
numbers. We can use the Numpy array type to enter numbers into an array. For example

myarray = array([4.4, 7.8, 3.5])


Note that when you type the keyword array the Spyder Object inspector shows documentation about Numpy arrays (Figure 2).
Figure 2: Screenshot showing interaction between Console and Object inspector

2. We can view all the values of our array by calling the array name:

myarray
Note that through IPython, the Spyder Console supports auto-completion of existing variable names and objects. After typing the first few letters of myarray, press the tab button
on the keyboard to auto-complete the variable name.
3. Alternatively, if we need to view just the first element of the array we can do:

myarray[0]
Like many programming languages Python counts from 0, so the first value has index 0,
the second has index 1 and so on.
4. The Numpy module provides unary operations for performing numerical operations over
entire arrays. For example, we can sum all the values in the array:

myarray.sum()
5. Or we can perform arithmetic on each element in the array (an elementwise operation),
and pass the result to a new array:

secondarray = myarray / pi
secondarray
6. Numpy can also create arrays in more than one dimension. In this example we'll create
an array with two rows and three columns. The numbers in the first set of round brackets
are in the first row and the numbers in the second set of brackets are in the second row:

nextarray = array([(3.5, 4.4, 7.8), (2.7, 5.5, 6.8)])


7. To view just one number from our array we have to specify the row and column index of
the number we want to view. To see the number in the first row, second column we would
do:

nextarray[0, 1]
8. To view all the numbers in one row or column we use the special : symbol. To see all the
numbers in the second column we can do:

nextarray[:, 1]
9. Using this method we can select all the numbers from a row or column and pass them to
a new variable:

x = nextarray[0, :]
y = nextarray[1, :]
We will use these new variables to produce a plot in the following Plotting section.
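The indexing steps above can be collected into one short sketch. It assumes Numpy is imported explicitly under the short name np, as a script would need to do outside the Pylab Console; the array values are the ones used above.

```python
import numpy as np

# Same values as nextarray above: two rows, three columns
nextarray = np.array([(3.5, 4.4, 7.8), (2.7, 5.5, 6.8)])

print(nextarray.shape)    # (number of rows, number of columns)
print(nextarray[0, 1])    # first row, second column
print(nextarray[:, 1])    # every row of the second column
print(nextarray.sum())    # sum over every element

x = nextarray[0, :]       # first row as a one-dimensional array
y = nextarray[1, :]       # second row as a one-dimensional array
```

Checking the shape attribute first is a quick way to confirm which index refers to rows and which to columns.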

4.2 Arrays from CSV files


Often we need to use data stored in files rather than entering it by hand. One common
text-based data format is the Comma Separated Value (CSV) file. Numpy can read CSV files
into arrays. In this practical we're going to read in the file C:\Temp\temperature_profile.csv.
This file contains two columns of data (you can open the data in a text editor or spreadsheet
software if you wish): the first column represents altitude above sea level (meters), the second
contains observations of air temperature (kelvin) of the atmosphere at the given altitude.
1. To read this data in we can use the Numpy loadtxt function in the Console:

temperatures = loadtxt("C:/Temp/temperature_profile.csv",
                       delimiter=",", skiprows=1)
Here we're telling the loadtxt function the location of the file to read in, how the values are
delimited (in this case by a comma) and to skip the first row as this contains the column
names.
2. We can view the data in the Console by doing:

temperatures
3. We can manipulate our array using the same techniques introduced above. Create a new
array with the temperatures (second column) converted from Kelvin to Celsius:

temperaturesC = temperatures
temperaturesC = temperaturesC[:, 1] - 273.15

4. View the new array on screen by doing

temperaturesC
5. We can suppress printing of numbers in scientific format and view the array again by
doing:

set_printoptions(suppress=True)


temperaturesC
Figure 3 shows an example of the new array in the Console.
Figure 3: Screenshot showing new array with temperatures in Celsius

6. Lastly, we can create a new CSV file from our processing using the Numpy savetxt function:

savetxt("C:/Temp/temperature_profile_celcius.csv", temperaturesC,
        delimiter=",", fmt="%5.5f")
You can view this file in a text editor or spreadsheet software if you wish.
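For readers without the sample data file, the read, convert and write steps above can be tried on a small stand-in file. This is only a sketch: the three data rows and the file names are invented for illustration, the files are written to a temporary folder rather than C:\Temp, and the with statement assumes Python 2.6 or later.

```python
import os
import tempfile
import numpy as np

# Hypothetical sample rows: altitude (m), air temperature (K)
sample = "altitude,temperature\n0,282.15\n100,281.50\n200,280.85\n"

workdir = tempfile.mkdtemp()
infile = os.path.join(workdir, "sample_profile.csv")
outfile = os.path.join(workdir, "sample_profile_celsius.csv")

with open(infile, "w") as f:
    f.write(sample)

# Skip the header row and split each line on commas
data = np.loadtxt(infile, delimiter=",", skiprows=1)

# Convert the second column from Kelvin to Celsius
celsius = data[:, 1] - 273.15

# Write one value per line with five decimal places
np.savetxt(outfile, celsius, delimiter=",", fmt="%5.5f")

# Read the result back to check the round trip
print(np.loadtxt(outfile))
```

Reading the output file back in is a cheap check that the delimiter and format settings produced a file Numpy can load again.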


5 Plotting
5.1 Introduction
1. The Spyder Console automatically loads the Pylab environment, which means that
graphs can easily be generated on the command line using the Matplotlib
plot command:

plot(x, y)
This shows a Matplotlib plot window, with our values plotted as a line. Using this window
we can visualise our plot (note that the plot window, titled Figure 1, may be minimized to the task bar).
Figure 4 below shows an example of a plot window.
2. Close the plot window and plot again with the following command:

plot(x, y, "o--")


This tells Matplotlib to plot points and a joining dashed line.
3. Matplotlib contains functions to create a variety of different plots. Now try:

x = array([1, 2, 3])
bar(x, y, align="center", color="red")
This command creates a bar chart from the given data. The keyword align="center" tells
the plot to use our x-values for the center of each bar.
4. Finally, plot the original temperatures data with temperature (second column) on the
x-axis and altitude (first column) on the y-axis.

plot(temperatures[:, 1], temperatures[:, 0], "x", color="green")
Figure 5 shows an example of the plotted temperature data. Many more example plots
can be found at http://matplotlib.sourceforge.net/.
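The plot commands above rely on Pylab being loaded in the Console. As a rough sketch of the same commands written for a plain Python session, the example below imports Matplotlib explicitly and writes the figure to a file instead of opening a window. The choice of the "Agg" backend, the temporary output folder and the file name example_plot.png are assumptions made so the example runs without a display.

```python
import os
import tempfile
import numpy as np
import matplotlib
matplotlib.use("Agg")    # assumption: draw to a file, no plot window needed
import matplotlib.pyplot as plt

x = np.array([1, 2, 3])
y = np.array([2.7, 5.5, 6.8])    # second row of nextarray from Section 4.1

fig = plt.figure()
plt.plot(x, y, "o--")                         # points joined by a dashed line
plt.bar(x, y, align="center", color="red")    # bars centred on the x-values
plt.xlabel("x")
plt.ylabel("y")

# Save the figure to a temporary folder and release it
outpath = os.path.join(tempfile.mkdtemp(), "example_plot.png")
plt.savefig(outpath)
plt.close(fig)
```

Calling plot and bar before savefig draws both on the same axes, which is the same mechanism the later scripts use to overlay two data sets.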


Figure 4: Screenshot showing Matplotlib window in Spyder

Figure 5: Screenshot showing Matplotlib plot of temperature data


6 Writing Scripts
6.1 Introduction
Using the Console is good for short sequences of commands. However, for bigger projects it is
useful to create a file containing a sequential list of commands, called a script, which can then
be saved to disk. Python files are denoted with a .py extension. Spyder includes an Editor
window (Figure 1) for editing Python scripts.
1. In the Editor window type the following commands and then click Run from the Run menu
(Figure 6):

# This is a comment, for useful information.
# My first script, written by: <insert name here>
print "Hello World"
When you click Run in Spyder for the first time a dialog box will appear with details of how
to run the script. Leave these as their defaults and click Run (Figure 7). In the Console
window a new console called temp.py will be opened, which will display the result of your
script (Figure 8).

2. Now we're going to write a script to read and plot our temperature data. As we're writing a
script and not using the Spyder Console we first need to tell Python to import the modules
we require (e.g. Numpy and Matplotlib). When importing modules we can give them a
short name to make scripting easier.

# Temperature plotting script

# First import required modules
import sys, os                   # Operating system modules
import numpy as np               # Numerical python
import matplotlib.pyplot as plt  # Matplotlib plotting

# Read in data from file
temperatures = np.loadtxt("C:/temp/temperature_profile.csv",
                          delimiter=",", skiprows=1)

# Now plot temperature values
plt.plot(temperatures[:, 1], temperatures[:, 0], "x", color="green")
plt.show()

# When finished, exit the script
exit(0)
Now click Run again to plot our temperature data. Note that here we have added lots
of additional comments using the # character. Commenting code is a good idea as it
helps you and others understand exactly what's going on. For longer comments
over multiple lines use three quotation marks. For the purposes of this practical it is not
necessary to enter comments, other than for your own reference.
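As a short illustration of the two comment styles mentioned above, the sketch below uses a triple-quoted string at the top of a script and inside a function. The helper kelvin_to_celsius is a hypothetical name introduced only for this example.

```python
"""
Temperature plotting script.

A triple-quoted string can span several lines. Placed at the top of a
file or function it also becomes that object's documentation string.
"""

def kelvin_to_celsius(kelvin):
    """Convert a temperature from Kelvin to Celsius.

    Hypothetical helper, used here only to show a multi-line comment
    attached to a function.
    """
    # A single-line comment still uses the # character
    return kelvin - 273.15

print(kelvin_to_celsius(282.15))
```

Text placed in a function's triple-quoted string is stored as its documentation, so it is a natural home for longer explanations.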


Figure 6: Screenshot showing Run menu

Figure 7: Screenshot showing Run options


Figure 8: Screenshot showing Hello World script


6.2 Functions
The above scripts give an example of how to combine a sequence of commands to perform
a series of operations. However, we often need to create our own functions to perform a specific
and repeatable process. The temperature data in this practical is based on an observed
air temperature profile taken at 100 meter intervals through the atmosphere. As we would expect,
higher altitudes mean lower temperatures. We can generate an idealised model of air temperature
decrease with altitude based on the average environmental lapse rate (ELR) for air
temperature provided by the International Civil Aviation Organization (ICAO) of 6.49 K/1000m
between 0 and 11000m. That is, for every 1000 meters gained in altitude the air temperature
drops by 6.49 K, up to 11 kilometers.
1. In this section we're going to create a function to calculate air temperature from the ELR
based on a given temperature at sea level and an altitude. Functions are defined using
the def keyword. All code inside the function must have at least one tab-indentation from
the left margin. For example:

# Script to calculate air temperature from ELR

# First import required modules
import sys, os  # Operating system modules

# Now define our function. Variables in brackets are the inputs.
def model(temperature, altitude):
    # calculate modelled temperature
    newtemperature = temperature - 0.00649 * altitude
    # return this value to the script
    return newtemperature

# Now let's use our function to find some values
print model(283.15, 1000)
print model(283.15, 3500)
print model(271.15, 712)

# When finished, exit the script
exit(0)
This example shows how to create a simple function and use it to print calculation results
to screen (Figure 9). Save this model as elr_model.py (File > Save As).
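The lapse rate quoted above only applies between 0 and 11000m, but the model function will happily accept any altitude. One possible safeguard, sketched here under the assumption that an out-of-range altitude should stop the script with an error, is a small wrapper function. The name model_checked is hypothetical, and the except ... as form assumes Python 2.6 or later.

```python
def model(temperature, altitude):
    # calculate modelled temperature from the ELR of 6.49 K per 1000 m
    return temperature - 0.00649 * altitude

def model_checked(temperature, altitude):
    """Hypothetical wrapper: refuse altitudes outside 0-11000 m."""
    if altitude < 0 or altitude > 11000:
        raise ValueError("ELR model only applies between 0 and 11000 m")
    return model(temperature, altitude)

# A valid altitude returns the modelled temperature as before
print("%.2f" % model_checked(283.15, 1000))

# An altitude above 11000 m raises an error instead of a misleading value
try:
    model_checked(283.15, 12000)
except ValueError as err:
    print(err)
```

Raising an error is only one design choice; returning a placeholder value or printing a warning would also work, depending on how the results are used.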

6.3 Loops
Sometimes it is necessary to perform the same operation a number of times. Programming languages
use loops for iteration. A loop tells a script to perform the same operation
a set number of times. In this practical we concentrate on one type of loop, the for loop. The
for loop allows us to say "for this many times, do some operation". Here we're going to use
a for loop along with our ELR model to generate a modeled atmospheric temperature profile
matching our observed data in C:\temp\temperature_profile.csv, and then plot the observed
and modeled data to see the differences between them.
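As a minimal illustration of a for loop before it is applied to the ELR model, the sketch below converts a short list of altitudes from meters to kilometers one value at a time. The altitude values are illustrative only.

```python
# Convert a few altitudes from meters to kilometers, one at a time
altitudes_m = [0, 500, 1000, 1500]   # illustrative values only
altitudes_km = []

for alt in altitudes_m:
    # the indented lines run once for each value of alt
    altitudes_km.append(alt / 1000.0)

print(altitudes_km)
```

Dividing by 1000.0 rather than 1000 keeps the result as a floating point number, avoiding the integer division issue seen in Section 3.2.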


Figure 9: Screenshot showing script with model function

1. Open elr_model.py and save a new copy called elr_model_generator.py. We're going
to add a for loop to use our model to generate an atmospheric profile from sea level
upwards in 100m intervals using the range function. The range function instructs Python
to return a progression of numbers, which in this case will be altitudes from 0 up to
(but not including) 11000 in intervals of 100. Code inside the loop must have one
tab-indentation from the left margin. The loop should look like this:

# Create an empty Python list to hold our new temperatures
modeltemps = []

# Specify the air temperature at sea level (282.15 K / 9.0 C)
seatemp = 282.15

# Create a loop to model air temperatures of a range of heights
for alt in range(0, 11000, 100):
    # for each altitude get the modelled temperature
    newtemp = model(seatemp, alt)
    modeltemps.append(newtemp)

# After the loop convert the list to a numerical array
modeltemps = np.array(modeltemps)


2. Here's an example of the complete script with the for loop:

# Script to calculate air temperature from ELR

# First import required modules
import sys, os                   # Operating system modules
import numpy as np               # Numerical python
import matplotlib.pyplot as plt  # Matplotlib plotting

# Now define our function. Variables in brackets are the inputs.
def model(temperature, altitude):
    # calculate modelled temperature
    newtemperature = temperature - 0.00649 * altitude
    # return this value to the script
    return newtemperature

# Now let's use our function to find some values

# Create an empty Python list to hold our new temperatures
modeltemps = []

# Specify the air temperature at sea level (282.15 K / 9.0 C)
seatemp = 282.15

# Create a loop to model air temperatures of a range of heights
for alt in range(0, 11000, 100):
    # for each altitude get the modelled temperature
    newtemp = model(seatemp, alt)
    modeltemps.append(newtemp)

# After the loop convert the list to a numerical array
modeltemps = np.array(modeltemps)

# Display the new modelled data on screen
np.set_printoptions(suppress=True)
print modeltemps

# When finished, exit the script
exit(0)
Save this model script as elr_model2 (File > Save As).
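When checking the output of this script it helps to know exactly which altitudes the range function produces, since it stops before reaching its stop value. The short sketch below counts the values generated by the range call used in the loop, and shows one way to include 11000 itself if that is wanted (the variable names are our own).

```python
# range(start, stop, step) stops before reaching the stop value
altitudes = list(range(0, 11000, 100))
print(len(altitudes))     # 110 altitudes are generated
print(altitudes[-1])      # the highest is 10900

# To include 11000 itself, move the stop value just past it
altitudes_inclusive = list(range(0, 11001, 100))
print(len(altitudes_inclusive))    # 111 altitudes
print(altitudes_inclusive[-1])     # the highest is now 11000
```

The number of modelled values matters later on, because plotting them against the observed altitudes requires both arrays to have the same length.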


6.4 Putting it all together


In the previous sections we've seen how to build from simple scripts into complex processes
using functions and loops. In this final section we're going to put all the sections of the practical
together and create a script which reads the observed data from a file on disk and generates our
modeled data using the model function and for loop. Finally the script will plot both data sets
and save the plot to a file.

# Script to plot observed and modelled atmospheric profile of air temperature

# First import required modules
import sys, os                   # Operating system modules
import numpy as np               # Numerical python
import matplotlib.pyplot as plt  # Matplotlib plotting

# Now define our function. Variables in brackets are the inputs.
def model(temperature, altitude):
    # calculate modelled temperature
    newtemperature = temperature - 0.00649 * altitude
    # return this value to the script
    return newtemperature

# Now let's use our function to find some values

# Create an empty Python list to hold our new temperatures
modeltemps = []

# Specify the air temperature at sea level (282.15 K / 9.0 C)
seatemp = 282.15

# Create a loop to model air temperatures of a range of heights
for alt in range(0, 11000, 100):
    # for each altitude get the modelled temperature
    newtemp = model(seatemp, alt)
    modeltemps.append(newtemp)

# After the loop convert the list to a numerical array
modeltemps = np.array(modeltemps)

# Read in observations data from file
temperatures = np.loadtxt("C:/temp/temperature_profile.csv",
                          delimiter=",", skiprows=1)

# Create a plot of observed temperatures
plt.plot(temperatures[:, 1], temperatures[:, 0], "x", color="green",
         label="Observed Temperatures")

# On the same plot add modelled temperatures (using same altitude y-axis)
plt.plot(modeltemps, temperatures[:, 0], "-", color="blue",
         label="Modelled Temperatures")

# Add axis labels and a legend
plt.xlabel("Temperature (K)")
plt.ylabel("Altitude (m)")
plt.legend(loc="best")

# Save the plot to PDF file
plt.savefig("C:/Temp/plot1.pdf")

# When finished, exit the script
exit(0)
Once complete you can view the result of the script by opening C:\Temp\plot1.pdf in a PDF
document viewer.
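The second plot command in this script only works if modeltemps has exactly as many values as there are observed altitudes. One way to guarantee this, sketched below as an alternative to the for loop rather than as part of the original script, is to pass the observed altitude column straight to the model function, since its arithmetic also works on a whole Numpy array. The three observation rows are invented stand-ins for the contents of the data file.

```python
import numpy as np

def model(temperature, altitude):
    # same ELR model as in the script above
    return temperature - 0.00649 * altitude

# Hypothetical stand-in for the observed data: altitude (m), temperature (K)
temperatures = np.array([(0.0, 282.0), (100.0, 281.6), (200.0, 280.7)])

seatemp = 282.15

# Passing the whole altitude column returns one modelled value per observation
modeltemps = model(seatemp, temperatures[:, 0])

# Both arrays now have the same length, as the plot command requires
print(modeltemps.shape == temperatures[:, 1].shape)
```

With this approach the modelled profile follows whatever altitudes appear in the file, so the range settings never need to be kept in step with the data by hand.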

7 Going Further
In the previous example we've seen how to develop a script to plot the difference between
observed and modeled data. Scipy and Numpy also contain functions for statistical analysis.
One such example is the corrcoef function which generates a matrix of correlation coefficients
from two given data sets. To view the correlation coefficients between the observed and modelled
data, add the following lines to your script before the exit(0) statement and run again.

...
corrcoefmatrix = np.corrcoef(temperatures[:, 1], modeltemps)
print corrcoefmatrix
...
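To see what the matrix returned by corrcoef looks like without the data file, the sketch below applies it to two short arrays. The observed and modelled values are invented for illustration and are not taken from the practical's data.

```python
import numpy as np

# Hypothetical observed and modelled temperatures (K), for illustration only
observed = np.array([282.0, 281.6, 280.7, 280.3, 279.4])
modelled = np.array([282.15, 281.50, 280.85, 280.20, 279.55])

# For two input data sets corrcoef returns a 2 x 2 matrix
corrcoefmatrix = np.corrcoef(observed, modelled)
print(corrcoefmatrix.shape)

# The diagonal holds each data set's correlation with itself;
# the off-diagonal entry is the correlation between the two data sets
r = corrcoefmatrix[0, 1]
print("%.3f" % r)
```

The matrix is symmetric, so corrcoefmatrix[0, 1] and corrcoefmatrix[1, 0] hold the same value.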
