PHYS 224
October 1/2, 2015
Goals
Two things to teach in this lecture
1. How to use Python to fit data
2. How to interpret what Python gives you
Some references:
http://nbviewer.ipython.org/url/media.usm.maine.edu/~pauln/
ScipyScriptRepo/CurveFitting.ipynb
http://www.physics.utoronto.ca/~phy326/python/curve_fit_to_data.py
Fitting Experimental Data
The goal of the lab experiments is to determine a physical quantity y (the dependent variable) as a function of x (the independent variable)
How?
Measure the pair (x_i, y_i) a number of times, N
Find a fit function y = y(x) that describes the relationship between these two quantities
The Linear Case
The simplest function relating the two variables is the linear function
f(x) = y = a*x + b
This is valid for any (x_i, y_i) combination
If a and b are known, the true value of y_i can be calculated for any x_i:
y_i,true = a*x_i + b
Linear Regression
An Example
Ideal Gas Law: P*V = n*R*T
Pressure * Volume = n * R * Temperature
P = [(n*R)/V]*T
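With n = V = 1, the law predicts a pressure that is linear in temperature with slope n*R/V = R. A minimal sketch of this relationship (the values and function name here are illustrative, not the lab data):

```python
# Illustrative sketch of the ideal gas law P = (n*R/V)*T
n = 1.0        # moles
V = 1.0        # volume in m^3
R = 8.3144621  # gas constant in J/(mol*K)

def pressure(T):
    # P = (n*R/V)*T, in pascals when T is in kelvin
    return (n * R / V) * T

print(pressure(300.0))  # slope R times 300 K
```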
Fitting in Python
We're going to use the curve_fit function, which is part of the scipy.optimize package
The usage is as follows:
fit_parameters,fit_covariance = scipy.optimize.curve_fit(fit_function, x_data, y_data, p0=guess, sigma=uncertainty)
Fitting with curve_fit
import numpy
import scipy.optimize
from matplotlib import pyplot
Fitting with curve_fit
import numpy
import scipy.optimize
from matplotlib import pyplot

#do the fit
fit_parameters,fit_covariance = scipy.optimize.curve_fit(linearFit, temp_data, vol_data, p0=(1.0,8.0), sigma=uncertainty)
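The slide's call assumes linearFit, temp_data, vol_data, and uncertainty are already defined. A self-contained sketch of the same fit, using made-up synthetic data in place of the lab arrays:

```python
import numpy
import scipy.optimize

# model to fit: a straight line, as on the slide
def linearFit(x, *p):
    return p[0] + p[1] * x

# synthetic stand-ins for temp_data/vol_data (the real arrays come from the lab file)
temp_data = numpy.linspace(270.0, 320.0, 16)
vol_data = 0.2 + 8.3 * temp_data          # a perfect line, for illustration
uncertainty = numpy.full(len(temp_data), 5.0)

fit_parameters, fit_covariance = scipy.optimize.curve_fit(
    linearFit, temp_data, vol_data, p0=(1.0, 8.0), sigma=uncertainty)

print(fit_parameters)  # recovers roughly [0.2, 8.3]
```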
Results
fit_parameters = [0.21617647 8.33058824]
fit_covariance = [[ 2.16490542e+04 -6.89053501e+01]
                  [-6.89053501e+01  2.20507375e-01]]
How did it do this?
The function curve_fit uses a minimizer
This varies the fit parameters (p[0] and p[1]) to see which values of these are most likely to fit the data properly
This also depends on the residuals, which are the difference between the result of the fit and the data at each point
We'll discuss the quantity which is minimized in a bit
Probability
The probability of any one point being from the fit is
P_{a,b}(y) = \frac{1}{\sigma_y \sqrt{2\pi}} \, e^{-(y - a - b x)^2 / (2 \sigma_y^2)}
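This Gaussian density can be evaluated directly. A minimal sketch (the function name and numbers are illustrative):

```python
import math

# probability density of measuring y, given line parameters a, b and
# Gaussian uncertainty sigma_y
def point_probability(y, x, a, b, sigma_y):
    norm = 1.0 / (sigma_y * math.sqrt(2.0 * math.pi))
    return norm * math.exp(-(y - a - b * x) ** 2 / (2.0 * sigma_y ** 2))

# a point exactly on the line y = a + b*x has the maximum density 1/(sigma_y*sqrt(2*pi))
print(point_probability(10.0, 2.0, 4.0, 3.0, 1.0))
```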
Full Probability
For a set of N measurements of the dependent variable y
y_1, y_2, y_3, ..., y_N
The probability of obtaining these values is the product of the individual probabilities
P_{a,b}(y_1, y_2, y_3, ..., y_N) = P_{a,b}(y_1) P_{a,b}(y_2) P_{a,b}(y_3) \cdots P_{a,b}(y_N) \propto \frac{1}{\sigma_y^N} \, e^{-\frac{1}{2} \sum_{i=1}^{N} (y_i - a - b x_i)^2 / \sigma_y^2}
Full Probability
For a set of N measurements of the dependent variable y
y_1, y_2, y_3, ..., y_N
The probability of obtaining these values is the product of the individual probabilities
P_{a,b}(y_1, y_2, y_3, ..., y_N) \propto \frac{1}{\sigma_y^N} \, e^{-\frac{1}{2} \sum_{i=1}^{N} (y_i - a - b x_i)^2 / \sigma_y^2}
The sum in the exponent is called the chi-squared (\chi^2)
Chi-Squared
\chi^2 = \sum_{i=1}^{N} \frac{(y_i - a - b x_i)^2}{\sigma_y^2}
Plotting the Residuals
#read in the data (currently only located on my hard drive...)
temp_data,vol_data = numpy.loadtxt('/Users/kclark/Desktop/Teaching/phys224/weather_data/ideal_gas_law.txt',unpack=True)
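Once the data are loaded and the fit is done, the residuals can be plotted. A sketch using made-up stand-ins for temp_data, vol_data, and the fitted parameters (the real values come from the lab file and curve_fit):

```python
import numpy
import matplotlib
matplotlib.use('Agg')  # render without a display
from matplotlib import pyplot

# made-up stand-ins for the lab arrays and fit results
temp_data = numpy.array([270.0, 280.0, 290.0, 300.0])
vol_data = numpy.array([2245.0, 2330.0, 2410.0, 2500.0])
fit_parameters = numpy.array([0.2, 8.3])   # [intercept, slope]

# residual = data - fit at each point
residuals = vol_data - (fit_parameters[0] + fit_parameters[1] * temp_data)

pyplot.scatter(temp_data, residuals)
pyplot.axhline(0.0)
pyplot.xlabel('Temperature')
pyplot.ylabel('Residual')
pyplot.savefig('residuals.png')
```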
How do the Residuals Look?
The residuals are obviously a large component of the \chi^2 value used by the minimizer
They can be plotted to look for trends and see if the fit function is appropriate
Other Results from curve_fit
Interpreting the Covariance Matrix
fit_parameters = [0.21617647 8.33058824]
fit_covariance = [[ 2.16490542e+04 -6.89053501e+01]
                  [-6.89053501e+01  2.20507375e-01]]
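The one-sigma uncertainty on each fitted parameter is the square root of the corresponding diagonal element of the covariance matrix. A sketch using the slide's values (the off-diagonal signs are reconstructed here, as assumed in the matrix above):

```python
import numpy

# covariance matrix as returned by curve_fit (the slide's values)
fit_covariance = numpy.array([[ 2.16490542e+04, -6.89053501e+01],
                              [-6.89053501e+01,  2.20507375e-01]])

# one-sigma uncertainties: square roots of the diagonal elements
sigma0, sigma1 = numpy.sqrt(numpy.diag(fit_covariance))
print(sigma1)  # ~0.47, the uncertainty quoted later for p[1]
```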
Fit Results
import numpy
import scipy.optimize
from matplotlib import pyplot
#define the function to be used in the fitting, which is linear in this case
def linearFit(x,*p):
    return p[0]+p[1]*x
Fit Results
#do the fit
fit_parameters,fit_covariance = scipy.optimize.curve_fit(linearFit, temp_data, vol_data, p0=(1.0,8.0), sigma=uncertainty)
#calculate the data for the best fit minus one sigma in parameter #1
params_minus1sigma = numpy.array([fit_parameters[0],fit_parameters[1]-sigma1])
data_minus1sigma = linearFit(temp_data,*params_minus1sigma)
Fit Results
fit_parameters = [0.21617647 8.33058824]
fit_covariance = [[ 2.16490542e+04 -6.89053501e+01]
                  [-6.89053501e+01  2.20507375e-01]]
Comparison to Accepted Values
We obtained the result p[1] = 8.33 ± 0.47
We assume that there is 1 mole in a 1 m^3 volume, so that n = V = 1
The accepted value (currently) is 8.3144621 ± 0.0000075
The accepted value IS contained within our uncertainty (our one-sigma range is from 7.86 to 8.80)
These values agree within their error
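The agreement check above can be made quantitative: the difference between our value and the accepted one, expressed in units of our uncertainty. A minimal sketch using the slide's numbers:

```python
# significance of the difference between our result and the accepted value
measured, sigma_measured = 8.33, 0.47
accepted = 8.3144621

n_sigma = abs(measured - accepted) / sigma_measured
print(n_sigma)  # well under 1 sigma, so the values agree within error
```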
Application to Non-linear Examples
This method can also be applied to other examples
Powers: y = b*x^2, which is linear in the variable x^2
Polynomials: y = a + b*x + c*x^2 + d*x^3
This is just a case of using multiple regression since the equation is linear in the coefficients
Exponentials: y = a*e^(b*x)
Can be linearized as ln(y) = ln(a) + b*x
There are many other examples
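The exponential linearization can be demonstrated with a straight-line fit to (x, ln(y)). A sketch with noiseless synthetic data (the parameter values are made up for illustration):

```python
import numpy

# y = a*exp(b*x) becomes ln(y) = ln(a) + b*x, a straight line in x
a_true, b_true = 2.0, 0.5
x = numpy.linspace(0.0, 4.0, 20)
y = a_true * numpy.exp(b_true * x)

# numpy.polyfit with degree 1 does the straight-line fit;
# slope gives b, intercept gives ln(a)
b_fit, ln_a_fit = numpy.polyfit(x, numpy.log(y), 1)
print(b_fit, numpy.exp(ln_a_fit))  # recovers roughly 0.5 and 2.0
```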
Return to Chi-Squared
\chi^2 = \sum_{i=1}^{N} \frac{(y_i - y(x_i))^2}{\sigma_y^2}
Gauss Distribution
Another example
Fitting the Gaussian
import numpy
import scipy.optimize
import matplotlib.pyplot as pyplot
import pylab as py
#define the function to be used in the fitting, which is a Gaussian in this case
def gaussFit(x,*p):
    return p[0]+p[1]*numpy.exp(-1*(x-p[2])**2/(2*p[3]**2))
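The Gaussian model above can be fit with curve_fit just like the line. A self-contained sketch with noiseless synthetic data standing in for the rainfall histogram (offset, amplitude, mean, and width values are made up):

```python
import numpy
import scipy.optimize

def gaussFit(x, *p):
    # constant offset plus Gaussian peak, as on the slide
    return p[0] + p[1] * numpy.exp(-1 * (x - p[2]) ** 2 / (2 * p[3] ** 2))

# synthetic data from a known Gaussian: offset 1, amplitude 5, mean 20, width 3
x = numpy.linspace(0.0, 40.0, 81)
y = gaussFit(x, 1.0, 5.0, 20.0, 3.0)

# p0 now needs four guesses, one per parameter
fit_parameters, fit_covariance = scipy.optimize.curve_fit(
    gaussFit, x, y, p0=(0.5, 4.0, 18.0, 2.0))
print(fit_parameters)  # recovers roughly [1, 5, 20, 3]
```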
Another example
[Figure: histogram with a fitted Gaussian; callouts mark the mean and the standard deviation]
Another example
[Figure: fitted Gaussian with callouts marking the mean and the standard deviation]
Rainfall of 85.5 mm is 7.74 standard deviations above the mean (from this data), which is extremely unlikely
Chi-Squared and Goodness of Fit
\chi^2 = \sum_{i=1}^{N} \frac{(y_i - y(x_i))^2}{\sigma_y^2}
Chi-Squared
\chi^2 = \sum_{i=1}^{N} \frac{(y_i - y(x_i))^2}{\sigma_y^2}
#define the function to be used in the fitting, which is linear in this case
def linearFit(x,*p):
    return p[0]+p[1]*x
#degrees of freedom: number of data points minus number of fit parameters
dof = len(temp_data)-len(fit_parameters)
print(dof)
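With the degrees of freedom in hand, the goodness of fit can be summarized by the reduced chi-squared and by the probability of getting a chi-squared at least this large by chance. A sketch using the numbers from the first example (16 points, 2 parameters, chi-squared of 65.6):

```python
import scipy.stats

chi2 = 65.6
dof = 16 - 2          # 16 data points minus 2 fit parameters

# reduced chi-squared: close to 1 for a good fit
reduced_chi2 = chi2 / dof

# survival function: probability of chi2 at least this large with dof degrees of freedom
p_value = scipy.stats.chi2.sf(chi2, dof)
print(reduced_chi2, p_value)  # reduced chi2 ~4.7, a very small p-value: a poor fit
```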
Revisit the First Example
Is this a good fit?
\chi^2 = \sum_{i=1}^{16} \frac{(presData_i - fit_i)^2}{uncertainty_i^2} = 65.6
Summary