
MAXIMUM LIKELIHOOD ESTIMATION OF THE
WEIBULL PROBABILITY DISTRIBUTION PARAMETERS
(2-PARAMETER)
BY

Jubril, Hussein Danesi
Backbone Resource Centre
6, Taylor Drive, Off Edmund Crescent,
Medical Compound, P.M.B. 2023, Yaba,
Lagos, Nigeria.
Email: danesi2002@hotmail.com

Ajibade, Bright F.
Dept. of General Studies,
Petroleum Training Institute,
Effurun, Warri,
Delta State.
Email: equalright_bright@yahoo.com

ABSTRACT
The Weibull distribution plays an important role in failure-distribution modeling in reliability studies.
Estimating the parameters of the Weibull distribution is not a trivial task. When the three-parameter
distribution is of interest, the estimation procedure becomes quite tedious, and the resulting
estimators are not always available in nice closed forms, although they can easily be evaluated
numerically. Maximum likelihood estimation is a good method that is commonly used for parameter
estimation. The likelihood function formed for the parameter estimation of a three-parameter Weibull
distribution is very hard to maximize.
In this work, a numerical example is presented to illustrate the Newton-Raphson approximation
method for obtaining the parameter estimate of the 2-parameter Weibull distribution.

1. INTRODUCTION

A moment of reflection on statistical problems encountered in the real world
should convince you that not all random variables fit the definition of discrete
random variables.

A random variable that can take on any value in an interval is called a
continuous random variable, and the purpose of this chapter is to study a particular
type of continuous random variable whose probability distribution is called the
Weibull distribution.

1.1 WEIBULL PROBABILITY DISTRIBUTION

In probability theory and statistics, the Weibull distribution (named after
WALODDI WEIBULL) is a continuous probability distribution. It is often called
the Rosin–Rammler distribution when used to describe a particle size distribution.

The formula for the probability density function of the general Weibull probability
distribution is given as:

f(x; k, λ, θ) = (k/λ) ((x − θ)/λ)^(k−1) exp[−((x − θ)/λ)^k],   x ≥ θ; k, λ > 0
Where:

k is the shape parameter, which is a pure number, that is, it is dimensionless.

θ is the location parameter, and

λ is the scale parameter.

The case where θ = 0 and λ = 1 is called the STANDARD WEIBULL
DISTRIBUTION, and the case where θ = 0 is called the 2-parameter Weibull
distribution, with density

f(x; k, λ) = (k/λ) (x/λ)^(k−1) exp[−(x/λ)^k],   x ≥ 0.

The standard Weibull distribution then becomes

f(x; k) = k x^(k−1) exp(−x^k),   x ≥ 0.
Since the general form of probability functions can be expressed in terms of the
standard distribution, all subsequent formulas in this project are given for the
standard form of the distribution function.
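As a quick illustration, the densities above can be written as a small Python function; the function name and the sample arguments below are chosen purely for illustration:

import math

def weibull_pdf(x, k, lam, theta=0.0):
    # General (3-parameter) Weibull density; theta = 0 gives the 2-parameter form,
    # and theta = 0, lam = 1 gives the standard form.
    if x < theta:
        return 0.0
    z = (x - theta) / lam
    return (k / lam) * z ** (k - 1) * math.exp(-(z ** k))

print(weibull_pdf(50.0, k=1.9, lam=73.27))   # 2-parameter form
print(weibull_pdf(0.5, k=1.9, lam=1.0))      # standard form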

1.2 USES OF WEIBULL DISTRIBUTION

The Weibull distribution is often used in the field of life data analysis due to its
flexibility: it can mimic the behavior of other statistical distributions such as the
normal and the exponential. If the failure rate decreases over time, then k < 1; if the
failure rate is constant over time, then k = 1; and if the failure rate increases over time,
then k > 1 (the failure-rate function itself is written out after the list below).

An understanding of the failure rate may provide insight as to what is causing the
failures:

 A decreasing failure rate would suggest "infant mortality". That is, defective
items fail early and the failure rate decreases over time as they fall out of the
population.

 A constant failure rate suggests that items are failing from random events.

 An increasing failure rate suggests "wear out" - parts are more likely to fail
as time goes on.
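These statements refer to the Weibull failure-rate (hazard) function, which for the 2-parameter form is

h(x) = f(x) / [1 − F(x)] = (k/λ) (x/λ)^(k−1),   x ≥ 0,

and which is decreasing in x for k < 1, constant for k = 1, and increasing for k > 1.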

The Weibull distribution is used

 In survival analysis
 To represent manufacturing and delivery times in industrial engineering

 In extreme value theory

 In weather forecasting

 In reliability engineering and failure analysis

 In radar systems, to model the dispersion of the received signal levels produced by
some types of clutter

 To describe wind speed distributions, as the natural distribution often matches the
Weibull shape

1.3 PROPERTIES

 The nth raw moment is given by:

E[Xⁿ] = Γ(1 + n/k)

where Γ is the gamma function, defined as:

Γ(a) = ∫₀^∞ t^(a−1) e^(−t) dt

 The mean is given as:

μ(x) = Γ(1 + 1/k)

 The variance is given as:

V(x) = Γ(1 + 2/k) − Γ²(1 + 1/k)

 The cumulative distribution function is given as:

F(x) = 1 − exp(−x^k),   x ≥ 0

(These expressions are for the standard form; for a general scale λ, the nth raw moment, mean
and variance are multiplied by λⁿ, λ and λ² respectively, and F(x) = 1 − exp[−(x/λ)^k].)
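For illustration, these quantities can be evaluated numerically with Python's gamma function; the values of k and λ below are placeholders, and the general scale λ is included as in the note above:

import math

k = 1.9      # shape parameter (placeholder value)
lam = 73.27  # scale parameter (placeholder value)

mean = lam * math.gamma(1 + 1 / k)
variance = lam ** 2 * (math.gamma(1 + 2 / k) - math.gamma(1 + 1 / k) ** 2)
cdf_at_100 = 1 - math.exp(-((100 / lam) ** k))

print(mean, variance, cdf_at_100)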

2. INTRODUCTION

In order to achieve the objective of the research, this section introduces the
research problem by reviewing the key writings of past researchers,
academicians and specialists in the subject matter.

2.1 LITERATURE REVIEW

In this section of the project, past studies on the Weibull distribution are presented.

A. Khalili and K. Kromp (1991), in "Statistical Properties of Weibull Estimators",
compared three different approaches to estimating the Weibull parameters, namely
linear regression, the method of moments, and the maximum likelihood method. The
last of these was shown to be the most appropriate approach over the whole range of
sample sizes from 4 to 100. The data were produced by Monte Carlo simulations. In
each simulation, a set of estimators was produced, and histograms of the estimators
were created, which show the asymmetry of the distribution of the Weibull modulus.
The density functions were used directly to determine confidence intervals for the
estimated Weibull moduli. Furthermore, it was reaffirmed that a minimum of 30
samples is required for a good characterization.

The second study reviewed is "Using Weibull Analysis for Evaluation of Cost and Schedule
Performance", from the Journal of Construction Engineering and Management (Vol. 131,
2005). The paper notes that large amounts of money are lost each year in the construction
industry because of poor schedule and cost control, and it presents a statistical approach,
namely Weibull analysis. The applicability of Weibull analysis for evaluating and comparing
the reliability of the schedule performance of multiple projects is presented. The various
steps in the analysis are discussed along with an example in which two projects are analyzed
and compared. The authors conclude that Weibull analysis has several advantages and
provides a relatively robust and effective method for construction managers to better control
and monitor their projects.

2.2 HISTORICAL BRIEF OF THE WEIBULL PROBABILITY DISTRIBUTION

Ernst Hjalmar Waloddi Weibull (18 June 1887-12 October 1979) was a Swedish
engineer, scientist, and mathematician.

Weibull came from a family that had immigrated to Sweden in the 18th century
from Schleswig-Holstein, a region at the border between Denmark and Germany.

Weibull obtained his doctorate from the University of Uppsala in 1932. He was
employed in Swedish and German industry as an inventor and a consulting
engineer.

In 1914, Weibull wrote his first paper on the propagation of explosive waves. He
developed the technique of using explosive charges to determine the type of ocean
bottom sediments and their thickness. The same technique is still used today in
offshore oil exploration.

In 1939 he published his paper on Weibull distribution in probability theory and


statistics. In 1941 he received a personal research professorship in Technical
Physics at the Royal Institute of Technology in Stockholm from the arms producer
Bofors.

Dr. Weibull published many papers on the strength of materials, fatigue, rupture in
solids, bearings and, of course, the Weibull distribution, as well as one book on
fatigue analysis in 1961. Twenty-seven of these papers were reports to the US
Air Force at Wright Field on Weibull analysis.

In 1951 he presented his most famous paper to the ASME on the Weibull distribution,
using seven case studies.

The American Society of Mechanical Engineers awarded Dr. Weibull their gold
medal in 1972. The Great Gold medal from the Royal Swedish Academy of
Engineering Sciences was personally presented to him by King Carl XVI Gustaf of
Sweden in 1978.

Waloddi Weibull died on October 12, 1979 in Annecy, France.

3.0 INTRODUCTION

This section presents a detailed examination of the mathematical properties of point
estimators, particularly the method of maximum likelihood, its importance, and an
example of this method of estimation.

3.1 MAXIMUM LIKELIHOOD ESTIMATION (MLE)

Maximum likelihood estimation (MLE) is a popular statistical method used to
fit a mathematical model to data. Modeling real-world data by maximum
likelihood offers a way of tuning the free parameters of the model to provide an
optimum fit.

The method was pioneered by the geneticist and statistician Sir R. A. Fisher between
1912 and 1922. It has widespread applications in various fields, including:

 linear models and generalized linear models;


 exploratory and confirmatory factor analysis;

 structural equation modeling;

 psychometrics and econometrics;

 time-delay of arrival (TDOA) in acoustic or electromagnetic detection;

 data modeling in nuclear and particle physics;

 origin/destination and path-choice modeling in transport networks;

 Many situations in the context of hypothesis testing and confidence interval


formation.

Loosely speaking, for a fixed set of data and an underlying probability model,
maximum likelihood picks the values of the model parameters that make the data
"more likely" than any other values of the parameters would make them; if a
uniform prior distribution is assumed over the parameters, these coincide with the
most probable values of the parameters. Maximum likelihood estimation gives a unique
solution in the case of the normal distribution, although in more complex problems
this may not be the case.

3.1.1 METHODS/PRINCIPLES OF MAXIMUM LIKELIHOOD


ESTIMATION (MLE)

Commonly, one assumes that the data drawn from a particular distribution are
independent, identically distributed (iid) with unknown parameters. This
considerably simplifies the problem because the likelihood can then be written as a
product of n univariate probability densities:

If x is a continuous random variable with pdf

f(x; θ₁, θ₂, …, θ_k),

where θ₁, θ₂, …, θ_k are k unknown constant parameters that need to be estimated,
conduct an experiment and obtain N independent observations x₁, x₂, …, x_N, which
correspond in the case of life data analysis to failure times. The likelihood function
(for complete data) is given by:

L(θ₁, …, θ_k; x₁, …, x_N) = ∏_{i=1}^{N} f(x_i; θ₁, …, θ_k)

The logarithmic likelihood function is:

Λ = ln L = Σ_{i=1}^{N} ln f(x_i; θ₁, …, θ_k)

The maximum likelihood estimators (MLE) of θ₁, θ₂, …, θ_k are obtained by
maximizing L or Λ.

By maximizing Λ, which is much easier to work with than L, the maximum
likelihood estimators (MLE) of θ₁, θ₂, …, θ_k are obtained as the simultaneous
solutions of the k equations

∂Λ/∂θ_j = 0,   j = 1, 2, …, k.
This contrasts with seeking an unbiased estimator of θ, which may not necessarily
yield the MLE but which will yield a value that (on average) will neither tend to
over-estimate nor under-estimate the true value of θ.
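As a minimal sketch of this principle, the log-likelihood of an assumed 2-parameter Weibull model can be maximized numerically over both parameters at once with a general-purpose optimizer; the data values are the failure times used later in Table 4.1, and the optimizer choice is illustrative:

import numpy as np
from scipy.optimize import minimize

def neg_log_likelihood(params, data):
    # Negative log-likelihood of an assumed 2-parameter Weibull: -Λ = -Σ ln f(x_i; k, λ)
    k, lam = params
    if k <= 0 or lam <= 0:
        return np.inf
    z = data / lam
    return -np.sum(np.log(k / lam) + (k - 1) * np.log(z) - z ** k)

data = np.array([16.0, 34.0, 53.0, 75.0, 93.0, 120.0])   # failure times from Table 4.1
result = minimize(neg_log_likelihood, x0=[1.0, np.mean(data)],
                  args=(data,), method="Nelder-Mead")
print(result.x)   # approximate joint MLEs of (k, λ)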

Even though it is common practice to plot the MLE solutions using median ranks
(points are plotted according to median ranks and the line according to the MLE
solutions), this is not completely accurate. As can be seen from the equations
above, the MLE method is independent of any kind of ranks. For this reason, the
MLE solution often appears not to track the data on the probability plot. This is
perfectly acceptable, since the two methods are independent of each other, and it in
no way suggests that the solution is wrong.

Note that the maximum likelihood estimator may not be unique, or indeed may not
even exist in a nice closed form.

3.1.2 IMPORTANCE OF MAXIMUM LIKELIHOOD ESTIMATION (MLE)

We have seen that the MLE depends on the sample observations only through the
value of a sufficient statistic.

To show this, we need only observe that, if U is a sufficient statistic for θ, the
factorization criterion implies that the likelihood can be factored as:

L(θ) = L(x₁, x₂, …, x_n; θ) = g(U, θ) · h(x₁, x₂, …, x_n)

where g(U, θ) is a function of U and θ only and h(x₁, x₂, …, x_n) does not depend on
θ. It then follows that

ln[L(θ)] = ln[g(U, θ)] + ln[h(x₁, x₂, …, x_n)].

Maximizing ln[L(θ)] with respect to θ is therefore equivalent to maximizing
ln[g(U, θ)] with respect to θ, and ln[g(U, θ)] depends on the data only through the
value of the sufficient statistic U.

It follows that the MLE depends on the data only through the value of U; that is, if U is any
sufficient statistic for estimating θ, the maximum likelihood estimator (MLE) is
always some function of U.

This makes the method of maximum likelihood a very useful tool in finding
estimators with good properties.
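As a concrete illustration, consider the 2-parameter Weibull with known shape k; the factorization reads

L(λ) = ∏_{i=1}^{n} (k/λ) (x_i/λ)^(k−1) exp[−(x_i/λ)^k]
     = [ kⁿ ∏_{i=1}^{n} x_i^(k−1) ] · λ^(−nk) exp[ −λ^(−k) Σ_{i=1}^{n} x_i^k ]
     = h(x₁, …, x_n) · g(U, λ),   with U = Σ_{i=1}^{n} x_i^k,

so the maximum likelihood estimator of λ is a function of the sufficient statistic Σ x_i^k. This factorization reappears in Section 4.2.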

4.0 INTRODUCTION

Recall that the objective of statistics often is to make inferences about unknown
population parameters based on information contained in some data.

In this section we consider the general topic of estimation of the population


parameter in a Weibull probability distribution function by the method of
Maximum Likelihood Estimation (MLE).

4.1 WEIBULL ANALYSIS.

4.1.1 WEIBULL ANALYSIS (LIFE DATA ANALYSIS)

In life data analysis (also called "Weibull analysis"), the practitioner attempts to
make predictions about the life of all products in the population by "fitting" a
statistical distribution to life data from a representative sample of units. The
parameterized distribution for the data set can then be used to estimate important
life characteristics of the product such as reliability or probability of failure at a
specific time, the mean life for the product and failure rate. Life data analysis
requires the practitioner to:

 Gather life data for the product.


 Select a lifetime distribution that will fit the data and model the life of the
product.

 Estimate the parameters that will fit the distribution to the data.

4.1.2 Life Data
The term life data refers to measurements of the life of products. Product lifetimes
can be measured in hours, miles, cycles or any other metric that applies to the
period of successful operation of a particular product. Since time is a common
measure of life, life data points are often called "times-to-failure" and product life
is usually described in terms of time. There are different types of life data and
because each type provides different information about the life of the product, the
analysis method will vary depending on the data type. With complete data, the
exact time-to-failure for the unit is known (e.g. the unit failed at 100 hours of
operation). With suspended or right censored data, the unit operated successfully
for a known period of time and then continued (or could have continued) to operate
for an additional unknown period of time (e.g. the unit was still operating at 100
hours of operation). With interval and left censored data, the exact time-to-failure
is unknown but it falls within a known time range. For example, the unit failed
between 100 hours and 150 hours (interval censored) or between 0 hours and 100
hours (left censored).

4.1.3 Lifetime Distributions


Statistical distributions have been formulated by statisticians, mathematicians and
engineers to mathematically model or represent certain behavior. The probability
density function (pdf) is a mathematical function that describes the distribution.

Some distributions, like the Weibull and lognormal, tend to better represent life
data and are commonly called lifetime distributions or life distributions. In fact,
life data analysis is sometimes called "Weibull analysis" because the Weibull
distribution, formulated by Professor Waloddi Weibull, is a popular distribution for
analyzing life data. The Weibull distribution can be applied in a variety of forms
(including 1-parameter, 2-parameter, 3-parameter or mixed Weibull) and other
common life distributions include the exponential, lognormal and normal
distributions. The analyst chooses the life distribution that is most appropriate to
each particular data set based on past experience and goodness of fit tests.

4.1.4 Parameter Estimation


In order to "fit" a statistical model to a life data set, the analyst estimates the
parameters of the life distribution that will make the function most closely fit the
data. The parameters control the scale, shape and location of the pdf function. The
scale parameter, λ (lambda), defines where the bulk of the distribution lies. The
shape parameter, k, defines the shape of the distribution and the location parameter,
θ (theta), defines the location of the distribution in time.

Several methods have been devised to estimate the parameters that will fit a
lifetime distribution to a particular data set. Some available parameter estimation
methods include: probability plotting, rank regression on x (RRX), rank regression
on y (RRY) and maximum likelihood estimation (MLE).
In the course of this project work, the method of estimation used for the parameter
estimation of the Weibull distribution is maximum likelihood estimation (MLE).

4.2 ANALYSIS: USING MAXIMUM LIKELIHOOD ESTIMATION TO CALCULATE
THE PARAMETERS OF THE WEIBULL PROBABILITY DISTRIBUTION

As outlined under the method of maximum likelihood estimation, the procedure involves
taking the partial derivatives of the likelihood function with respect to the parameters,
setting the resulting equations equal to zero, and solving them simultaneously to
determine the values of the parameter estimates.
Another method of finding the parameter estimates works by developing a
likelihood function based on the available data and using iterative methods to
determine the values of the parameter estimates that maximize the likelihood
function, but this can be rather difficult and time consuming, particularly when
dealing with the three parameter Weibull distribution.

Therefore, in this section an attempt is made to derive the MLE of the 2-parameter
Weibull distribution, using the Newton-Raphson iterative method on the available data
to determine the values of the parameter estimates that maximize the likelihood
function, taking into account the attractive properties of the MLE and the
characteristics of the Weibull distribution.

Let X₁, X₂, …, X_n denote the data from a random sample, and assume that the X_i are
independent random variables, all following the 2-parameter Weibull distribution with
the same parameters, with density:

f(x_i; k, λ) = (k/λ) (x_i/λ)^(k−1) exp[−(x_i/λ)^k]

Where: x_i > 0 = time to failure

k = shape parameter

λ = scale parameter.

Their joint probability distribution is given as:

f(x₁, x₂, …, x_n; k, λ) = ∏_{i=1}^{n} (k/λ) (x_i/λ)^(k−1) exp[−(x_i/λ)^k]          (4.1)

The log-likelihood function is given as:

Λ(k, λ) = ln L(k, λ; x₁, …, x_n) = n ln k − nk ln λ + (k − 1) Σ_{i=1}^{n} ln x_i − Σ_{i=1}^{n} (x_i/λ)^k          (4.2)

To maximize this function we require the derivative with respect to λ; that is,

∂Λ/∂λ = U(λ) = −nk/λ + k λ^(−(k+1)) Σ_{i=1}^{n} x_i^k          (4.3)

Equation 4.1 above can also be written as:

f(x₁, …, x_n; λ) = kⁿ (∏_{i=1}^{n} x_i^(k−1)) · exp[ −λ^(−k) Σ_{i=1}^{n} x_i^k − nk ln λ ]

that is, in the exponential-family form

f(x₁, …, x_n; λ) = h(x₁, …, x_n) · exp[ a(λ) T(x) + b(λ) ]

with sufficient statistic T(x) = Σ x_i^k and natural parameter a(λ) = −λ^(−k),
where k is treated as a nuisance parameter.

From the above we can say that the 2-parameter Weibull distribution belongs to the
exponential family of distributions; hence it satisfies some properties associated with
a good parameter estimate, namely completeness, sufficiency, consistency,
efficiency and unbiasedness.

The maximum likelihood estimator λ̂ is the solution of the equation U(λ) = 0, where

U(λ) = ∂Λ/∂λ = −nk/λ + k λ^(−(k+1)) Σ_{i=1}^{n} x_i^k          (4.4)

In this case it is easy to find an explicit expression for λ̂ if k is a known constant,
but with the illustrative example given below we will obtain a numerical solution using
the Newton-Raphson approximation method.
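For reference, setting U(λ) = 0 in (4.4) with k known gives the explicit solution

λ̂ = [ (1/n) Σ_{i=1}^{n} x_i^k ]^(1/k),

which provides a useful check on the iterative solution obtained below.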

4.3 NEWTON-RAPHSON APPROXIMATION METHOD.

Suppose we wish to find the value of x at which a function t(x) crosses the x-axis, that
is, where t(x) = 0.

By the principle of the Newton-Raphson algorithm, the slope t′ of t at a value x^(m−1) is
given approximately by

t′(x^(m−1)) ≈ [ t(x^(m)) − t(x^(m−1)) ] / [ x^(m) − x^(m−1) ]          (4.5)

where the distance x^(m) − x^(m−1) is small. If x^(m) is the required solution, so that
t(x^(m)) = 0, then equation (4.5) can be re-arranged to give

x^(m) = x^(m−1) − t(x^(m−1)) / t′(x^(m−1))          (4.6)

This is the Newton-Raphson formula for solving t(x) = 0. Starting with an initial guess
x^(1), successive approximations are obtained from equation (4.6) until the iterative
process converges.
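A minimal sketch of this iteration in code (the function t and the starting value below are arbitrary examples chosen for illustration):

def newton_raphson(t, t_prime, x0, tol=1e-8, max_iter=50):
    # Iterate x(m) = x(m-1) - t(x(m-1)) / t'(x(m-1)) until the step is negligible.
    x = x0
    for _ in range(max_iter):
        step = t(x) / t_prime(x)
        x = x - step
        if abs(step) < tol:
            break
    return x

# Example: solve t(x) = x**2 - 2 = 0 starting from x = 1
root = newton_raphson(lambda x: x**2 - 2, lambda x: 2*x, 1.0)
print(root)   # approximately 1.41421356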

From the maximum likelihood estimation of the 2-parameter Weibull, the estimating
equation equivalent to (4.6) is

λ^(m) = λ^(m−1) − U(λ^(m−1)) / U′(λ^(m−1))          (4.7)

From (4.4),

U(λ) = −nk/λ + k λ^(−(k+1)) Σ_{i=1}^{n} x_i^k,

which is evaluated at the successive estimates λ^(m), starting with an initial guess λ^(1).
The derivative of U, obtained by differentiating (4.3), is

U′(λ) = dU/dλ = nk/λ² − k(k + 1) λ^(−(k+2)) Σ_{i=1}^{n} x_i^k          (4.8)

4.4 NUMERICAL EXAMPLE TO ILLUSTRATE THE PRINCIPLE.

Given a complete time-to-failure dataset, the table below contains the data from a life
test on n = 6 shock absorbers:

TIME-TO-FAILURE (x, hrs)        x^1.9

16                              194.0117

34                              812.4748

53                              1888.5311

75                              3652.7202

93                              5496.8970

120                             8921.6333

TOTAL                           20966.2681

TABLE 4.1: LIFETIME TEST OF SHOCK ABSORBERS.

[Figure 4.1, "Fitted Weibull Distribution": frequency histogram of the times-to-failure
(vertical axis: percentage; horizontal axis: TIME-TO-FAILURE on a logarithmic scale from
1 to 1000 hours) with the fitted Weibull density superimposed.]

Figure 4.1
This graph shows a frequency histogram for the time-to-failure data (Col_1). In this plot,
4 intervals have been formed, ranging from a lower limit of 0.0 to an upper limit of 150.0.
The number of data values in each interval has then been tabulated, and the plot shows
these frequencies. In addition, the probability density function of the fitted Weibull
distribution has been superimposed on the histogram. If the distribution fits well, the
tops of the bars should be relatively close to the line.

Figure 4.2
The Weibull plot is designed to help determine whether the data can be reasonably
modeled by a Weibull distribution. The data values are plotted along the horizontal
axis, using a logarithmic scale. Along the vertical axis are the median ranks
corresponding to each data value. If the data values are well described by a Weibull
distribution, then the plotted points should fall approximately along a straight line.

The plot also includes a line to help you judge whether or not the Weibull
distribution is appropriate. The intercept and slope of this line are based on the
shape and scale parameters as estimated by the method of maximum likelihood.

From Table 4.1 above we obtain Fig. 4.1, which shows the shape of the distribution,
and Fig. 4.2, which is the probability plot of the given data; both suggest that the
Weibull distribution provides a good model for the data. We then use a Weibull
distribution with the shape parameter fixed at k = 1.9, and the value of the scale
parameter is estimated by the Newton-Raphson approximation method, as
follows:

From the maximum likelihood estimation given above, the estimating equation is

λ^(m) = λ^(m−1) − U(λ^(m−1)) / U′(λ^(m−1)),   where

U(λ) = −nk/λ + k λ^(−(k+1)) Σ_{i=1}^{n} x_i^k,   and

U′(λ) = nk/λ² − k(k + 1) λ^(−(k+2)) Σ_{i=1}^{n} x_i^k.

With an initial guess of λ^(1) = 69.00, k = 1.9 and n = 6, we have:

For m = 2,

λ^(2) = λ^(1) − U(λ^(1)) / U′(λ^(1)),

with U(λ^(1)) and U′(λ^(1)) evaluated at λ^(1) = 69.00 using Σ x_i^1.9 = 20966.2681 from
Table 4.1.

Therefore, evaluating U at the successive estimates λ^(m) in the same way, we obtain the
table below:

m       U(λ^(m))                     λ^(m)
1       -                            69.00
…       …                            …
4       4.19158413 × 10^(-5)         …
5       -9.25468 × 10^(-8)           73.2709

From the above we can conclude that at m = 5 the iterative process converges.
Thus the values of the parameters that maximize the likelihood function are
k = 1.9 (shape parameter) and λ = 73.2709 (scale parameter).
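The iteration above can be reproduced with a short sketch that also checks the result against the closed-form solution λ̂ = (Σ x_i^k / n)^(1/k) noted in Section 4.2; the variable names are illustrative:

data = [16.0, 34.0, 53.0, 75.0, 93.0, 120.0]   # times-to-failure from Table 4.1
k = 1.9                                        # shape parameter, held fixed
n = len(data)
s = sum(x ** k for x in data)                  # Σ x_i^k ≈ 20966.2681

def U(lam):
    # Score function (4.4): dΛ/dλ
    return -n * k / lam + k * lam ** (-(k + 1)) * s

def U_prime(lam):
    # Derivative of U, equation (4.8)
    return n * k / lam ** 2 - k * (k + 1) * lam ** (-(k + 2)) * s

lam = 69.0                                     # initial guess λ(1)
for m in range(2, 10):
    lam = lam - U(lam) / U_prime(lam)          # Newton-Raphson update (4.7)
    if abs(U(lam)) < 1e-7:
        break

print(lam)                  # converges to about 73.27
print((s / n) ** (1 / k))   # closed-form check: 73.2709...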

4.5 INTERPRETATION OF RESULTS

Some statistical computer software attempts to find a solution in all of the regions
of the given data using a variety of methods, but the user should be forewarned that
not all possible data can be addressed. Thus, some solutions using MLE for the
three-parameter Weibull will fail when the algorithm has reached predefined limits
or fails to converge.

Aside from making use of software, it should be pointed out that the solution to the
three-parameter Weibull distribution via MLE is not always stable and can collapse
if k ≈ 1. In estimating the true MLE of the three-parameter Weibull distribution,
two difficulties arise: the first is a problem of "non-regularity" and the second is
the "parameter divergence problem".

Non-regularity occurs when k ≤ 2. In general, it should be noted that there are no
MLE solutions in the region 0 < k < 1. When 1 < k < 2, the MLE solutions
exist but are not asymptotically normal [the MLE is asymptotically normal if, as the
number of samples increases, the distribution of the MLE tends to the Gaussian
distribution]. In the case of non-regularity, the solution is treated anomalously.

For the 2-parameter Weibull, the likelihood function is formed from the available
data, and the Newton-Raphson iterative method is then used to determine the values
of the parameter estimates that maximize the likelihood function.

4.6 REFERENCES CITED AND BIBLIOGRAPHY

 Dobson, A. J. (2001). An Introduction to Generalized Linear Models. 2nd
edition. Chapman & Hall/CRC.
 Bain, L. J. (1978). Statistical Analysis of Reliability and Life-Testing Models.
Marcel Dekker: New York.

 Bain, L. J. and Engelhardt, M. (1989). Introduction to Probability and
Mathematical Statistics. PWS-Kent: Boston, MA.

 Balkema, A., and Laurens de Haan (1974). Residual life time at great age,
Annals of Probability, 2, 792-804.

 Barlow, R. E., & Proschan, F. (1975). Statistical theory of reliability and life
testing. Holt, Rinehart, & Winston: New York.

 Dodson, B. (1994). Weibull analysis. ASQC: Milwaukee, Wisconsin.

 Embrechts, P., Klüppelberg, C., and Mikosch, T. (1997). Modelling Extremal
Events for Insurance and Finance. Springer-Verlag: Berlin.

 Gumbel, E.J. (1958). Statistics of Extremes. Columbia University Press.

 Harville, D. A. (1977). Maximum likelihood approaches to variance


component estimation and to related problems. Journal of the American
Statistical Association, 72, 320-340.

 Hogg, R. V., & Craig, A. T. (1970). Introduction to mathematical statistics.


Macmillan: New York.

 Kasumu, R. B. (2002). Elements of statistical inference. Jas Publishers.

 Lee, E. T. (1980). Statistical Methods for Survival Data Analysis. Lifetime
Learning Publications: Belmont, CA.

 Lieblein, J. (1955). On moments of order statistics from the Weibull


distribution. Annals of Mathematical Statistics, 26, 330-333.

 Meeker, W. Q., and Escobar, L. A. (1998). Statistical Methods for Reliability
Data. John Wiley & Sons, Inc.: New York.

 Pickands, J. (1975). Statistical inference using extreme order statistics,


Annals of Statistics, 3, 119-131.

 The User’s Guide to STATGRAPHICS Centurion XV.


 Wilks, S. S. (1946). Mathematical statistics. Princeton University Press.
Princeton, NJ.
