
INDIAN NUCLEAR SOCIETY

13th Annual Conference – INSAC 2002


October 9-11, 2002, Mumbai.

Genetic Algorithms to Correct for Instrumental Instabilities in Impurity
Estimation by Spectrochemical Analysis

S.V.G. Ravindranath
Spectroscopy Division
svgr@apsara.barc.ernet.in

and

A.P. Tiwari
Reactor Control Division
aptiwari@apsara.barc.ernet.in

Bhabha Atomic Research Centre, Trombay, Mumbai – 400 085


SUMMARY
Genetic algorithms (GAs) are increasingly used to solve complex problems in science,
engineering, business and the social sciences. GAs are population-based parallel search
strategies built on the Darwinian principle of biological evolution. A GA starts with a
random initial population of candidate solutions, called a generation. To produce the
next generation, the individual solutions are evaluated and selected according to their
fitness, and are then transformed with genetically inspired operators such as crossover
and mutation. By repeating this cycle of evaluation, selection, crossover and mutation,
the GA is likely to find a solution with a higher fitness value. This paper describes the
application of genetic algorithms to spectrochemical analysis, where they improve the
estimation of impurities by correcting for instrumental instability. The method has been
implemented in MATLAB.

1.0 INTRODUCTION
Spectrochemical methods using instruments such as inductively coupled plasma
atomic emission spectrometers (ICP-AES) determine trace-level concentrations of
impurities in a given sample. The accuracy of these determinations is limited by two
factors, viz. spectral interference and instrumental instability. Spectral interference
occurs when the analyte line is overlapped by neighbouring non-analyte lines.
Chemometric methods such as Kalman filters are available to tackle interference
problems. Spectrometer instability over time causes a spectral shift between the pure
component scans and the sample scan. Shifts exceeding 0.1 picometre (pm) affect the
accuracy of impurity determination. Simplex methods are available to correct for
instrumental instabilities, but they are sensitive to initial guesses and become more
complex as the number of parameters grows. This paper describes the application of a GA
to correct for instrumental instabilities. The approach has been tested with data
available in the literature. The following sections give a brief introduction to GAs,
describe the problem of instrumental instability in the estimation of Cd in As, and
present the GA implementation and the results.

2.0 GENETIC ALGORITHMS


Genetic algorithms, introduced by Holland, comprise a randomly generated initial
population of solutions and biologically inspired operators such as selection, crossover
and mutation [1]. A typical GA cycle consists of the following steps:
• Creation of population strings
• Evaluation of each string
• Selection of best strings
• Genetic manipulation to create new population of strings
The population comprises a group of potential solutions called chromosomes. The initial
population is generated randomly. A chromosome is usually expressed as a string of
variables, each element of which is called a gene. The variables can be represented in
binary, real or other forms, and their range is usually specified by the problem.
Bit-string encoding is the classical approach, although several researchers have recently
used other types of representation.
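As an illustration of the bit-string representation described above, the following minimal Python sketch generates a random initial population. The function name `random_population`, the population size and the string length are hypothetical choices made for the example, not values taken from this work.

```python
import random

def random_population(pop_size, n_bits, rng):
    """Return pop_size chromosomes, each a list of n_bits random 0/1 genes."""
    return [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]

rng = random.Random(0)  # fixed seed only so that the sketch is repeatable
population = random_population(pop_size=6, n_bits=8, rng=rng)
for chromosome in population:
    print("".join(map(str, chromosome)))
```

Each printed line is one chromosome; each character is one gene.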
The fitness function provides the mechanism for evaluating each chromosome as a
potential solution. The fitness values of all chromosomes are obtained by evaluating the
fitness function on the decoded chromosome, subject to the constraints imposed by the
function.
The selection operator emulates nature’s policy of survival of the fittest. Based on the
fitness values, it selects the parents for the mating process. There are many ways to
achieve effective selection, such as ranking, tournament and roulette wheel selection,
but the essential idea is to give preference to fitter individuals.
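Roulette wheel selection can be sketched as follows. This is a minimal Python illustration under the assumption that all fitness values are non-negative and at least one is positive; the function name `roulette_select` and the example fitness values are hypothetical.

```python
import bisect
import itertools
import random

def roulette_select(fitness, n_select, rng):
    """Pick n_select indices with probability proportional to fitness.

    Assumes every fitness value is non-negative and at least one is positive.
    """
    total = sum(fitness)
    # Cumulative probabilities: each individual owns a slice of [0, 1).
    cumulative = list(itertools.accumulate(f / total for f in fitness))
    cumulative[-1] = 1.0  # guard against floating-point round-off
    picks = []
    for _ in range(n_select):
        r = rng.random()  # uniform in [0, 1)
        picks.append(bisect.bisect_right(cumulative, r))
    return picks

fitness = [1.0, 3.0, 6.0]  # hypothetical fitness values
picks = roulette_select(fitness, 10000, random.Random(1))
counts = [picks.count(i) for i in range(len(fitness))]
print(counts)  # fitter individuals are expected to be drawn more often
```

An individual with zero fitness owns an empty slice of the wheel and is never drawn, which is one reason the fitness is usually rescaled before selection.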
The crossover and mutation operators produce a new population of individuals by
manipulating the genetic information, referred to as genes, possessed by members of the
current generation. The crossover operator combines subparts of two parent chromosomes
to produce offspring that contain subparts of both parents’ genetic material. The length
of the subparts is chosen randomly. After crossover, the mutation operator changes a
chromosome by changing the value of the bit at a randomly selected position. The
crossover and mutation operators are applied with probabilities Pc and Pm respectively,
and generally Pm < Pc.
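The two operators can be sketched in Python as below. One-point crossover and independent bit flipping are assumed here as common variants; the text above does not fix a particular variant, and the function names are hypothetical.

```python
import random

def one_point_crossover(parent_a, parent_b, rng):
    """Swap the tails of two equal-length bit lists at a random cut point."""
    cut = rng.randint(1, len(parent_a) - 1)  # cut lies strictly inside the string
    child_a = parent_a[:cut] + parent_b[cut:]
    child_b = parent_b[:cut] + parent_a[cut:]
    return child_a, child_b

def mutate(chromosome, pm, rng):
    """Flip each bit independently with probability pm."""
    return [(1 - bit) if rng.random() < pm else bit for bit in chromosome]

rng = random.Random(0)
a, b = [0] * 8, [1] * 8
child_a, child_b = one_point_crossover(a, b, rng)
print(child_a, child_b)
print(mutate(child_a, pm=0.1, rng=rng))
```

With all-zero and all-one parents the cut point is directly visible in the children, which makes the sketch easy to check by eye.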

3.0 INSTRUMENTAL INSTABILITY


Instruments used in spectrochemical analysis, such as sequential ICP-AES, contain a
scanning monochromator. Scanning monochromators are subject to drift, and drifts above
0.1 pm affect the accuracy with which impurities in a sample are determined. In
spectrochemical analysis the instrument is calibrated with standards containing known
analyte concentrations, which are plotted against spectral line intensity. The analyte
concentration in a sample is then determined from the calibration curve and the measured
intensity. Because of spectrometer instability over time, the standard and sample scans,
which are recorded sequentially, may be shifted with respect to each other by an unknown
amount.
Van Veen et al. [2] took up the classical case of Cd interfered with by As and developed
a program that solves the interference problem with the Kalman filter technique and the
drift problem by optimizing the peak distance with a version of the simplex method. The
spectral drift problem was solved by optimizing the peak distance between the spectral
lines of Cd and As with reference to the sample scan in the spectral window at
228.802 nm. The sample
contains both Cd and As. In this paper the peak distance optimization is carried out
with genetic algorithms, using the spectral data given in reference [2].
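To make the notion of a relative shift between scans concrete, the sketch below moves a recorded scan by a possibly fractional number of scan steps. Linear interpolation and holding the edge values constant are assumptions made for this illustration; the function name `shift_scan` and the sample data are hypothetical.

```python
def shift_scan(scan, shift):
    """Return the scan moved right by `shift` steps (may be fractional).

    Uses linear interpolation between neighbouring points; positions that
    fall outside the recorded window take the nearest edge value.
    """
    n = len(scan)
    shifted = []
    for i in range(n):
        x = min(max(i - shift, 0.0), n - 1.0)  # source position, clamped
        lo = int(x)                            # floor, since x >= 0
        hi = min(lo + 1, n - 1)
        frac = x - lo
        shifted.append((1.0 - frac) * scan[lo] + frac * scan[hi])
    return shifted

scan = [0, 0, 1, 4, 1, 0, 0]          # hypothetical peak centred at index 3
print(shift_scan(scan, 1))            # peak moves one step to the right
print(shift_scan(scan, 0.5))          # half-step shift, interpolated
```

A negative `shift` moves the scan to the left, so a single signed parameter per scan pair is enough to describe the drift.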

4.0 IMPLEMENTATION
Two parameters need to be optimized: the peak distance between the Cd and As scans and
the peak distance between the Cd and sample scans. These are denoted d1 and d2. The
scans were recorded with a scan step of 1.5 pm, and the maximum possible drift for each
parameter is taken as ± 5 steps. The implementation follows the steps and details given
by Michalewicz [3]. To represent each parameter as a binary string with a precision of
three decimal places, 14 bits are required per parameter, so each chromosome is a string
of 28 bits. A population of 30 chromosomes is generated randomly. To determine the
fitness of a chromosome, its genes are decoded into d1 and d2 and passed to a fitness
function. The fitness function first applies the required shifts and reconstructs the
best possible sample spectrum using Kalman filtering. It then returns, as the fitness
value of that chromosome, the square root of the sum of the squared differences between
corresponding data points in the constructed and original sample scans. Since a smaller
residual indicates a better fit, the value returned by the fitness function is suitably
modified so that the optimization can be treated as a maximization problem. All members
of the population are evaluated and their cumulative probabilities are computed.
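The encoding arithmetic above can be checked with a short Python sketch. The bit-length rule and the linear decoding follow the usual binary mapping described by Michalewicz [3]. The conversion 1/(1 + r) from residual to fitness is only an assumed example of a "suitable modification", and the Kalman filter reconstruction itself is not reproduced here. All function names are hypothetical.

```python
import math

LOW, HIGH = -5.0, 5.0   # drift range in scan steps, from the text
DECIMALS = 3            # required precision, from the text

def bits_needed(low, high, decimals):
    """Smallest m such that 2**m - 1 >= (high - low) * 10**decimals."""
    return round((high - low) * 10**decimals).bit_length()

def decode(bits, low, high):
    """Map a list of 0/1 genes linearly onto the interval [low, high]."""
    value = int("".join(map(str, bits)), 2)
    return low + value * (high - low) / (2**len(bits) - 1)

def residual_norm(constructed, observed):
    """Square root of the sum of squared point-by-point differences."""
    return math.sqrt(sum((c - o) ** 2 for c, o in zip(constructed, observed)))

def to_fitness(residual):
    """Assumed transform: a smaller residual gives a larger fitness."""
    return 1.0 / (1.0 + residual)

m = bits_needed(LOW, HIGH, DECIMALS)
print(m)                                # bits per parameter
print(decode([0] * m, LOW, HIGH))       # lower end of the range
print(decode([1] * m, LOW, HIGH))       # upper end of the range
print(to_fitness(residual_norm([0, 0], [3, 4])))
```

The range of 10 steps at three decimal places needs 10 000 distinguishable values, and 2^13 < 10 000 <= 2^14 - 1, which gives the 14 bits per parameter quoted above.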
The roulette wheel selection mechanism is applied to select the population for the next
generation. The crossover operator is then applied to the new population with Pc = 0.25,
which means that 25% of the chromosomes are expected to undergo crossover. The
population next undergoes mutation with Pm = 0.01, which means that on average 1% of the
bits in the population are mutated. The resulting new generation of chromosomes is then
ready for further evolution. This process is repeated for 100 generations. In each
generation, the concentrations of Cd and As in the sample are computed for the best
chromosome, and the best chromosome is passed on to the next generation.
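The generation loop described above can be sketched in Python as follows. The population size, string length, Pc, Pm and number of generations are taken from the text. The fitness function is a toy stand-in (distance to a hypothetical target drift `TARGET`) rather than the Kalman filter residual, and placing the elite chromosome in slot 0 is one assumed way of passing the best chromosome on. The original implementation was written in MATLAB; this is not that code.

```python
import math
import random

N_BITS, POP_SIZE, GENERATIONS = 14, 30, 100   # values from the text
PC, PM = 0.25, 0.01                           # values from the text
LOW, HIGH = -5.0, 5.0                         # drift range in scan steps
TARGET = (1.2, -0.7)                          # hypothetical "true" drifts

def decode(bits):
    value = int("".join(map(str, bits)), 2)
    return LOW + value * (HIGH - LOW) / (2**len(bits) - 1)

def fitness(chrom):
    """Toy stand-in for the spectral residual: closeness to TARGET."""
    d1, d2 = decode(chrom[:N_BITS]), decode(chrom[N_BITS:])
    return 1.0 / (1.0 + math.hypot(d1 - TARGET[0], d2 - TARGET[1]))

def next_generation(pop, rng):
    scores = [fitness(c) for c in pop]
    elite = pop[scores.index(max(scores))]
    # rng.choices draws in proportion to the weights, i.e. a roulette wheel.
    new_pop = [list(c) for c in rng.choices(pop, weights=scores, k=len(pop))]
    # Each chromosome joins the crossover pool with probability PC.
    pool = [i for i in range(len(new_pop)) if rng.random() < PC]
    for i, j in zip(pool[0::2], pool[1::2]):  # an odd leftover stays unchanged
        cut = rng.randint(1, 2 * N_BITS - 1)
        new_pop[i][cut:], new_pop[j][cut:] = new_pop[j][cut:], new_pop[i][cut:]
    # Flip every bit independently with probability PM.
    for chrom in new_pop:
        for k in range(len(chrom)):
            if rng.random() < PM:
                chrom[k] ^= 1
    new_pop[0] = list(elite)  # elitism: the best chromosome survives unchanged
    return new_pop

def run(seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(2 * N_BITS)] for _ in range(POP_SIZE)]
    history = []
    for _ in range(GENERATIONS):
        pop = next_generation(pop, rng)
        history.append(max(fitness(c) for c in pop))
    best = max(pop, key=fitness)
    return decode(best[:N_BITS]), decode(best[N_BITS:]), history

d1, d2, history = run()
print(f"best d1={d1:.3f}, d2={d2:.3f}, fitness={history[-1]:.4f}")
```

Because the elite chromosome is always carried over, the best fitness recorded in `history` cannot decrease from one generation to the next.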

5.0 RESULTS
The Cd spectrum constructed with the fitted solution parameters and the observed Cd
spectrum are shown in Fig.1, which gives a visual indication of how close the two
spectra are. The fitted spectrum is generated by subtracting the spectrum of As and the
background information computed by the Kalman filter routine. The evolution of the Cd
and As concentration values over the generations is shown in Fig.2. The concentrations
reached the expected values and stabilized after 40 generations. This behaviour could be
used as one of the criteria for terminating the optimization sequence. Because the
process is random, deviations can still occur later, but they settle quickly, as in the
case of Cd around the 80th generation. The impurity estimates with and without peak
distance optimization are listed in Table.1. These results are very close to those
reported in reference [2].
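A stabilization-based termination test of the kind suggested above could look like the following Python sketch. The window length and tolerance are hypothetical tuning values, and the function name `has_stabilized` and the example sequence are illustrative only.

```python
def has_stabilized(values, window=10, rel_tol=1e-3):
    """True if the last `window` values span at most rel_tol of their mean."""
    if len(values) < window:
        return False
    recent = values[-window:]
    spread = max(recent) - min(recent)
    scale = abs(sum(recent) / window)
    return spread <= rel_tol * scale

# Hypothetical concentration history: early movement, then a flat tail.
concentration = [0.80, 0.70, 0.65] + [0.632] * 10
print(has_stabilized(concentration))
```

Since later deviations can occur, such a test would typically be combined with a cap on the number of generations rather than used alone.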
Fig.1. Fitted vs. observed Cd spectrum (intensity vs. steps; observed and fitted curves)

Fig.2. Evolution of concentrations (concentration of Cd and As/100 vs. generations)

Table.1. Concentrations of Cd and As with and without optimization


Element   Expected value   Before optimization   After optimization
          (mg/ml)          (mg/ml)               (mg/ml)
Cd        0.629            0.785                 0.632
As        51.8             55.0                  51.4
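The improvement can be expressed as a relative deviation from the expected value. The short Python sketch below uses only the values in Table.1; the helper name `percent_deviation` is hypothetical.

```python
expected = {"Cd": 0.629, "As": 51.8}
before = {"Cd": 0.785, "As": 55.0}
after = {"Cd": 0.632, "As": 51.4}

def percent_deviation(measured, reference):
    """Signed deviation of measured from reference, in percent."""
    return 100.0 * (measured - reference) / reference

deviations = {
    element: (
        percent_deviation(before[element], expected[element]),
        percent_deviation(after[element], expected[element]),
    )
    for element in expected
}
for element, (dev_before, dev_after) in deviations.items():
    print(f"{element}: before {dev_before:+.1f}%, after {dev_after:+.1f}%")
```

Without optimization the estimates deviate by roughly 25% for Cd and 6% for As; after optimization both deviations are below 1% in magnitude.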

Although the GA approach takes longer to compute, its successful application here opens
up the possibility of using GAs in more complex situations where a larger number of
lines interfere with the analyte line.

REFERENCES
1. K.K. Shukla, Neuro-Computers: Optimization Based Learning, Narosa Publishing
House, New Delhi, 2001.
2. E.H. van Veen, S. Bosch and M.T.C. de Loos-Vollebregt, “The Kalman filter approach
to inductively coupled plasma atomic emission spectrometry”, Spectrochimica Acta,
Vol. 49B, p. 829, 1994.
3. Z. Michalewicz, Genetic Algorithms + Data Structures = Evolution Programs, 2nd Ed.,
Springer-Verlag, Berlin, 1994.