
Advanced Review

Geometry optimization

H. Bernhard Schlegel∗

∗Correspondence to: hbs@chem.wayne.edu; Department of Chemistry, Wayne State University, Detroit, MI, USA

Geometry optimization is an important part of most quantum chemical calculations. This article surveys methods for optimizing equilibrium geometries, locating transition structures, and following reaction paths. The emphasis is on optimizations using quasi-Newton methods that rely on energy gradients, and the discussion includes Hessian updating, line searches, trust radius, and rational function optimization techniques. Single-ended and double-ended methods are discussed for transition state searches. Single-ended techniques include quasi-Newton, reduced gradient following and eigenvector following methods. Double-ended methods include nudged elastic band, string, and growing string methods. The discussions conclude with methods for validating transition states and following steepest descent reaction paths. © 2011 John Wiley & Sons, Ltd. WIREs Comput Mol Sci 2011, 1, 790–809. DOI: 10.1002/wcms.34

INTRODUCTION

Geometry optimization is a key component of most computational chemistry studies that are concerned with the structure and/or reactivity of molecules. This chapter describes some of the methods that are used to optimize equilibrium geometries, locate transition structures (TSs), and follow reaction paths. Several surveys and reviews of geometry optimization are available.1–14 Rather than an extensive review of the literature, this article is directed toward practical geometry optimization methods that are currently in use. In particular, it is concerned with geometry optimization methods applicable to electronic structure calculations.15,16 Because electronic structure calculations can be lengthy, geometry optimization methods need to be efficient and robust. For inexpensive molecular mechanics calculations, simpler methods may be adequate. Most major electronic structure packages have a selection of geometry optimization algorithms. By discussing the components of various optimization methods we hope to aid the reader in selecting the most appropriate optimization methods and in overcoming the occasional difficulties that may arise. Unconstrained nonlinear optimization methods have been discussed extensively in numerical analysis texts (e.g., Refs 17–20). These methods are designed to find a local minimum closest to the starting point. Global optimization and conformational searching are more difficult problems and will be reviewed elsewhere in this series. Global mapping of reaction networks21,22 is outside the scope of this chapter. Transition path sampling and reaction paths on free energy surfaces are also not covered.23 Discussions of optimization methods to find conical intersections and minimum energy seam crossings can be found elsewhere.22,24–36 Here, we focus on finding the equilibrium geometry of an individual molecule, locating a TS for a reaction, and following the reaction path connecting reactants through a TS to products. The geometry optimization methods discussed in this chapter use energy derivatives and depend on factors such as the choice of coordinates, methods for calculating the step direction, Hessian updating methods, line search, and step size control strategies. Some of these methods have been compared recently37 for a modest set of test cases for optimizing equilibrium geometries38 and TSs.39

POTENTIAL ENERGY SURFACES

The structure of a molecule can be specified by giving the locations of the atoms in the molecule. For a given structure and electronic state, a molecule has a specific energy. A potential energy surface describes how the energy of the molecule in a particular state varies as a function of the structure of the molecule. A simple representation of a potential energy surface is shown in Figure 1, in which the energy (vertical coordinate) is a function of two geometric variables (the two horizontal coordinates).
The notions of molecular structure and potential energy surfaces are outcomes of the Born–Oppenheimer approximation, which allows us to separate the motion of the electrons from the motion of the nuclei. Because the nuclei are much heavier and move much more slowly than the electrons, the energy of a molecule in the Born–Oppenheimer approximation is obtained by solving the electronic structure problem for a set of fixed nuclear positions. Because this can be repeated for any set of nuclear positions, the energy of a molecule can be described as a parametric function of the position of the nuclei, thereby yielding a potential energy surface.

FIGURE 1 | Model potential energy surface showing minima, transition structures, second-order saddle points, reaction paths, and a valley ridge inflection point. (Reprinted with permission from Ref 5. Copyright 1998 John Wiley & Sons.)

A potential energy surface, like the one shown in Figure 1, can be visualized as a hilly landscape, with valleys, peaks, and mountain passes. Even though most molecules have many more than two geometric variables, most of the important features of a potential energy surface can be represented in such a landscape.

The valleys of a potential energy surface represent reactants, intermediates, and products of a reaction. The position of the minimum in a valley represents the equilibrium structure. The energy difference between the product valley and reactant valley minima represents the energy of the reaction. Vibrational motion of the molecule about the reactant and product equilibrium geometries can be used to compute zero-point energy and thermal corrections needed to calculate enthalpy and free energy differences.40 The lowest energy pathway between the reactant valley and the product valley is the reaction path.41 The highest point on this lowest energy reaction path is the TS for the reaction, and the difference between the energy of the TS and the reactant is the energy barrier for the reaction. A TS is a maximum in one direction (the direction connecting reactant and product along the reaction path) and a minimum in all other directions (directions perpendicular to the reaction path). A TS is also termed a first-order saddle point. In Figure 1, it can be visualized as a mountain pass connecting two valleys. A second-order saddle point (SOSP) is a maximum in two directions and a minimum in all the remaining directions. If a reaction path goes through an SOSP, a lower energy reaction path can always be found by displacing the path away from the SOSP. An n-th order saddle point is a maximum in n directions and a minimum in all the other directions.

For a thermally activated reaction, the energy of the TS and the shape of the potential energy surface around the TS can be used to estimate the reaction rate (see other reviews in this series). The steepest descent reaction path (SDP) from the TS down to the reactants and to the products is termed the minimum energy path (MEP) or the intrinsic reaction coordinate (IRC; the MEP in mass-weighted coordinates). The reaction path from reactants through intermediates (if any) to products describes the reaction mechanism.41 A more detailed description of a reaction can be obtained by classical trajectory calculations42–45 that simulate molecular dynamics by integrating the classical equations of motion for a molecule moving on a potential energy surface. Photochemistry involves motion on multiple potential energy surfaces and transitions between them (see Ref 46).

ENERGY DERIVATIVES

The first and second derivatives of the energy with respect to the geometrical parameters can be used to construct a local quadratic approximation to the potential energy surface:

E(x) = E(x0) + g0^T Δx + 1/2 Δx^T H0 Δx,    (1)

where g0 is the gradient (dE/dx) at x0, H0 is the Hessian (d²E/dx²) at x0, and Δx = x − x0. The gradient and Hessian can be used to confirm the character of minima and TSs. The negative of the gradient is the vector of forces on the atoms in the molecule. Because the forces are zero for minima, TSs, and higher-order saddle points, these structures are also termed stationary points. The Hessian or matrix of second derivatives of the energy is also known as the force constant matrix. The eigenvectors of the mass-weighted Hessian in Cartesian coordinates correspond to the normal modes of vibration (plus five or six modes for translation and rotation).47
For a structure to be characterized as a minimum, the gradient must be zero and all of the eigenvalues of the Hessian corresponding to molecular vibrations must be positive; equivalently, the vibrational frequencies must be real (the vibrational frequencies are proportional to the square root of the eigenvalues of the mass-weighted Hessian). For a TS, the potential energy surface is a maximum in one direction (along the reaction path) and a minimum in all other perpendicular directions. Therefore, a TS is characterized by a zero gradient and a Hessian that has one (and only one) negative eigenvalue; correspondingly, a TS has one and only one imaginary vibrational frequency. An n-th order saddle point (also called a stationary point of index n) has a zero gradient and is a maximum in n orthogonal directions and hence has n imaginary frequencies. For a TS, the vibrational mode corresponding to the imaginary frequency is also known as the transition vector. At the TS, the transition vector is tangent to the reaction path in mass-weighted coordinates.

Most methods for efficient geometry optimization rely on first derivatives of the energy; some also require second derivatives. For most levels of theory used routinely for geometry optimization, the first derivatives can be calculated analytically at a cost comparable to that for the energy. Analytic second derivatives are also available for several levels of theory, but the cost is usually considerably higher than for first derivatives. With the possible exception of optimization of diatomic molecules, derivative-based geometry optimization methods are significantly more efficient than energy-only algorithms. If analytic first derivatives are not available, it is possible to use simplex and pattern search methods,48–50 but these become less efficient as the number of degrees of freedom increases.51 Thus, it may be more efficient to compute gradients numerically and to use a gradient-based optimization algorithm than to use an energy-only algorithm.

COORDINATES

In principle, any complete set of coordinates can be used to represent a molecule and its potential energy surface. However, choosing a good coordinate system can significantly improve the performance of geometry optimizations. Inspection of the Hessian used in the local quadratic approximation to the potential energy surface, in Eq. (1), can reveal some favorable aspects of a good coordinate system. For example, an optimization will be less efficient if there are some very stiff coordinates and some very flexible coordinates. This corresponds to a mixture of very large and very small eigenvalues of the Hessian (i.e., the Hessian is an ill-conditioned matrix). Strong coupling between coordinates can also slow down an optimization. This corresponds to off-diagonal Hessian matrix elements that are comparable in magnitude to the diagonal elements. Strong anharmonicity can seriously degrade the performance of an optimization. If the Hessian changes rapidly when the geometry of the molecule is changed, or if the valley around a minimum is strongly curved, then the quadratic expression in Eq. (1) is a poor approximation to the potential energy surface and the optimization will be slow to converge. The nature of the Hessian and the anharmonicity of the potential energy surface will be directly affected by the choice of the coordinate system.

There are a number of coordinate systems that are typically used for geometry optimization. Cartesian coordinates are perhaps the most universal and the least ambiguous. An advantage is that most energy and derivative calculations are carried out in Cartesian coordinates. However, they are not well suited for geometry optimization because they do not reflect the 'chemical structure' and bonding of a molecule. The x, y, and z coordinates of an atom are strongly coupled to each other and to the coordinates of neighboring atoms.

Internal coordinates such as bond lengths and valence angles are more descriptive of the molecular structure and are more useful for geometry optimization. Bond stretching requires more energy than angle bending or torsion about single bonds. More importantly, the coupling between stretches, bends, and torsions is usually much smaller than between Cartesian coordinates. In addition, internal coordinates are much better than Cartesians for representing curvilinear motions such as valence angle bending and rotation about single bonds. For an acyclic molecule with N atoms, it is easy to select a set of 3N−6 internal coordinates to represent the molecule (3N−5 coordinates for a linear molecule). Z-matrix coordinates are an example of such a coordinate system.52 It is straightforward to convert geometries and derivatives between Cartesian and Z-matrix internal coordinates.53

For acyclic molecules, the set of all bonds, angles, and torsions represents the intrinsic connectivity and flexibility of the molecule. However, for a cyclic molecule, this introduces more than the 3N−6 coordinates required to define the geometry of the molecule. Such a coordinate system has a certain amount of redundancy in the geometric parameters.53–68
Because only 3N−6 of these redundant internal coordinates can be transformed back to Cartesian coordinates in three dimensions, certain combinations of the redundant internals must be constrained during the optimization. The set of all bonds, valence angles, and torsions (if necessary, augmented by out-of-plane bends and linear bends) constitutes a primitive redundant coordinate system.53,58 In some cases, it may be advantageous to form linear combinations of the primitive redundant internals to form natural or delocalized redundant internal coordinates54–57 or symmetry-adapted redundant internal coordinates. For periodic systems such as solids or surfaces, unit cell parameters need to be added66–68 (either explicitly or implicitly via coordinates that cross the boundaries of the unit cell). For molecules in nonisotropic media, additional coordinates are needed to specify the orientation of the molecule. For systems containing more than one fragment, additional coordinates are required to specify the positions of the fragments relative to each other. The union of the redundant internal coordinates for the reactants and products is usually a good coordinate system for TS optimization.58 The transformation of Cartesian coordinates and derivatives to redundant internals is straightforward, but the back transformation of a finite displacement of redundant internals to Cartesians usually is solved iteratively.53–68
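As an illustration of this iterative back transformation, the following minimal Python/NumPy sketch converts a displacement in redundant internal coordinates into Cartesian coordinates. The helper routines internals(x) and bmat(x), which return the redundant internal coordinates and the Wilson B matrix (dq/dx), are assumed here and are not part of the original text.

```python
import numpy as np

def back_transform(x0, dq, internals, bmat, tol=1e-10, max_iter=50):
    """Iteratively convert a redundant-internal displacement dq into Cartesians,
    starting from the Cartesian geometry x0 (hypothetical helper routines)."""
    x = x0.copy()
    q_target = internals(x0) + dq
    for _ in range(max_iter):
        B = bmat(x)                      # Wilson B matrix, dq/dx
        dq_err = q_target - internals(x)
        if np.linalg.norm(dq_err) < tol:
            break
        # the generalized inverse removes the redundant combinations
        x = x + np.linalg.pinv(B) @ dq_err
    return x
```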
For a quadratic surface, the updated Hessian
must fulfill the Newton condition,
NEWTON AND QUASI-NEWTON
METHODS g = Hnew x, (5)

As described in standard texts on optimization,17–20 where g = g(xnew ) − g(xold ) and x = (xnew − xold ).
most nonlinear optimization algorithms are based on However, there are an infinite number of ways to
a local quadratic approximation of the potential en- update the Hessian and fulfill the Newton condition.
ergy surface; Eq. (1). Differentiation with respect to One of the simplest updates is the symmetric rank
the coordinates yields an approximation for the gra- one (SR1) update,17–20 also known as the Murtagh–
dient, given by: Sargent update.69

g(x) = g0 + H0 x. (2) (g − Hold x) (g − Hold x)T


HSR1 = (6)
(g − Hold x)T x
At a stationary point, the gradient is zero, g(x) =
0; thus, in the local quadratic approximation to the This formula can encounter numerical problems
potential energy surface, the displacement to the min- if |g − Hold x| is very small. The Broyden family
imum is given by: of updates70 avoids this problem while ensuring that
the Hessian update is positive definite. The Broyden–
x = −H−1
0 g0 . (3)
Fletcher–Goldfarb–Shanno (BFGS) update70–73 is the
This is known as the Newton or Newton–Raphson most successful and widely used member of this
step. family:
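In code, the Newton step of Eq. (3) is a single linear solve. The short sketch below (Python/NumPy, illustrative only) takes one Newton or quasi-Newton step given the current gradient and an exact or approximate Hessian; grad() is an assumed gradient routine.

```python
import numpy as np

def newton_step(g, H):
    """Displacement of Eq. (3): solve H dx = -g for dx."""
    return np.linalg.solve(H, -g)

# one (quasi-)Newton cycle:
# x_new = x_old + newton_step(grad(x_old), H_approx)
```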
Newton and quasi-Newton methods are the most efficient and widely used procedures for optimizing equilibrium geometries and can also be used effectively to find TSs. For each step in the Newton method, the Hessian in Eq. (3) is calculated at the current point. For quasi-Newton methods, Eq. (3) is used with an approximate Hessian that is updated at each step of the optimization (see below). Because actual potential energy surfaces are rarely quadratic, several Newton or quasi-Newton steps are required to reach a stationary point. For minimization, the Hessian must have all positive eigenvalues (i.e., positive definite). If one or more eigenvalues are negative, the step will be toward a first or higher-order saddle point. Thus, without some means of controlling the step size and direction, simple Newton steps are not robust. Similarly, if the aim is to optimize to a TS, the Hessian must have one and only one negative eigenvalue, and the corresponding eigenvector (i.e., the transition vector) must be roughly parallel to the reaction path. Methods for ensuring that the step is in the desired direction for minimization or TS optimization are discussed in the section dealing with step size control.

At each step, Newton methods require the explicit calculation of the Hessian, which can be rather costly. Quasi-Newton methods start with an inexpensive approximation to the Hessian. The difference between the calculated change in the gradient and the change predicted with the approximate Hessian is used to improve the Hessian at each step in the optimization.17–20

H_new = H_old + ΔH    (4)

For a quadratic surface, the updated Hessian must fulfill the Newton condition,

Δg = H_new Δx,    (5)

where Δg = g(x_new) − g(x_old) and Δx = x_new − x_old. However, there are an infinite number of ways to update the Hessian and fulfill the Newton condition. One of the simplest updates is the symmetric rank one (SR1) update,17–20 also known as the Murtagh–Sargent update.69

ΔH_SR1 = (Δg − H_old Δx)(Δg − H_old Δx)^T / [(Δg − H_old Δx)^T Δx]    (6)

This formula can encounter numerical problems if |Δg − H_old Δx| is very small. The Broyden family of updates70 avoids this problem while ensuring that the Hessian update is positive definite. The Broyden–Fletcher–Goldfarb–Shanno (BFGS) update70–73 is the most successful and widely used member of this family:

ΔH_BFGS = Δg Δg^T / (Δg^T Δx) − H_old Δx Δx^T H_old / (Δx^T H_old Δx).    (7)

For TS optimization, it is important that the Hessian has one and only one negative eigenvalue. This should be checked at every step of the optimization.
If the Hessian does not have the correct number of negative eigenvalues, the eigenvalues need to be shifted or one of the methods for step size control needs to be used (see Step Size Control). The initial estimate of the Hessian for TS optimizations must have one negative eigenvalue and the associated eigenvector should be approximately parallel to the reaction path. The Hessian update should not be forced to be positive definite. The Powell-symmetric-Broyden (PSB) update18 fulfills this role:

ΔH_PSB = [(Δg − H_old Δx)Δx^T + Δx(Δg − H_old Δx)^T] / (Δx^T Δx) − [Δx^T(Δg − H_old Δx)] Δx Δx^T / (Δx^T Δx)²    (8)

Bofill74 found that a combination of the PSB and SR1 updates performs better for TS optimizations:

ΔH_Bofill = φ ΔH_SR1 + (1 − φ) ΔH_PSB, where φ = [(Δg − H_old Δx)^T Δx]² / [|Δg − H_old Δx|² |Δx|²].    (9)

Farkas and Schlegel75 constructed a similar update for minimization:

ΔH_FS = φ^1/2 ΔH_SR1 + (1 − φ^1/2) ΔH_BFGS,    (10)

where φ is the same as in Eq. (9). Bofill76 has recently reviewed Hessian updating methods and proposed some promising new formulas suitable for minima and TSs. For controlling the step size by trust radius (τ) methods or rational function optimization (see Step Size Control), diagonalizing the Hessian can become a computational bottleneck for large systems. Updating the eigenvectors and eigenvalues of the Hessian can be considerably more efficient.77
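The updates of Eqs. (6)-(9) are straightforward to implement. The following sketch (Python/NumPy, written for this review rather than taken from any particular program) returns ΔH for the SR1, BFGS, PSB, and Bofill formulas; the new Hessian is H_old + ΔH.

```python
import numpy as np

def hessian_update(H_old, dx, dg, kind="bfgs"):
    """Return the Hessian update dH of Eqs. (6)-(9); H_new = H_old + dH.
    dx = x_new - x_old, dg = g_new - g_old."""
    r = dg - H_old @ dx                      # residual of the Newton condition, Eq. (5)
    if kind == "sr1":                        # Eq. (6)
        return np.outer(r, r) / (r @ dx)
    if kind == "bfgs":                       # Eq. (7); keeps H positive definite when dg.dx > 0
        return (np.outer(dg, dg) / (dg @ dx)
                - np.outer(H_old @ dx, H_old @ dx) / (dx @ H_old @ dx))
    if kind == "psb":                        # Eq. (8); allows negative eigenvalues
        return ((np.outer(r, dx) + np.outer(dx, r)) / (dx @ dx)
                - (dx @ r) * np.outer(dx, dx) / (dx @ dx) ** 2)
    if kind == "bofill":                     # Eq. (9); weighted SR1/PSB mix for TS searches
        phi = (r @ dx) ** 2 / ((r @ r) * (dx @ dx))
        return (phi * hessian_update(H_old, dx, dg, "sr1")
                + (1.0 - phi) * hessian_update(H_old, dx, dg, "psb"))
    raise ValueError(kind)
```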
Quasi-Newton methods require an initial estimate of the Hessian. A scaled identity matrix may be sufficient in some cases, but a better starting Hessian can be obtained from knowledge of the structure and connectivity of the molecule. Simple empirical estimates of stretching, bending, and torsional force constants are usually satisfactory.78–80 Calculation of an initial Hessian by molecular mechanics or semiempirical electronic structure methods can provide a better estimate. If the molecule has been optimized at a lower level of theory, the updated Hessian from the optimization or the Hessian from a frequency calculation at a lower level of theory are even better initial estimates. For harder cases, the rows and columns of the Hessian can be calculated numerically for a few critical coordinates. For more difficult cases, the full Hessian can be calculated at the beginning of the optimization and recalculated every few steps if necessary. Recalculating the Hessian at each step corresponds to the Newton method.

Standard quasi-Newton methods store and invert the full Hessian. For large optimization problems, this may be a bottleneck. The updating methods can be reformulated to update the inverse of the Hessian. For example, the BFGS formula for the update of the inverse Hessian is:

ΔB_BFGS = Δx Δx^T / (Δx^T Δg) − B_old Δg Δg^T B_old / (Δg^T B_old Δg),    (11)

where B = H^−1, and the updated inverse Hessian obeys Δx = B Δg. Limited memory quasi-Newton methods such as L-BFGS80–89 avoid the storage of the full Hessian or its inverse, which would require O(n²) memory for n variables. Instead, they start with a diagonal inverse Hessian, and store only the Δx and Δg vectors from a limited number of previous steps; thus, the storage is only O(n). The inverse Hessian is written as a diagonal Hessian plus the updates using the stored vectors. For the Newton step, x_new = x_old − B g_old, the product of the updated inverse Hessian and the gradient involves only O(n) work because it can be expressed in terms of dot products between vectors.
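The O(n) product of the updated inverse Hessian with the gradient is usually evaluated with the standard L-BFGS two-loop recursion. A minimal sketch, assuming stored lists s_list (the Δx vectors) and y_list (the Δg vectors) from the previous steps:

```python
import numpy as np

def lbfgs_direction(g, s_list, y_list, h0=1.0):
    """Two-loop recursion: return -B g, where B is the L-BFGS inverse Hessian
    built from the stored s = dx and y = dg vectors and a diagonal seed h0."""
    q = g.copy()
    stack = []
    for s, y in reversed(list(zip(s_list, y_list))):   # newest pair first
        rho = 1.0 / (y @ s)
        a = rho * (s @ q)
        q -= a * y
        stack.append((a, rho, s, y))
    r = h0 * q                                          # apply the diagonal inverse Hessian
    for a, rho, s, y in reversed(stack):                # oldest pair first
        b = rho * (y @ r)
        r += (a - b) * s
    return -r                                           # quasi-Newton step direction
```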
CONJUGATE GRADIENT METHODS

Conjugate gradient methods are suitable for very large systems because they require less storage than limited memory quasi-Newton methods. The concept behind conjugate gradient methods is to choose a new search direction that will lower the energy while remaining at or near the minimum in the previous search direction. If the Hessian has coupling between the coordinates, the optimal search directions are not orthogonal but are conjugate, in the sense that Δx_new^T H Δx_old = 0. Two of the most frequently used conjugate gradient methods are Fletcher–Reeves90 and Polak–Ribiere91:

s_i = −g_i + [g_i^T g_i / (g_i−1^T g_i−1)] s_i−1    (12)

s_i = −g_i + [(g_i − g_i−1)^T g_i / (g_i−1^T g_i−1)] s_i−1,    (13)

where s_i and s_i−1 are the current and previous search directions, respectively, and g_i and g_i−1 are the current and previous gradients, respectively. Note that only three vectors need to be stored. Unlike quasi-Newton methods, conjugate gradient methods require a line search at each step in order to converge.
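A compact sketch of Eqs. (12) and (13); the step length along the returned direction must come from a line search, as noted above.

```python
import numpy as np

def cg_direction(g, g_prev=None, s_prev=None, variant="polak-ribiere"):
    """New conjugate gradient search direction, Eqs. (12) and (13)."""
    if g_prev is None or s_prev is None:
        return -g                              # first step: steepest descent
    if variant == "fletcher-reeves":           # Eq. (12)
        beta = (g @ g) / (g_prev @ g_prev)
    else:                                      # Polak-Ribiere, Eq. (13)
        beta = ((g - g_prev) @ g) / (g_prev @ g_prev)
    return -g + beta * s_prev
```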
GDIIS
Direct inversion of the iterative subspace (DIIS) constructs a solution to a set of linear equations such as Eq. (3) by expanding in a set of error vectors or residuals from prior iterations. Geometry optimization by DIIS (GDIIS) applies this approach to the optimization of equilibrium geometries and transition structures.92–95 In the numerical analysis literature, DIIS is known as generalized minimum of the residual (GMRES).96 It has its origins in methods like the Krylov,97 Lanczos,98 and Arnoldi99 algorithms. In GDIIS, the goal is to construct a new geometry as a linear combination of previous geometries so as to minimize the size of the Newton step [i.e., the residual in the iterative solution of Eq. (3)].

x* = Σ_i c_i x_i,  g* = Σ_i c_i g(x_i),  Σ_i c_i = 1    (14)

Minimize |x_new − x*|² with respect to c_i, where x_new = x* − H^−1 g*    (15)

The GDIIS coefficients are determined by minimizing the residual and can be obtained by solving the following:

[ A    1 ] [ c ]   [ 0 ]
[ 1^T  0 ] [ λ ] = [ 1 ],  where A_ij = (H^−1 g_i)^T (H^−1 g_j).    (16)

The GEDIIS method95 uses A_ij = (g_i − g_j)^T (x_i − x_j). If A becomes singular, the oldest geometry is discarded and the solution is attempted again. The approximate Hessian can be held constant, but better performance is obtained if it is updated. If the minimization wanders into a region with a negative eigenvalue, it may converge to a TS. For a TS optimization, GDIIS sometimes converges not to the desired reaction barrier but to the nearest saddle point. One way to control GDIIS optimizations is to compare the predicted geometry with the one from a Newton step.94 If the angle is too large, it is safer to take the Newton step (provided the Hessian has the correct number of negative eigenvalues). In the final stages of a minimization, GDIIS sometimes converges faster than quasi-Newton methods; a hybrid of quasi-Newton and DIIS methods can be more efficient.95
xj ). If A becomes singular, the oldest geometry is dis-
can be updated during the course of an optimization
carded and the solution is attempted again. The ap-
based on how well the potential energy surface can
proximate Hessian can be held constant, but better
be fit by a quadratic expression. A typical updating
performance is obtained if it is updated. If the mini-
recipe is as follows18 :
mization wanders into a region with a negative eigen-
value, it may converge to a TS. For a TS optimization, ρ = E/(gT x + 1/2 xT H0 x).
GDIIS sometimes converges not to the desired reac-
If ρ > 0.75 and 5/4 |x| > τ old ,
tion barrier but to the nearest saddle point. One way
to control GDIIS optimizations is to compare the pre- then τ new = 2 τ old .
dicted geometry with the one from a Newton step.94 If ρ < 0.25, then τ new = 1/4 |x|.
If the angle is too large, it is safer to take the New-
ton step (provided the Hessian has the correct num- Otherwise, τ new = τ old . (17)
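As a hedged illustration of the polynomial fit described above, the sketch below fits a cubic to the energies and directional gradients at the previous (α = 0) and current (α = 1) geometries along the step, and returns the α that minimizes the cubic; a full implementation would also test the Wolfe conditions with α ≈ 0.1 and β ≈ 0.5.

```python
import numpy as np

def cubic_line_search(e0, m0, e1, m1):
    """Fit c(a) = e0 + m0*a + c2*a^2 + c3*a^3 to c(1) = e1, c'(1) = m1 and
    return the minimizing a (m0, m1 are gradients projected on the step)."""
    c3 = m1 + m0 - 2.0 * (e1 - e0)
    c2 = (e1 - e0) - m0 - c3
    crit = np.roots([3.0 * c3, 2.0 * c2, m0])      # zeros of c'(a)
    crit = crit[np.isreal(crit)].real
    minima = [a for a in crit if 2.0 * c2 + 6.0 * c3 * a > 0.0]   # positive curvature
    if not minima:
        return 1.0                                  # fall back to the full step
    return min(minima, key=lambda a: e0 + m0 * a + c2 * a**2 + c3 * a**3)
```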
STEP SIZE CONTROL

The quadratic approximation to the potential energy surface is satisfactory only for a small local region, usually specified by a trust radius, τ. Steps outside this region are risky and optimizations are more robust if the step size does not exceed τ. An initial estimate of τ can be updated during the course of an optimization based on how well the potential energy surface can be fit by a quadratic expression. A typical updating recipe is as follows18:

ρ = ΔE / (g^T Δx + 1/2 Δx^T H0 Δx).
If ρ > 0.75 and 5/4 |Δx| > τ_old, then τ_new = 2 τ_old.
If ρ < 0.25, then τ_new = 1/4 |Δx|.
Otherwise, τ_new = τ_old.    (17)
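The updating recipe of Eq. (17) translates directly into a few lines of code (illustrative sketch):

```python
def update_trust_radius(tau, de_actual, de_predicted, step_norm):
    """Trust-radius update of Eq. (17): compare the actual energy change with
    the quadratic-model prediction g.dx + 0.5 dx.H.dx."""
    rho = de_actual / de_predicted
    if rho > 0.75 and 1.25 * step_norm > tau:
        return 2.0 * tau          # model is good and the step was near the boundary
    if rho < 0.25:
        return 0.25 * step_norm   # model is poor: shrink the trust radius
    return tau
```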
The simplest approach to step size control is to scale the Newton step back if τ is exceeded. A better approach is to minimize the energy under the constraint that the step is not larger than τ.18,20 In the trust radius method (TRM), this is done by using a Lagrangian multiplier, λ, and corresponds to minimizing E(Δx) − 1/2 λ(|Δx|² − τ²). With the usual quadratic approximation for E(Δx), this yields:

g0 + H0 Δx − λΔx = 0, or Δx = −(H0 − λI)^−1 g0,    (18)

where I is the identity matrix.

For minimizations, λ must be chosen so that all the eigenvalues of the shifted Hessian, H − λI, are positive [i.e., λ must be smaller (more negative) than the lowest eigenvalue of H].
Thus, even if H has one or more negative eigenvalues, this approach can be used to take a controlled step downhill toward a minimum. A plot of the square of the step size as a function of λ is shown in Figure 2(a) (the singularities occur when λ is equal to one of the eigenvalues of H).

For TS optimization, the shifted Hessian must have one negative eigenvalue (i.e., λ must be larger than the lowest eigenvalue of H but smaller than the second lowest eigenvalue). As shown in Figure 2(b), if there are two solutions for a given step size, λ is usually chosen to be closer to the lowest eigenvalue; if there are no solutions for a given step size, λ is chosen as the minimum between the lowest and second lowest eigenvalue and the resulting step is scaled back to τ.101–104 This procedure then allows a controlled step to be taken toward the TS even when the Hessian does not have the correct number of negative eigenvalues.

FIGURE 2 | Plot of the displacement squared, |Δx|² = |(H − λI)^−1 g|², as a function of the Hessian shift parameter λ. (Reprinted with permission from Ref 102. Copyright 1983 ACS Publications.) The singularities occur at the eigenvalues of the Hessian, h_i. (a) For displacement to a minimum with a trust radius of τ1, the shift parameter λ1 must be less than the lowest eigenvalue of the Hessian, h1. (b) For a displacement to a transition structure, the shift parameter must be between h1 and 1/2 h2. For τ1, the lower of the two solutions is chosen. For a smaller trust radius τ2, there is no solution; λ2 is chosen as the minimum in the curve [or λ2 = 1/2(h1 + 1/2 h2) if λ2 > 1/2 h2] and the displacement is scaled back to the trust radius (see Refs 101–104 for more details).
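The sketch below illustrates the TRM idea of Eq. (18) for a minimization step: diagonalize the Hessian and bisect on the shift parameter λ (below the lowest eigenvalue) until |Δx| matches the trust radius. It is a simplified illustration; the hard case in which the gradient has no component along the lowest eigenvector, and the TS variant of Figure 2(b), are not handled.

```python
import numpy as np

def trm_step(g, H, tau):
    """Minimization step of Eq. (18): choose lambda below the lowest Hessian
    eigenvalue so that |dx| = |(H - lambda I)^-1 g| equals the trust radius."""
    evals, vecs = np.linalg.eigh(H)
    gq = vecs.T @ g                              # gradient in the eigenvector basis
    step_len = lambda lam: np.sqrt(np.sum((gq / (evals - lam)) ** 2))
    newton = np.linalg.solve(H, -g)
    if evals[0] > 0.0 and np.linalg.norm(newton) <= tau:
        return newton                            # plain Newton step is already acceptable
    lo = evals[0] - 1e6                          # very negative shift -> tiny step
    hi = evals[0] - 1e-6                         # just below the singularity -> huge step
    for _ in range(100):                         # bisection: step length grows with lambda
        lam = 0.5 * (lo + hi)
        if step_len(lam) > tau:
            hi = lam
        else:
            lo = lam
    lam = 0.5 * (lo + hi)
    return vecs @ (-gq / (evals - lam))
```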
An alternative approach for TS optimization is to use the eigenvectors of the approximate Hessian to divide the coordinates into two groups; the optimization then searches for a maximum along one eigenvector and for a minimum in the remaining space. The step size in this approach can be controlled by using two Lagrangian multipliers, one for the step uphill along one eigenvector and the other for the step downhill in the remainder of the space. Because this approach allows one to follow an eigenvector uphill even if it does not have the lowest eigenvalue, it is also known as eigenvector following.74,101–109 Alternatively, a separate λ can be chosen for each eigenvector such that the step varies from steepest descent for large gradients to a Newton step with a shifted Hessian for moderate gradients to a simple Newton step for small gradients.2 For all of these approaches to transition structure optimization, the Hessian must have a suitable eigenvector that resembles the desired reaction path. Instead of focusing on a single eigenvector, the reduced potential surface approach selects a small subset of the coordinates for the transition structure search and minimizes the energy in the remaining space.110,111

The rational function optimization (RFO) method is a closely related approach for controlling the step size for both minimization and transition structure searching that replaces the quadratic approximation by a rational function approximation.105

E(Δx) = (g^T Δx + 1/2 Δx^T H Δx) / (1 + Δx^T S Δx)
      = ( 1/2 [Δx^T 1] [[H, g], [g^T, 0]] [Δx; 1] ) / ( [Δx^T 1] [[S, 0], [0, 1]] [Δx; 1] )    (19)

Minimizing E with respect to Δx by setting dE/dΔx = 0 leads to an eigenvalue equation involving the Hessian augmented by the gradient.

[[H, g], [g^T, 0]] [Δx; 1] = 2E [[S, 0], [0, 1]] [Δx; 1]    (20)

Typically, S is chosen as a constant times the identity matrix [in this case the top row of Eq. (20) reduces to Δx = −(H0 − λI)^−1 g0, as in TRM; Eq. (18)]. The lowest eigenvalue and eigenvector of the augmented Hessian are used for minimizations, whereas the second lowest eigenvalue and eigenvector are chosen for transition structure optimizations. The Δx obtained by the RFO approach is reduced if it is larger than τ.105,106,108,109 A partitioned RFO method may be better for saddle point searches if the Hessian does not have the correct number of negative eigenvalues.105 In this case, one eigenvector is chosen to be followed uphill to a maximum and the remaining eigenvectors are chosen for minimization; the RFO method is used for optimization in both subspaces. It is possible to reach other transition structures by following an eigenvector other than the one with the lowest eigenvalue.105–109
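A minimal RFO sketch based on Eq. (20) with S = I: build the augmented Hessian, diagonalize it, and scale the chosen eigenvector so that its last component is one. Following the lowest eigenvector gives a minimization step; following the second lowest gives a step toward a transition structure (the step-size control described above is omitted here).

```python
import numpy as np

def rfo_step(g, H, mode=0):
    """RFO displacement from the augmented Hessian of Eq. (20), S = I.
    mode=0 for minimization, mode=1 for a transition structure search."""
    n = len(g)
    aug = np.zeros((n + 1, n + 1))
    aug[:n, :n] = H
    aug[:n, n] = aug[n, :n] = g                  # gradient borders the Hessian
    evals, vecs = np.linalg.eigh(aug)
    v = vecs[:, mode]
    return v[:n] / v[n]                          # dx, after scaling the last component to 1
```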
CONSTRAINED OPTIMIZATION

Under a variety of circumstances, it may be necessary to apply constraints while optimizing the geometry (e.g., scanning potential energy surfaces, coordinate driving, reaction path following, etc.). For nonredundant coordinate systems and simple constraints, the coordinates being held constant can be easily removed from the space of variables being optimized. For more general constraints and/or redundant internal coordinate systems, constraints can be applied by penalty functions, projection methods, or Lagrangian multipliers.

In the penalty function method, the constraints Ci(x) = 0 are imposed by adding an extra term, 1/2 Σ_i α_i Ci(x)², to the energy in Eq. (1), and the energy is minimized as usual. Because the α_i need to have suitably large values so that the constraints are approximately satisfied at the minimum, the optimization may converge much more slowly than the corresponding unconstrained optimization.

The preferred method for including constraints in an optimization is by using Lagrangian multipliers.

L(x) = E(x) + Σ_i λ_i Ci(x)    (21)

At convergence of the constrained optimization, the derivative of the Lagrangian L(x) with respect to the coordinates x and the Lagrangian multipliers λ_i must be zero. In the LQA,

∂L(x)/∂x = g0 + H0 Δx + Σ_i λ_i ∂Ci(x)/∂x = 0 and ∂L(x)/∂λ_i = Ci(x) = 0.    (22)

Because the Lagrangian multipliers are optimized along with the geometric variables, this method generally converges much faster than the penalty function method and the constraints are satisfied exactly. In the special case in which the constraint is a linear function of the displacements, c^T Δx = c0, the Lagrangian multiplier problem can be solved by using an augmented Hessian,

g0 + H0 Δx + λc = 0 and c^T Δx = c0, or [[H, c], [c^T, 0]] [Δx; λ] = [−g0; c0].    (23)

Another way of applying linear constraints is the projection method, in which a projector P is used to remove the directions in which the displacements are constrained to be zero.

P g0 + P H0 P Δx + α(I − P)Δx = 0,  P = I − Σ_i c_i c_i^T / |c_i|²,    (24)

where the c_i are a set of orthogonal constraint vectors and α > 0. For redundant internal coordinates, the projector needs to remove the coordinate redundancies as well as the constraint directions.58
The success of a given transition structure op-
Because the Lagrangian multipliers are opti- timization method depends on the topology of the
mized along with the geometric variables, this method surface as well as the starting structure(s), initial Hes-
generally converges much faster than the penalty func- sian(s), and coordinates. The two dimensional con-
tion method and the constraints are satisfied exactly. tour plots in Figure 3 illustrate some of the impor-
In the special case in which the constraint is linear tant features that can be found in higher dimensional
function of the displacements, cT x = c0 , the La- surfaces.112,113 If the change from reactant to product
grangian multiplier problem can be solved by using is dominated by a single coordinate, the potential en-
an augmented Hessian, ergy surface will have an approximately linear valley,
as shown in Figure 3(a). Hindered rotation about a
g0 + H0 x + λc = 0 and cT x = c0 single bond is an example of such an I-shaped sur-
     face. All methods should be able to find the transition
H c x −g0
or = (23) structure on this type of surface without difficulty.
cT 0 λ c0
Making one bond and breaking the other results in
Another way of applying linear constraints is an L- or V-shaped surface, as shown in Figure 3(b).
the projection method, in which a projector P is used Because the reaction path is strongly curved, finding
to remove the directions in which the displacements the transition structure can be a bit more difficult.
are constrained to be zero. A T-shaped surface involves a transition structure

Volume 1, September/October 2011 


c 2011 John Wiley & Sons, Ltd. 797
A T-shaped surface involves a transition structure in a hanging valley at the side of a main valley, Figure 3(c). A method that follows the main valley will miss the transition structure, but will reach the transition structure if it starts from the other valley. The H- or X-shaped surface in Figure 3(d) is the most difficult because following the valley floor from either the reactant or the product side will miss the transition structure.

FIGURE 3 | Various classes of reaction channels near the transition structure on reactive potential energy surfaces: (a) I-shaped valley, (b) L- or V-shaped valley, (c) T-shaped valley, and (d) H- or X-shaped valley. (Reprinted with permission from Ref 113. Copyright 2009 ACS Publications.)

Quasi-Newton and single-ended methods are discussed first. These methods differ primarily in the way they try to get close to the quadratic region of the transition structure. Double-ended methods start with the reactants and products, and optimize the reaction path as well as the transition structure.

Quasi-Newton Methods for Transition Structures

Newton and quasi-Newton algorithms are the most efficient single-ended methods for optimizing transition structures if the starting geometry is within the quadratic region of the transition structure. With suitable techniques for controlling the optimization steps such as TRM, RFO and eigenvector following (see Step Size Control), these methods will also converge to a transition structure even if they start outside the quadratic region. Many of the related methods differ primarily in the techniques they use to get close to the quadratic region. There are three main differences in using quasi-Newton methods to optimize transition structures compared with minima:

1. The Hessian update must allow for negative eigenvalues (Newton and Quasi-Newton Methods). BFGS (Eq. (7)) yields a positive definite update and is not appropriate. SR1 and PSB updates are suitable (Eqs. (6) and (8)); the Bofill update74 (Eq. (9)) combines SR1 and PSB, yielding better results.

2. Line searches (Step Size Control) are generally not possible because the step toward the transition structure may be uphill or downhill. If the coordinates can be partitioned a priori into a subspace spanning the coordinates involved in the transition vector and the remaining subspace involving only coordinates to be minimized, then a line search can be used in the latter subspace.

3. Controlling the step size and direction (see Step Size Control) are keys to the success of quasi-Newton methods for transition structures, especially if the initial structure is not in the quadratic region. For the TRM (Eq. (18)), the shifted Hessian, H − λI, is required to have one negative eigenvalue and λ must be chosen so that the step size does not exceed τ. This is illustrated in Figure 2(b) and discussed in the section Step Size Control. For the partitioned rational function optimization and eigenvector following methods,105–109 Eqs. (19) and (20), two different values of λ are used, one to shift the eigenvalue for the maximization direction and the other to shift the eigenvalues for minimization in the remaining directions.

A quasi-Newton optimization of a transition structure requires an initial geometry and an initial estimate of the Hessian. The initial geometry must be somewhere near the quadratic region of the transition structure. This can be challenging at times because our qualitative understanding of transition structure geometries is not as well developed as for equilibrium geometries. There are some general rules-of-thumb that may be useful. For example, the bonds that are being made or broken in a reaction are, in many cases, elongated by ca 50% in the transition structure. Hammond's postulate114 is useful for estimating the position of the transition structure along the reaction path.
It states that the transition structure is more reactant-like for an exothermic reaction and more product-like for an endothermic reaction. Orbital symmetry/phase conservation rules115 may indicate when a reaction should occur via a non-least-motion pathway. The choice of coordinates for the optimization is also very important. Often the union between the (redundant) internal coordinates of the reactants and products provides a good coordinate set for the transition structure. Sometimes extra dummy atoms may need to be added to avoid problems with the coordinate system.

When a transition structure for an analogous reaction is not available as an initial guess, then bracketing the search region for the transition structure optimization can be useful. In the QST2 approach,116 a structure on the reactant side and one on the product side are used to provide bounds on the transition structure geometry and to approximate the direction of the reaction path through the transition structure. The QST3 input adds a third structure as an initial estimate of the transition structure. A series of steps is taken along the path connecting these two or three structures until a maximum is reached. Then, a full dimensional quasi-Newton optimization is carried out to find the transition structure.

Quasi-Newton methods require an initial estimate of the Hessian that has a negative eigenvalue with a corresponding eigenvector that is roughly parallel to the desired reaction path. The empirical rules for estimating Hessians for equilibrium geometries78–80 do not provide suitable estimates of the required negative eigenvalue and eigenvector of the Hessian. Thus, quasi-Newton transition structure optimizations are typically started with an analytical or numerical calculation of the full Hessian (or at least a numerical calculation of the most important components of the Hessian needed to obtain the transition vector). For QSTn methods,116 updating the Hessian during the initial maximization steps along the reaction path is usually sufficient to produce a suitable negative eigenvalue and corresponding eigenvector.

An empirical valence bond (EVB) model117 of the surface can be useful in obtaining a starting geometry and an initial Hessian for a quasi-Newton transition structure optimization. A quadratic surface is constructed around the reactant minimum and another around the product minimum. These two surfaces intersect along a seam. The lowest point along this seam is a good estimate of the transition structure geometry118,119 (provided that there are no additional intermediates or transition structures along the reaction path). An initial estimate of the Hessian can be obtained from the interaction of the reactant surface and the product surface using a 2 × 2 EVB Hamiltonian with a suitable guess for the interaction matrix element.120

If more information about the potential energy surface is needed to find the quadratic region of the transition structure, then one can calculate the energy for a series of points along the linear synchronous transit121 (LST) path between reactants and products to find a maximum along this approximate path (LST scan); internal or distance matrix coordinates are better than Cartesian coordinates for an LST scan. For more complex systems, two or more directions may need to be scanned to find the ridge separating reactants and products. These scans could be carried out with or without optimizing the remaining coordinates (relaxed vs rigid surface scan). The reduced surface method110,111 selects a set of coordinates to span a two- or three-dimensional surface and minimizes the energy with respect to all of the remaining coordinates. This reduced dimensional surface can be interpolated using distance-weighted interpolants and searched exhaustively for the lowest energy transition structures and reaction paths.
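A rigid LST scan of the kind described above can be sketched in a few lines. The callable energy(q) and the reactant and product geometries (preferably in internal or distance-matrix coordinates) are assumed inputs; the highest point of the scan serves as a starting guess for a quasi-Newton TS optimization.

```python
import numpy as np

def lst_scan(q_reactant, q_product, energy, npts=11):
    """Rigid scan along a linear synchronous transit path; returns the
    highest-energy interpolated geometry as a TS starting guess."""
    best_q, best_e = None, -np.inf
    for f in np.linspace(0.0, 1.0, npts):
        q = (1.0 - f) * q_reactant + f * q_product   # linear interpolation
        e = energy(q)
        if e > best_e:
            best_q, best_e = q, e
    return best_q, best_e
```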
Related Single-Ended Methods

In principle, one should be able to start at the reactants (or products) and go uphill to the transition structure. However, all directions from a minimum are uphill and most will end at higher energy transition structures or higher-order saddle points. One approach is to choose a coordinate that will carry the molecule from reactant to product and step along this coordinate while minimizing all other directions. This is known as coordinate driving and involves a series of constrained optimizations (see Constrained Optimization). Coordinate driving has its pitfalls.122–125 For example, if the dominant coordinate changes along the reaction path, this approach may not work (e.g., Figure 3(c) and (d)). Alternatively, one can choose a direction and perform a constrained optimization of the components of the gradient perpendicular to this direction. This is known variously as line-then-plane,126 reduced gradient following,127–129 and Newton trajectories.130–132 Growing string methods (GSM)131–136 are coordinate driving or reduced gradient following/Newton trajectory methods that start from both the reactant and product side. Because GSMs are double-ended methods closely related to chain-of-states methods like the NEB method and SM, they are discussed in the next section.
An alternative to coordinate driving and reduced gradient following is to step uphill along the lowest eigenvector of the Hessian—this amounts to the walking up valleys or eigenvector following method described above.101–109 Stepping toward the transition structure is handled by a quasi-Newton approach with TRM and RFO, Eqs. (18)–(20), to control the step.

The dimer method137–140 is a variant of the eigenvector following approach that uses gradients calculated at two closely spaced points to keep track of the direction to search for a maximum. At each step, the dimer is rotated to minimize the sum of the two energies (this can be the more costly step in the dimer method). The minimum in the dimer energy occurs when the vector between the points is aligned with the Hessian eigenvector with the lowest eigenvalue. The midpoint of the dimer is displaced uphill toward the transition structure in a manner similar to the eigenvector following method.

Another way to walk uphill along the shallowest ascent path is to follow a gradient extremal path.141–143 Although steepest descent reaction paths can only be followed downhill, gradient extremal paths are defined locally (the gradient is an eigenvector of the Hessian) and can be followed uphill as well as downhill. A number of algorithms have been developed for following gradient extremal paths.144–146 Gradient extremals do pass through minima, transition structures, and higher-order saddle points; however, they have a tendency to wander about the surface rather than follow the more direct route that steepest descent paths take.146,147 This makes gradient extremal path following less attractive for transition structure optimization.

The image function or associated surface approach148–150 converts a transition structure search into a minimization on a modified surface that has a minimum that coincides with the transition structure of the original function. In a representation in which the Hessian is diagonal, this is accomplished by choosing an eigenvector to follow and inverting the sign of the corresponding component of the gradient and Hessian eigenvalue. A TRM approach, Eq. (18), is then used to search for the minimum on the image function or associated surface. The transition structure can also be optimized as a minimum on the gradient norm surface, |g(x)| (Refs 151–153); however, the gradient norm can have additional minima at nonstationary points in which the gradient is not zero. Because of the many minima and smaller radius of convergence, minimizing the gradient norm is less practical for transition structure optimizations.

Bounds on the transition structure can be found by approaching it from both the reactant and product side, providing progressively tighter limits.142,154,155 This is closely related to the GSM131–136 (see below).

FIGURE 4 | An example of a double-ended reaction path optimization on the Muller–Brown surface.154 The path optimization with 11 points starts with the linear synchronous transit path (black); the first two iterations are shown in blue and green, respectively. The path after 12 steps is in red and can be compared with the steepest descent path in light blue.

Given two points on opposite sides of the ridge separating reactants from products, the ridge method156,157 can also be used to optimize the transition structure. A point on the ridge is obtained by finding a maximum along a line connecting a point in the reactant valley and a point in the product valley. Two points on opposite sides of the ridge point are generated by taking small steps along this line. Then, a downhill step is taken from these two points to generate two new points; the maximum along the line between these new points is a lower energy ridge point. The process is repeated until a minimum is found along the ridge; this minimum is a transition structure.

Chain-of-States and Double-Ended Methods

Instead of optimizing a single point toward the transition structure for a reaction, an alternative approach is to optimize the entire reaction path connecting reactants to products. Typically this is done by representing the path by a series of points, that is, a chain-of-states. An example of a path optimization on the Muller–Brown surface154 is shown in Figure 4. The various methods of this type differ primarily by how the points are generated, what function of the points is minimized, and what constraints are imposed to control the optimization.
Many of the path optimization methods are based on minimizing the integral of the energy along the path, normalized by the path length158:

E_path = (1/L) ∫ E(x(s)) ds,    (25)

where L is the total length of the path. Although this does not yield a steepest descent path,159 it provides a good approximation to it. The elastic band or chain-of-states method replaces the integral by a discrete set of points, and the path is found by a constrained minimization of the sum of the energies of the points on the path from reactants to products,

V_path = Σ_i E(x_i),    (26)

where x_i are the points on the path, which are required to remain equally spaced. Additional potentials are sometimes introduced to prevent the path from kinking or coiling up in minima.160 The number of points needed depends on the nature of the path (e.g., number of intermediates and transition states, curvature of the path, etc.) and can range from less than 10 to more than 50. Typically, several thousands of steps are required for convergence, primarily because the motions of adjacent points are strongly coupled. The nudged elastic band (NEB) method161–170 and string method (SM)171–176 are two current approaches that offer some improvement in the convergence of the path optimization.

In the NEB method,161–163 the points are kept equally spaced by adding a spring potential between the points (cf. penalty function method, Constrained Optimization):

V_spring = 1/2 Σ_i k_spring (x_i − x_i−1)².    (27)

The gradient for a point has contributions from the potential energy surface and from the spring potential, which can be projected into components parallel and perpendicular to the path:

g_i = dV_path/dx_i,  g_i∥ = (g_i^T τ_i) τ_i,  g_i⊥ = g_i − g_i∥,    (28)

g̃_i = dV_spring/dx_i,  g̃_i∥ = (g̃_i^T τ_i) τ_i,  g̃_i⊥ = g̃_i − g̃_i∥,    (29)

where τ_i is the normalized tangent to the path at x_i. The tangent can be calculated by central difference (of x_i−1 and x_i+1) or by forward difference using the point uphill from x_i. The NEB method161–163 uses the gradient of the spring potential to displace (nudge) the points along the reaction path, and uses the gradient of the unmodified potential energy surface for directions perpendicular to the path.

g_i^NEB = g_i⊥ + g̃_i∥    (30)

The doubly nudged elastic band method164 includes a portion of the spring gradient perpendicular to the path (e.g., a second nudge).

g_i^DNEB = g_i⊥ + g̃_i∥ + [g̃_i⊥ − (g̃_i⊥T g_i⊥) g_i⊥ / |g_i⊥|²]    (31)

This accelerates the optimization in the early stages, but may inhibit full convergence in the later stages.168
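A literal transcription of Eqs. (27)-(30) for a single interior image, using a central-difference tangent (an illustrative sketch; production NEB codes typically use improved tangent and spring-force formulations):

```python
import numpy as np

def neb_gradient(x_prev, x_i, x_next, g_i, k_spring):
    """NEB gradient for one image, Eqs. (28)-(30): perpendicular part of the
    PES gradient plus the parallel part of the spring gradient."""
    tau = x_next - x_prev                                     # central-difference tangent
    tau = tau / np.linalg.norm(tau)
    g_perp = g_i - (g_i @ tau) * tau                          # Eq. (28)
    g_spring = k_spring * ((x_i - x_prev) - (x_next - x_i))   # dV_spring/dx_i from Eq. (27)
    g_spring_par = (g_spring @ tau) * tau                     # Eq. (29)
    return g_perp + g_spring_par                              # Eq. (30)
```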
The original NEB method used a quenched or damped Velocity Verlet method to propagate the points toward the path.161–163 After each step, most of the velocity is removed. Alternatively, a quasi-Newton method can be used for the constrained optimization, preferably by moving all of the points simultaneously.164,165,167,168 Because the dimension of the optimization problem (number of coordinates times the number of points) can be rather large, limited memory quasi-Newton methods such as L-BFGS and ABNR have been used.164,165,167,168 Currently, the NEB/L-BFGS method is the most efficient and stable nudged elastic band method.168

The spring potential used to maintain uniform spacing in the NEB method introduces additional coupling between the points on the path that slows down the convergence of the optimization. An early path optimization method avoided the spring potential by moving the points so that the gradient would lie on the steepest descent path.173 The recently developed string method171,172 also minimizes the perpendicular gradient, g_i⊥ in Eq. (28), for the points on the path. A cubic spline is used to redistribute the points to maintain equal spacing and the fourth-order Runge–Kutta method is used to evolve the path.171,172 In the quadratic string method, the points move on local quadratic approximations to the surface using updated Hessians.174,175 Alternatively, the L-BFGS method can be used to propagate all of the points at the same time.168 The efficiency of the best string methods appears to be similar to the best NEB methods, requiring from hundreds to thousands of gradient calculations to optimize the path.168,176

The GSM133–136 reduces the total effort by generating the points one at a time starting from the reactants and products. This is similar to the line-then-plane method126 and reduced gradient following/Newton trajectories131,132 from both ends of the path. The GSM is also closely related to earlier methods that stepped from the reactant and product toward the transition structure.142,154,155 As the string grows toward the center, the endpoints of the string provide better brackets for the transition structure.
Advanced Review wires.wiley.com/wcms

If the ends of the growing string are too far apart, a NEB method or SM can be used to optimize the entire path. To reduce the cost, a low level of theory can be used to grow the string, and a higher level of theory can be used to refine the string or to optimize the transition structure.134,135

Chain-of-states methods such as NEB, SM, and GSM are still more costly than quasi-Newton methods for transition structures, and there is considerable room for improving the efficiency of these methods. However, these methods do map out the entire reaction path. This can be advantageous if the path contains several transition states and intermediates. Furthermore, chain-of-states methods are more readily parallelized than quasi-Newton and other single-ended methods. The development of these methods is ongoing.

Characterization of an Optimized Transition Structure

Once a transition structure has been optimized, it is necessary to confirm that it is indeed a transition structure and that it is appropriate for the reaction under consideration. A vibrational frequency calculation needs to be carried out for the transition structure to confirm that it has one and only one imaginary frequency (equivalently, the full dimensional Hessian must have one and only one negative eigenvalue). It is not sufficient to examine the updated Hessian used in the quasi-Newton optimization process, because it may be subject to numerical noise and the optimization may not have explored the full coordinate space (e.g., modes that lower the symmetry of the molecule). Firstly, an accurate Hessian must be calculated analytically or numerically and used for the vibrational frequency analysis. Secondly, the transition vector, that is, the vibrational normal mode associated with the imaginary frequency, must be inspected to make sure that the motion corresponds to the desired reaction. This is most easily done with graphical software that can animate vibrational modes. For complex reaction networks, reaction path following may be needed to verify that the transition structure connects the desired reactants and products. Chain-of-states methods yield good approximations to the reaction path as a part of the optimization. For quasi-Newton and other methods that converge on a single structure, following the steepest descent path (see Reaction Path Following) will indicate the reactants and products connected by the transition structure. Reaction path following may also reveal other stationary points on the path.
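As a concrete illustration of the eigenvalue test (a minimal sketch rather than a full vibrational analysis; it assumes an accurate Cartesian Hessian and the atomic masses are already available as NumPy arrays, and it does not project out translations and rotations), the number of negative curvatures of the mass-weighted Hessian can be counted as follows; a transition structure should give exactly one.

import numpy as np

def count_imaginary_modes(hessian, masses, tol=1e-6):
    # Illustrative sketch: count negative eigenvalues of the mass-weighted Hessian.
    # hessian : (3N, 3N) accurate Cartesian second-derivative matrix (assumption)
    # masses  : length-N array of atomic masses (assumption)
    # A production frequency analysis would also project out translation/rotation.
    m3 = np.repeat(masses, 3)
    mw = hessian / np.sqrt(np.outer(m3, m3))
    eigenvalues = np.linalg.eigvalsh(mw)
    return int(np.sum(eigenvalues < -tol))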
REACTION PATH FOLLOWING

For quasi-Newton and related methods that locate only the transition structure, the steepest descent reaction path can be obtained by following the gradient downhill. Chain-of-states approaches such as the NEB method, SM, and GSM (see Chain-of-States and Double-Ended Methods) provide an approximation to the reaction path as a part of the optimization. Methods such as variational transition state theory177 and the reaction path Hamiltonian178 may require a more accurate path. In such cases, rather than trying to obtain very tight convergence with a chain-of-states method, it is much more efficient to calculate an accurate path by a reaction path following method.

The steepest descent reaction path (SDP) or minimum energy path (MEP), x(s), is defined by:

dx(s)/ds = - g(x(s)) / |g(x(s))| ,    (32)

where s is the arc length along the path. When mass-weighted coordinates are used, the path is the IRC of Fukui179 and corresponds to a classical particle moving with infinitesimal kinetic energy. A number of reviews are available for reaction path following methods.8,180 In principle, the reaction path can be obtained by standard methods for numerical integration of ordinary differential equations.181 However, in some regions of the potential energy surface, these methods produce paths that zigzag across the valley unless very small step sizes are used. This is characteristic of stiff differential equations, for which special techniques such as implicit methods are needed.182

The Ishida, Morokuma, Komornicki (IMK) method183 uses an Euler step followed by a line search to step back toward the path. The LQA method of McIver and Page,184,185 and subsequent improvements by others,186–188 obtain a step along the path by integrating Eq. (32) analytically on a quadratic energy surface using a calculated or updated Hessian. Because it makes use of quadratic information, LQA can take larger steps than IMK. The second-order method of Gonzalez and Schlegel189,190 (GS2) is an implicit trapezoid method that combines an explicit Euler step with an implicit Euler step. The latter is obtained by a constrained optimization. The larger step size and greater stability of the GS2 method more than compensate for the few extra gradient calculations needed in the optimization. The implicit trapezoid method has been generalized and extended by Burger and Yang191,192 to a family of implicit–explicit methods for accurate and efficient integration of reaction paths.
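For orientation, the sketch below integrates Eq. (32) with naive explicit Euler steps (illustrative only; the gradient routine grad and the step size ds are assumptions, and this is not the IMK, LQA, or GS2 algorithm). It makes clear why small steps are needed and why the more stable integrators described above are preferred in practice.

import numpy as np

def steepest_descent_path(x_start, grad, ds=0.02, n_steps=200):
    # Illustrative explicit Euler integration of Eq. (32).
    # x_start : geometry displaced slightly from the transition structure
    #           along the transition vector (mass-weighted coordinates assumed).
    # grad    : user-supplied routine returning the mass-weighted gradient (assumption).
    path = [np.array(x_start, dtype=float)]
    for _ in range(n_steps):
        g = grad(path[-1])
        gnorm = np.linalg.norm(g)
        if gnorm < 1e-6:                  # essentially at a minimum
            break
        path.append(path[-1] - ds * g / gnorm)
    return path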
The reaction path can be thought of as an average over a collection of classical trajectories or as a trajectory in a very viscous medium. The dynamic reaction path method193 and the damped velocity Verlet approach194 obtain a reaction path by calculating a classical trajectory in which almost all of the kinetic energy has been removed at every step. The Hessian-based predictor–corrector method for reaction paths195,196 adapts an algorithm used for classical trajectory calculations. An LQA predictor step is taken on a local quadratic surface; a new Hessian is obtained at the end of the predictor step (by updating or by direct calculation); a corrector step is then obtained on a distance-weighted interpolant using the local quadratics at the beginning and end of the predictor step.
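A minimal sketch of the damping idea is given below (illustrative only, not the published algorithms cited above; unit masses and a placeholder gradient routine grad are assumed). Removing most of the velocity after each velocity Verlet step lets the trajectory settle onto the reaction path.

import numpy as np

def damped_verlet_path(x0, grad, dt=0.1, damping=0.95, n_steps=500):
    # Illustrative damped-trajectory path follower; unit masses assumed.
    x = np.array(x0, dtype=float)
    v = np.zeros_like(x)
    a = -grad(x)
    path = [x.copy()]
    for _ in range(n_steps):
        x = x + v * dt + 0.5 * a * dt**2
        a_new = -grad(x)
        v = (v + 0.5 * (a + a_new) * dt) * (1.0 - damping)  # remove most kinetic energy
        a = a_new
        path.append(x.copy())
    return path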
A reaction path calculated at a lower level of theory can be used to provide an estimate of the transition structure at a higher level of theory when full optimization at the higher level is not practical. The potential energy surface is more likely to change along the path than perpendicular to the path, because reaction energies and bond making/breaking coordinates are typically more sensitive to the accuracy of the electronic structure calculation. In the IRCMax method, single-point high-level energy calculations along the low-level path are interpolated to give an estimate of the transition structure and energy at the high level of theory.197

Reaction paths can also be obtained variationally, by finding the path that minimizes the following integral159,198–201:

I = \int \sqrt{ g(x(t))^T g(x(t)) } \, \sqrt{ (dx(t)/dt)^T (dx(t)/dt) } \, dt = \int \sqrt{ g(x(s))^T g(x(s)) } \, ds ,    (33)

where x(t) is the reaction path parameterized by an arbitrary variable t, whereas x(s) is the reaction path as a function of the arc length (see Eq. (32)). This is closely related to obtaining a classical trajectory by finding the path that minimizes the classical action, and may hold promise for future developments.
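For a path represented by discrete points, the integral in Eq. (33) reduces to a sum of gradient norms times segment lengths. The short sketch below (an illustration of that discretization, not a published functional minimizer; grad is a placeholder gradient routine) evaluates the objective with midpoint quadrature.

import numpy as np

def path_functional(points, grad):
    # Illustrative discretized form of Eq. (33): sum of |g| times segment length.
    total = 0.0
    for i in range(len(points) - 1):
        mid = 0.5 * (points[i] + points[i + 1])
        seg = np.linalg.norm(points[i + 1] - points[i])
        total += np.linalg.norm(grad(mid)) * seg
    return total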
SUMMARY

Methods for optimizing equilibrium geometries, locating transition structures, and following reaction paths have been outlined in this chapter. Quasi-Newton methods are very efficient for geometry optimization when used with Hessian updating, approximate line searches, and trust radius or rational function techniques for step size control. Single-ended and double-ended methods for transition structure searches have been summarized. Procedures for approaching transition structures are described. Quasi-Newton methods with eigenvector following and rational function optimization are efficient for finding transition structures. Double-ended techniques for transition structures are discussed and include nudged elastic band, string, and growing string methods. The chapter concludes with a discussion of validating transition structures and a description of methods for following steepest descent paths.
REFERENCES
1. Hratchian HP, Schlegel HB. Finding minima, tran- 5. Schlegel HB. Geometry optimization. In: Schleyer
sition states, and following reaction pathways on PvR, Allinger NL, Kollman PA, Clark T, Schaefer
ab initio potential energy surfaces. In: Dykstra CE, HF III, Gasteiger J, Schreiner PR, eds. Encyclopedia
Frenking G, Kim KS, Scuseria GE, eds. Theory of Computational Chemistry. Vol 2. Chichester: John
and Applications of Computational Chemistry: the Wiley & Sons; 1998, 1136–1142.
First Forty Years. New York: Elsevier; 2005, 195– 6. Schlick T. Geometry optimization: 2. In: Schleyer
249. PvR, Allinger NL, Kollman PA, Clark T, Schaefer
2. Wales DJ. Energy Landscapes. Cambridge: Cam- HF III, Gasteiger J, Schreiner PR, eds. Encyclopedia
bridge University Press; 2003. of Computational Chemistry. Vol 2. Chichester: John
3. Schlegel HB. Exploring potential energy surfaces for Wiley & Sons; 1998: 1142–1157.
chemical reactions: an overview of some practical 7. Jensen F. Transition structure optimization tech-
methods. J Comput Chem 2003, 24:1514–1527. niques. In: Schleyer PvR, Allinger NL, Kollman PA,
4. Pulay P, Baker J. Optimization and reaction path al- Clark T, Schaefer HF III, Gasteiger J, Schreiner
gorithms. In: Spencer ND, Moore JH, eds. Encyclo- PR, eds. Encyclopedia of Computational Chemistry.
pedia of Chemical Physics and Physical Chemistry. Vol 5. Chichester: John Wiley & Sons; 1998: 3114–
Bristol: Institute of Physics; 2001, 2061–2085. 3123.
8. Schlegel HB. Reaction path following. In: Schleyer 24. Koga N, Morokuma K. Determination of the lowest
PvR, Allinger NL, Kollman PA, Clark T, Schaefer HF energy point on the crossing seam between two po-
III, Gasteiger J, Schreiner PR, eds. Encyclopedia of tential surfaces using the energy gradient. Chem Phys
Computational Chemistry. Vol 4. Chichester: John Lett 1985, 119:371–374.
Wiley & Sons; 1998: 2432–2437. 25. Farazdel A, Dupuis M. On the determination of the
9. Schlegel HB. Geometry optimization on potential en- minimum on the crossing seam of two potential en-
ergy surfaces. In: Yarkony DR, ed. Modern Electronic ergy surfaces. J Comput Chem 1991, 12:276–282.
Structure Theory. Singapore: World Scientific Pub- 26. Ragazos IN, Robb MA, Bernardi F, Olivucci M.
lishing; 1995, 459–500. Optimization and characterization of the lowest
10. Liotard DA. Algorithmic tools in the study of semiem- energy point on a conical intersection using an
pirical potential surfaces. Int J Quantum Chem 1992, MCSCF Larangian. Chem Phys Lett 1992, 197:217–
44:723–741. 223.
11. Schlegel HB. Some practical suggestions for opti- 27. Manaa MR, Yarkony DR. On the intersection of two
mizing geometries and locating transition states. In: potential energy surfaces of the same symmetry. Sys-
Bertrán J, ed. New Theoretical Concepts for Under- tematic characterization using a Lagrange multiplier
standing Organic Reactions. Vol 267. The Nether- constrained procedure. J Chem Phys 1993, 99:5251–
lands: Kluwer Academic; 1989, 33–53. 5256.
12. Head JD, Weiner B, Zerner MC. A survey of 28. Bearpark MJ, Robb MA, Schlegel HB. A direct
optimization procedures for stable structures and method for the location of the lowest energy point on
transition-states. Int J Quantum Chem 1988, 33:177– a potential surface crossing. Chem Phys Lett 1994,
186. 223:269–274.
13. Schlegel HB. Optimization of equilibrium geome- 29. Anglada JM, Bofill JM. A reduced-restricted-quasi-
tries and transition structures. Adv Chem Phys 1987, Newton-Raphson method for locating and optimizing
67:249–286. energy crossing points between two potential energy
14. Bell S, Crighton JS. Locating transition-states. J Chem surfaces. J Comput Chem 1997, 18:992–1003.
Phys 1984, 80:2464–2475. 30. De Vico L, Olivucci M, Lindh R. New general tools
15. Jensen F. Introduction to Computational Chemistry. for constrained geometry optimizations. J Chem The-
Chichester; NY: John Wiley & Sons; 2007. ory Comput 2005, 1:1029–1037.
16. Cramer CJ. Essentials of Computational Chemistry: 31. Levine BG, Ko C, Quenneville J, Martinez TJ. Con-
Theories and Models. West Sussex, New York: John ical intersections and double excitations in time-
Wiley & Sons; 2004. dependent density functional theory. Mol Phys 2006,
17. Gill PE, Murray W, Wright MH. Practical Optimiza- 104:1039–1051.
tion. London, New York: Academic Press; 1981. 32. Keal TW, Koslowski A, Thiel W. Comparison of
18. Dennis JE, Schnabel RB. Numerical Methods for Un- algorithms for conical intersection optimisation us-
constrained Optimization and Nonlinear Equations. ing semiempirical methods. Theor Chem Acc 2007,
Englewood Cliffs, NJ: Prentice-Hall; 1983. 118:837–844.
19. Scales LE. Introduction to Non-Linear Optimization. 33. Bearpark MJ, Larkin SM, Vreven T. Searching for
New York: Springer-Verlag; 1985. conical intersections of potential energy surfaces with
20. Fletcher R. Practical Methods of Optimization. 2nd the ONIOM method: application to previtamin D.
ed. Chichester: John Wiley & Sons; 1987. J Phys Chem A 2008, 112:7286–7295.
21. Maeda S, Ohno K, Morokuma K. An auto- 34. Sicilia F, Blancafort L, Bearpark MJ, Robb MA. New
mated and systematic transition structure explorer algorithms for optimizing and linking conical inter-
in large flexible molecular systems based on com- section points. J Chem Theory Comp 2008, 4:257–
bined global reaction route mapping and microitera- 266.
tion methods. J Chem Theory Comput 2009, 5:2734– 35. Dick B. Gradient projection method for constraint op-
2743. timization and relaxed energy paths on conical inter-
22. Maeda S, Ohno K, Morokuma K. Automated global section spaces and potential energy surfaces. J Chem
mapping of minimal energy points on seams of cross- Theory Comput 2009, 5:116–125.
ing by the anharmonic downward distortion follow- 36. Madsen S, Jensen F. Locating seam minima for
ing method: a case study of H2CO. J Phys Chem A macromolecular systems. Theor Chem Acc 2009,
2009, 113:1704–1710. 123:477–485.
23. E W, Vanden-Eijnden E. Transition path theory and 37. Bakken V, Helgaker T. The efficient optimization of
path finding algorithms for the study of rare events. molecular geometries using redundant internal coor-
Annu Rev Phys Chem 2010, 61:391–420. dinates. J Chem Phys 2002, 117:9160–9174.
38. Baker J. Techniques for geometry optimization: a 56. Baker J, Kinghorn D, Pulay P. Geometry optimiza-
comparison of cartesian and natural internal coor- tion in delocalized internal coordinates: an efficient
dinates. J Comput Chem 1993, 14:1085–1100. quadratically scaling algorithm for large molecules.
39. Baker J, Chan FR. The location of transition states: a J Chem Phys 1999, 110:4986–4991.
comparison of Cartesian, Z-matrix, and natural inter- 57. von Arnim M, Ahlrichs R. Geometry optimization in
nal coordinates. J Comput Chem 1996, 17:888–904. generalized natural internal coordinates. J Chem Phys
40. McQuarrie DA. Statistical Thermodynamics. Mill 1999, 111:9183–9190.
Valley, CA: University Science Books; 1973. 58. Peng CY, Ayala PY, Schlegel HB, Frisch MJ. Us-
41. Heidrich D. The Reaction Path in Chemistry: Current ing redundant internal coordinates to optimize equi-
Approaches and Perspectives. Dordrecht: Kluwer; librium geometries and transition states. J Comput
1995. Chem 1996, 17:49–56.
42. Haile JM. Molecular Dynamics Simulation: Elemen- 59. Farkas Ö, Schlegel HB. Methods for geometry opti-
tary Methods. New York: John Wiley & Sons; 1992. mization of large molecules. I. An O(N2 ) algorithm
43. Thompson DL. Trajectory simulations of molecu- for solving systems of linear equations for the trans-
lar collisions: classical treatment. In: Schleyer PvR, formation of coordinates and forces. J Chem Phys
Allinger NL, Kollman PA, Clark T, Schaefer HF III, 1998, 109:7100–7104.
Gasteiger J, Schreiner PR, eds. Encyclopedia of Com- 60. Farkas Ö, Schlegel HB. Geometry optimization meth-
putational Chemistry. Vol 5. Chichester: John Wiley ods for modeling large molecules. THEOCHEM
& Sons; 1998, 3056–3073. 2003, 666:31–39.
44. Hase WL, ed. Advances in Classical Trajectory Meth- 61. Baker J, Pulay P. Geometry optimization of atomic
ods. Greenwich: JAI Press; 1992. microclusters using inverse-power distance coordi-
45. Bunker DL. Classical trajectory methods. Methods nates. J Chem Phys 1996, 105:11100–11107.
Comput Phys 1971, 10:287. 62. Eckert F, Pulay P, Werner HJ. Ab initio geometry op-
46. Lasorne B, Worth GA, Robb MA. Excited-state dy- timization for large molecules. J Comput Chem 1997,
namics. WIREs Comput Mol Sci 2011, 1:460–475. 18:1473–1483.
47. Wilson EB, Decius JC, Cross PC. Molecular Vibra- 63. Baker J, Pulay P. Efficient geometry optimization of
tions: the Theory of Infrared and Raman Vibrational molecular clusters. J Comput Chem 2000, 21:69–76.
Spectra. New York: Dover Publications; 1980. 64. Billeter SR, Turner AJ, Thiel W. Linear scaling geome-
48. Kolda TG, Lewis RM, Torczon V. Optimization by try optimisation and transition state search in hybrid
direct search: new perspectives on some classical and delocalised internal coordinates. Phys Chem Chem
modern methods. SIAM Rev 2003, 45:385–482. Phys 2000, 2:2177–2186.
49. Garcia-Palomares UM, Rodriguez JF. New sequen- 65. Paizs B, Baker J, Suhai S, Pulay P. Geometry opti-
tial and parallel derivative-free algorithms for uncon- mization of large biomolecules in redundant internal
strained minimization. SIAM J Optim 2002, 13:79– coordinates. J Chem Phys 2000, 113:6566–6572.
96. 66. Kudin KN, Scuseria GE, Schlegel HB. A redundant
50. Lewis RM, Torczon V, Trosset MW. Direct search internal coordinate algorithm for optimization of pe-
methods: then and now. J Comput Appl Math 2000, riodic systems. J Chem Phys 2001, 114:2919–2923.
124:191–207. 67. Bucko T, Hafner J, Angyan JG. Geometry optimiza-
51. Han LX, Neumann M. Effect of dimensionality on the tion of periodic systems using internal coordinates.
Nelder–Mead simplex method. Optim Math Software J Chem Phys 2005, 122:124508.
2006, 21:1–16. 68. Doll K, Dovesi R, Orlando R. Analytical Hartree-
52. Hehre WJ, Radom L, Schleyer PvR, Pople JA. Ab Fock gradients with respect to the cell parameter:
Initio Molecular Orbital Theory. New York: John systems periodic in one and two dimensions. Theor
Wiley & Sons; 1986. Chem Acc 2006, 115:354–360.
53. Pulay P, Fogarasi G. Geometry optimization in re- 69. Murtagh BA, Sargent RWH. Computational expe-
dundant internal coordinates. J Chem Phys 1992, rience with quadratically convergent minimization
96:2856–2860. methods. Comput J 1972, 13:185–194.
54. Fogarasi G, Zhou XF, Taylor PW, Pulay P. The cal- 70. Broyden CG. The convergence of a class of dou-
culation of ab initio molecular geometries: efficient ble rank minimization algorithms. J Inst Math Appl
optimization by natural internal coordinates and em- 1970, 6:76–90.
pirical correction by offset forces. J Am Chem Soc 71. Fletcher R. A new approach to variable metric algo-
1992, 114:8191–8201. rithms. Comput J 1970, 13:317–322.
55. Baker J, Kessi A, Delley B. The generation and use 72. Goldfarb D. A family of variable metric methods
of delocalized internal coordinates in geometry opti- derived by variational means. Math Comput 1970,
mization. J Chem Phys 1996, 105:192–212. 24:23–26.
73. Shanno DF. Conditioning of quasi-Newton methods 90. Fletcher R, Reeves CM. Function minimization by
for functional minimization. Math Comput 1970, conjugate gradients. Comput J 1964, 7:149–154.
24:647–657. 91. Polak E. Computational Methods in Optimiztion: a
74. Bofill JM. Updated Hessian matrix and the restricted Unified Approach. New York: Academic; 1971.
step method for locating transition structures. J Com- 92. Csaszar P, Pulay P. Geometry optimization by direct
put Chem 1994, 15:1–11. inversion in the iterative subspace. J Mol Struct 1984,
75. Farkas Ö, Schlegel HB. Methods for optimizing large 114:31–34.
molecules. II. Quadratic search. J Chem Phys 1999, 93. Farkas Ö. PhD (Csc) thesis. Budapest: Eötvös Loránd
111:10806–10814. University and Hungarian Academy of Sciences;
76. Bofill JM. Remarks on the updated Hessian ma- 1995.
trix methods. Int J Quantum Chem 2003, 94:324– 94. Farkas Ö, Schlegel HB. Methods for optimizing large
332. molecules. III. An improved algorithm for geome-
77. Liang W, Wang H, Hung J, Li XS, Frisch MJ. try optimization using direct inversion in the itera-
Eigenspace update for molecular geometry optimiza- tive subspace (GDIIS). Phys Chem Chem Phys 2002,
tion in nonredundant internal coordinates. J Chem 4:11–15.
Theor Comp 2010, 6:2034–2039. 95. Li XS, Frisch MJ. Energy-represented direct inversion
78. Schlegel HB. Estimating the Hessian for gradient- in the iterative subspace within a hybrid geometry
type geometry optimizations. Theor Chim Acta 1984, optimization method. J Chem Theory Comput 2006,
66:333–340. 2:835–839.
79. Wittbrodt JM, Schlegel HB. Estimating stretch- 96. Saad Y, Schultz MH. GMRES: a generalized mini-
ing force constants for geometry optimization. mal residual algorithm for solving nonsymmetric lin-
THEOCHEM 1997, 398:55–61. ear systems. SIAM J Sci Stat Comput 1986, 7:856–
80. Fischer TH, Almlof J. General methods for geometry 869.
and wave-function optimization. J Phys Chem 1992, 97. Krylov AN. On the numerical solution of the equa-
96:9768–9774. tion by which in technical questions frequencies of
81. Nocedal J. Updating quasi-Newton matrices with lim- small oscillations of material systems are determined.
ited storage. Math Comput 1980, 35:773–782. Izvestiya Akademii Nauk SSSR, Otdel Mat i Estest
82. Byrd RH, Lu PH, Nocedal J, Zhu CY. A limited mem- Nauk 1931, 7:491–539.
ory algorithm for bound constrained optimization. 98. Lanczos C. Solution of systems of linear euations
SIAM J Sci Comput 1995, 16:1190–1208. by minimized iterations. J Res Nat Bur Stand 1952,
83. Liu DC, Nocedal J. On the limited memory BFGS 49:33–53.
method for large-scale optimization. Math Program 99. Arnoldi WE. The principle of minimized iterations
1989, 45:503–528. in the solution of the matrix eigenvalue problem. Q
84. Biegler LT, Nocedal J, Schmid C. A reduced Hes- Appl Math 1951, 9:17–29.
sian method for large-scale constrained optimization. 100. Schlegel HB. Optimization of equilibrium geome-
SIAM J Optim 1995, 5:314–347. tries and transition structures. J Comput Chem 1982,
85. Morales JL, Nocedal J. Enriched methods for large- 3:214–218.
scale unconstrained optimization. Comput Optim 101. Cerjan CJ, Miller WH. On finding transition states.
Appl 2002, 21:143–154. J Chem Phys 1981, 75:2800–2806.
86. Fletcher R. An Optimal positivedefinite update for 102. Simons J, Jorgensen P, Taylor H, Ozment J. Walk-
sparse Hessian matrices. SIAM J Optim 1995, 5:192– ing on potential energy surfaces. J Phys Chem 1983,
218. 87:2745–2753.
87. Byrd RH, Nocedal J, Schnabel RB. Representations of 103. Nichols J, Taylor H, Schmidt P, Simons J. Walking on
quasi-Newton matrices and their use in limited mem- potential energy surfaces. J Chem Phys 1990, 92:340–
ory methods. Math Program 1994, 63:129–156. 346.
88. Anglada JM, Besalu E, Bofill JM. Remarks on 104. Simons J, Nichols J. Strategies for walking on poten-
large-scale matrix diagonalization using a Lagrange- tial energy surfaces using local quadratic approxima-
Newton-Raphson minimization in a subspace. Theor tions. Int J Quantum Chem 1990, 38(suppl 24):263–
Chem Acc 1999, 103:163–166. 276.
89. Prat-Resina X, Garcia-Viloca M, Monard G, 105. Banerjee A, Adams N, Simons J, Shepard R. Search
Gonzalez-Lafont A, Lluch JM, Bofill JM, Anglada for stationary points on surface. J Phys Chem 1985,
JM. The search for stationary points on a quan- 89:52–57.
tum mechanical/molecular mechanical potential en- 106. Baker J. An algorithm for the location of transition
ergy surface. Theor Chem Acc 2002, 107:147–153. states. J Comput Chem 1986, 7:385–395.
107. Culot P, Dive G, Nguyen VH, Ghuysen JM. A quasi- state structure searching. THEOCHEM 1982, 6:365–
Newton algorithm for first-order saddle-point loca- 378.
tion. Theor Chim Acta 1992, 82:189–205. 124. Scharfenberg P. Theoretical analysis of constrained
108. Besalu E, Bofill JM. On the automatic restricted-step minimum energy paths. Chem Phys Lett 1981,
rational-function-optimization method. Theor Chem 79:115–117.
Acc 1998, 100:265–274. 125. Rothman MJ, Lohr LL. Analysis of an energy mini-
109. Anglada JM, Bofill JM. On the restricted step method mization method for locating transition states on po-
coupled with the augmented Hessian for the search tential energy hypersurfaces. Chem Phys Lett 1980,
of stationary points of any continuous function. Int J 70:405–409.
Quantum Chem 1997, 62:153–165. 126. Cardenas-Lailhacar C, Zerner MC. Searching for
110. Bofill JM, Anglada JM. Finding transition states using transition states: the line-then-plane (LTP) approach.
reduced potential energy surfaces. Theor Chem Acc Int J Quantum Chem 1995, 55:429–439.
2001, 105:463–472. 127. Quapp W, Hirsch M, Imig O, Heidrich D. Search-
111. Burger SK, Ayers PW. Dual grid methods for find- ing for saddle points of potential energy surfaces by
ing the reaction path on reduced potential energy following a reduced gradient. J Comput Chem 1998,
surfaces. J Chem Theory Comput 2010, 6:1490– 19:1087–1100.
1497. 128. Hirsch M, Quapp W. Improved RGF method to find
112. Shaik SS, Schlegel HB, Wolfe S. Theoretical Aspects saddle points. J Comput Chem 2002, 23:887–894.
of Physical Organic Chemistry: the SN2 Mechanism. 129. Crehuet R, Bofill JM, Anglada JM. A new look at the
New York: John Wiley & Sons; 1992. reduced-gradient-following path. Theor Chem Acc
113. Sonnenberg JL, Wong KF, Voth GA, Schlegel HB. 2002, 107:130–139.
Distributed Gaussian valence bond surface derived 130. Quapp W. Newton trajectories in the curvilinear
from ab initio calculations. J Chem Theory Comput metric of internal coordinates. J Math Chem 2004,
2009, 5:949–961. 36:365–379.
114. Hammond GS. A correlation of reaction rates. J Am 131. Quapp W. A growing string method for the reaction
Chem Soc 1955, 77:334–338. pathway defined by a Newton trajectory. J Chem Phys
115. Woodward RB, Hoffmann R. The Conservation 2005, 122:174106.
of Orbital Symmetry. Weinheim/Bergstr.: Verlag 132. Quapp W. The growing string method for flows
Chemie; 1970. of Newton trajectories by a second order method.
116. Peng CY, Schlegel HB. Combining synchronous tran- J Theor Comput Chem 2009, 8:101–117.
sit and quasi-Newton methods to find transition 133. Peters B, Heyden A, Bell AT, Chakraborty A. A grow-
states. Isr J Chem 1993, 33:449–454. ing string method for determining transition states:
117. Warshel A, Weiss RM. An empirical valence bond comparison to the nudged elastic band and string
approach for comparing reactions in solutions and in methods. J Chem Phys 2004, 120:7877–7886.
enzymes. J Am Chem Soc 1980, 102:6218–6226. 134. Goodrow A, Bell AT, Head-Gordon M. Development
118. Jensen F. Transition structure modeling by intersect- and application of a hybrid method involving inter-
ing potential energy surfaces. J Comput Chem 1994, polation and ab initio calculations for the determi-
15:1199–1216. nation of transition states. J Chem Phys 2008, 129:
119. Ruedenberg K, Sun JQ. A simple prediction of ap- 174109.
proximate transition states on potential energy sur- 135. Goodrow A, Bell AT, Head-Gordon M. Transition
faces. J Chem Phys 1994, 101:2168–2174. state-finding strategies for use with the growing string
120. Anglada JM, Besalu E, Bofill JM, Crehuet R. Predic- method. J Chem Phys 2009, 130:244108.
tion of approximate transition states by Bell–Evans– 136. Goodrow A, Bell AT, Head-Gordon M. A strategy for
Polanyi principle: I. J Comput Chem 1999, 20:1112– obtaining a more accurate transition state estimate
1129. using the growing string method. Chem Phys Lett
121. Halgren T, Lipscomb WN. The synchronous-transit 2010, 484:392–398.
method for determining reaction pathways and locat- 137. Henkelman G, Jonsson H. A dimer method for find-
ing molecular transition states. Chem Phys Lett 1977, ing saddle points on high dimensional potential sur-
49:225–232. faces using only first derivatives. J Chem Phys 1999,
122. Burkert U, Allinger NL. Pitfalls in the use of the tor- 111:7010–7022.
sion angle driving method for the calculation of con- 138. Kaestner J, Sherwood P. Superlinearly converging
formational interconversions. J Comput Chem 1982, dimer method for transition state search. J Chem Phys
3:40–46. 2008, 128:014106.
123. Williams IH, Maggiora GM. Use and abuse of 139. Shang C, Liu ZP. Constrained Broyden minimiza-
the distinguished coordinate method for transition tion combined with the dimer method for locating
transition state of complex reactions. J Chem Theory 156. Ionova IV, Carter EA. Ridge method for finding
Comput 2010, 6:1136–1144. saddle-points on potential energy surfaces. J Chem
140. Heyden A, Bell AT, Keil FJ. Efficient methods for Phys 1993, 98:6377–6386.
finding transition states in chemical reactions: com- 157. Ionova IV, Carter EA. Direct inversion in the it-
parison of improved dimer method and partitioned erative subspace-induced acceleration of the ridge
rational function optimization method. J Chem Phys method for finding transition states. J Chem Phys
2005, 123:224101. 1995, 103:5437–5441.
141. Pancir J. Calculation of the least energy path on the 158. Elber R, Karplus M. A method for determining re-
energy hypersurface. Collect Czech Chem Commun action paths in large molecules: application to myo-
1975, 40:1112–1118. globin. Chem Phys Lett 1987, 139:375–380.
142. Basilevsky MV, Shamov AG. The local definition of 159. Quapp W. Chemical reaction paths and calculus of
the optimum ascent path on a multidimensional po- variations. Theor Chem Acc 2008, 121:227–237.
tential energy surface and its practical application 160. Choi C, Elber R. Reaction path study of helix for-
for the location of saddle points. Chem Phys 1981, mation in tetrapeptides: effect of side chains. J Chem
60:347–358. Phys 1991, 94:751–760.
143. Hoffman DK, Nord RS, Ruedenberg K. Gradient ex- 161. Jónsson H, Mills G, Jacobsen W. Nudged elastic band
tremals. Theor Chim Acta 1986, 69:265–279. method for finding minimum energy paths of transi-
144. Jorgensen P, Jensen HJA, Helgaker T. A gradient ex- tions. In: Berne BJ, Cicotti G, Coker DF, eds. Classical
tremal walking algorithm. Theor Chim Acta 1988, and Quantum Dynamics in Condensed Phase Simu-
73:55–65. lations. Singapore: World Scientific; 1998, 385.
145. Schlegel HB. Following gradient extremal paths. 162. Henkelman G, Jonsson H. Improved tangent estimate
Theor Chim Acta 1992, 83:15–20. in the nudged elastic band method for finding mini-
146. Sun JQ, Ruedenberg K. Gradient extremals and steep- mum energy paths and saddle points. J Chem Phys
est descent lines on potential energy surfaces. J Chem 2000, 113:9978–9985.
Phys 1993, 98:9707–9714. 163. Henkelman G, Uberuaga BP, Jonsson H. A climbing
147. Bondensgård K, Jensen F. Gradient extremal bifur- image nudged elastic band method for finding sad-
cation and turning points: an application to the dle points and minimum energy paths. J Chem Phys
H2 CO potential energy surface. J Chem Phys 1996, 2000, 113:9901–9904.
104:8025–8031. 164. Trygubenko SA, Wales DJ. A doubly nudged elastic
148. Smith CM. How to find a saddle point. Int J Quantum band method for finding transition states. J Chem
Chem 1990, 37:773–783. Phys 2004, 120:2082–2094.
149. Helgaker T. Transition state optimizations by trust- 165. Chu JW, Trout BL, Brooks BR. A super-linear mini-
region image minimization. Chem Phys Lett 1991, mization scheme for the nudged elastic band method.
182:503–510. J Chem Phys 2003, 119:12708–12717.
150. Sun JQ, Ruedenberg K. Locating transition states by 166. Maragakis P, Andreev SA, Brumer Y, Reichman DR,
quadratic image gradient descent on potential energy Kaxiras E. Adaptive nudged elastic band approach
surfaces. J Chem Phys 1994, 101:2157–2167. for transition state calculation. J Chem Phys 2002,
151. McIver JW, Komornicki A. Structure of transition 117:4651–4658.
states in organic reactions. General theory and an ap- 167. Galvan IF, Field MJ. Improving the efficiency of the
plication to the cyclobutene-butadiene isomerization NEB reaction path finding algorithm. J Comput Chem
using a semiempirical molecular orbital method. J Am 2008, 29:139–143.
Chem Soc 1972, 94:2625–2633. 168. Sheppard D, Terrell R, Henkelman G. Optimization
152. Poppinger D. Calculation of transition states. Chem methods for finding minimum energy paths. J Chem
Phys Lett 1975, 35:550–554. Phys 2008, 128:134106.
153. Komornicki A, Ishida K, Morokuma K, Ditchfield R, 169. Alfonso DR, Jordan KD. A flexible nudged elastic
Conrad M. Efficient determination and characteriza- band program for optimization of minimum energy
tion of transition states using ab initio methods. Chem pathways using ab initio electronic structure methods.
Phys Lett 1977, 45:595–602. J Comput Chem 2003, 24:990–996.
154. Mueller K, Brown LD. Location of saddle points and 170. Gonzalez-Garcia N, Pu JZ, Gonzalez-Lafont A, Lluch
minimum energy paths by a constrained simplex opti- JM, Truhlar DG. Searching for saddle points by using
mization procedure. Theor Chim Acta 1979, 53:75– the nudged elastic band method: an implementation
93. for gas-phase systems. J Chem Theory Comput 2006,
155. Dewar MJS, Healy EF, Stewart JJP. Location of tran- 2:895–904.
sition states in reaction mechanisms. J Chem Soc, 171. E W, Ren WQ, Vanden-Eijnden E. String method for
Faraday Trans II 1984, 80:227–233. the study of rare events. Phys Rev B 2002, 66:052301.
172. E W, Ren WQ, Vanden-Eijnden E. Simplified and 187. Sun JQ, Ruedenberg K. Quadratic steepest descent
improved string method for computing the minimum on potential energy surfaces. 2. Reaction path follow-
energy paths in barrier-crossing events. J Chem Phys ing without analytical Hessians. J Chem Phys 1993,
2007, 126:164103. 99:5269–5275.
173. Ayala PY, Schlegel HB. A combined method for de- 188. Eckert F, Werner HJ. Reaction path following by
termining reaction paths, minima, and transition state quadratic steepest descent. Theor Chem Acc 1998,
geometries. J Chem Phys 1997, 107:375–384. 100:21–30.
174. Burger SK, Yang WT. Quadratic string method 189. Gonzalez C, Schlegel HB. An improved algorithm for
for determining the minimum-energy path based reaction path following. J Chem Phys 1989, 90:2154–
on multiobjective optimization. J Chem Phys 2006, 2161.
124:054109. 190. Gonzalez C, Schlegel HB. Reaction path following
175. Burger SK, Yang W. Sequential quadratic program- in mass-weighted internal coordinates. J Phys Chem
ming method for determining the minimum energy 1990, 94:5523–5527.
path. J Chem Phys 2007, 127:164107. 191. Burger SK, Yang WT. Automatic integration of the
176. Koslover EF, Wales DJ. Comparison of double-ended reaction path using diagonally implicit Runge-Kutta
transition state search methods. J Chem Phys 2007, methods. J Chem Phys 2006, 125:244108.
127:134102. 192. Burger SK, Yang WT. A combined explicit-implicit
177. Truhlar DG, Garrett BC. Variational transition method for high accuracy reaction path integration.
state theory. Annu Rev Phys Chem 1984, 35:159– J Chem Phys 2006, 124:224102.
189. 193. Stewart JJP, Davis LP, Burggraf LW. Semiempirical
178. Miller WH, Handy NC, Adams JE. Reaction path calculations of molecular trajectories: method and ap-
Hamiltonian for polyatomic molecules. J Chem Phys plications to some simple molecular systems. J Com-
1980, 72:99–112. put Chem 1987, 8:1117–1123.
179. Fukui K. The path of chemical reactions—the IRC 194. Hratchian HP, Schlegel HB. Following reaction path-
approach. Acc Chem Res 1981, 14:363–368. ways using a damped classical trajectory algorithm.
180. McKee ML, Page M. Computing reaction pathways J Phys Chem A 2002, 106:165–169.
on molecular potential energy surfaces. In: Lipkowitz 195. Hratchian HP, Schlegel HB. Accurate reaction paths
KB, Boyd DB, eds. Reviews in Computational Chem- using a Hessian based predictor-corrector integrator.
istry. Vol 4. New York: VCH; 1993, 35–65. J Chem Phys 2004, 120:9918–9924.
181. Garrett BC, Redmon MJ, Steckler R, Truhlar DG, 196. Hratchian HP, Schlegel HB. Using Hessian updating
Baldridge KK, Bartol D, Schidt MW, Gordon MS. to increase the efficiency of a Hessian based predictor-
Algorithms and accuracy requirements for comput- corrector reaction path following method. J Chem
ing reaction paths by the method of steepest descent. Theory Comput 2005, 1:61–69.
J Phys Chem 1988, 92:1476–1488. 197. Malick DK, Petersson GA, Montgomery JA. Tran-
182. Gear CW. Numerical Initial Value Problems in Or- sition states for chemical reactions. I. Geometry
dinary Differential Equations. Englewood Cliffs, NJ: and classical barrier height. J Chem Phys 1998,
Prentice-Hall; 1971. 108:5704–5713.
183. Ishida K, Morokuma K, Komornicki A. The intrinsic 198. Olender R, Elber R. Yet another look at the
reaction coordinate. An ab initio calculation for HNC steepest descent path. THEOCHEM 1997, 398:63–
→ HCN and H− + CH4 → CH4 + H− . J Chem Phys 71.
1977, 66:2153–2156. 199. Vanden-Eijnden E, Heymann M. The geometric min-
184. Page M, Doubleday C, McIver JW. Following steep- imum action method for computing minimum energy
est descent reaction paths. The use of higher energy paths. J Chem Phys 2008, 128:061103.
derivatives with ab initio electronic structure meth- 200. Aguilar-Mogas A, Crehuet R, Gimenez X, Bofill JM.
ods. J Chem Phys 1990, 93:5634–5642. Applications of analytic and geometry concepts of the
185. Page M, McIver JM. On evaluating the reaction path theory of Calculus of Variations to the Intrinsic Reac-
Hamiltonian. J Chem Phys 1988, 88:922–935. tion Coordinate model. Mol Phys 2007, 105:2475–
186. Sun JQ, Ruedenberg K. Quadratic steepest descent 2492.
on potential energy surfaces. 1. Basic formalism and 201. Aguilar-Mogas A, Gimenez X, Bofill JM. Finding re-
quantitative assessment. J Chem Phys 1993, 99:5257– action paths using the potential energy as reaction
5268. coordinate. J Chem Phys 2008, 128:104102.