Vous êtes sur la page 1sur 38

Truncated Newton-Raphson Methods for Quasicontinuum Simulations

by Yu Liang, Ramdev Kanapady, and Peter W. Chung

ARL-TR-3791

May 2006

Approved for public release; distribution is unlimited.

NOTICES Disclaimers The findings in this report are not to be construed as an official Department of the Army position unless so designated by other authorized documents. Citation of manufacturer’s or trade names does not constitute an official endorsement or approval of the use thereof. Destroy this report when it is no longer needed. Do not return it to the originator.

Army Research Laboratory
Aberdeen Proving Ground, MD 21005-5067

ARL-TR-3791

May 2006

Truncated Newton-Raphson Methods for Quasicontinuum Simulations
Yu Liang and Ramdev Kanapady
University of Minnesota

Peter W. Chung
Computational and Information Sciences Directorate, ARL

Approved for public release; distribution is unlimited.

TITLE AND SUBTITLE Final 2003–2005 5a. solver 16. SUPPLEMENTARY NOTES *Department of Mechanical Engineering. ABSTRACT The quasicontinuum method provides an efficient way to simulate the mechanical response of relatively large crystalline materials at zero temperature by combining continuum and atomistic approaches. Unconstrained optimization constitutes the key computational kernel of this method. Arlington. TASK NUMBER 5f. Washington Headquarters Services. REPORT TYPE 3. TELEPHONE NUMBER (Include area code) UNCLASSIFIED UNCLASSIFIED UNCLASSIFIED UL 38 410-278-6027 Standard Form 298 (Rev. NAME OF RESPONSIBLE PERSON Peter W. 0704-0188 Public reporting burden for this collection of information is estimated to average 1 hour per response. Respondents should be aware that notwithstanding any other provision of law.com. 1. including suggestions for reducing the burden. SPONSOR/MONITOR'S ACRONYM(S) 11. PLEASE DO NOT RETURN YOUR FORM TO THE ABOVE ADDRESS. REPORT DATE (DD-MM-YYYY) 2. WORK UNIT NUMBER 7. GRANT NUMBER 5c. nanoindentation. searching existing data sources.S. 13.18 ii .REPORT DOCUMENTATION PAGE Form Approved OMB No. MD 21005-5067 9. preconditioned conjugate gradient. and completing and reviewing the collection information. PERFORMING ORGANIZATION REPORT NUMBER ARL-TR-3791 10. Chung 4UH7AC 5e. SPONSOR/MONITOR'S REPORT NUMBER(S) 12. including the time for reviewing instructions. REPORT b. unconstrained optimization. Results of illustrative examples mainly focus on the number of minimization iterations to converge and CPU time for the two-dimensional nanoindentation and shearing grain boundary problems. Send comments regarding this burden estimate or any other aspect of this collection of information. Suite 1204. DISTRIBUTION/AVAILABILITY STATEMENT Approved for public release. ABSTRACT c. MN 55455 14. 8/98) Prescribed by ANSI Std.To) May 2006 4. gathering and maintaining the data needed. nonlinear conjugate gradient. LIMITATION OF ABSTRACT 18. 15. PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES) U. The efficiency of the techniques for minimization depends on both the time needed to evaluate the energy expression and the number of iterations needed to converge to the minimum. Chung 19b. In this research. University of Minnesota. Army Research Laboratory ATTN: AMSRD-ARL-CI-HC Aberdeen Proving Ground. Newton-Raphson method.* Ramdev Kanapady. PROJECT NUMBER Yu Liang. no person shall be subject to any penalty for failing to comply with a collection of information if it does not display a currently valid OMB control number. PROGRAM ELEMENT NUMBER 6.* and Peter W. Directorate for Information Operations and Reports (0704-0188). NUMBER OF PAGES 19a. DATES COVERED (From . SECURITY CLASSIFICATION OF: a. THIS PAGE 17. SPONSORING/MONITORING AGENCY NAME(S) AND ADDRESS(ES) 8. AUTHOR(S) 5d. SUBJECT TERMS quasicontinuum. 1215 Jefferson Davis Highway. distribution is unlimited. VA 22202-4302. we report the effectiveness of the truncated Newton-Raphson method and quasi-Newton method with low-rank Hessian update strategy that are evaluated against the full Newton-Raphson and preconditioned nonlinear conjugate gradient implementation available at qcmethod. CONTRACT NUMBER Truncated Newton-Raphson Methods for Quasicontinuum Simulations DAAD19-01-2-0014 5b. Z39. to Department of Defense. Minneapolis.

.............12 3....................................... Solvers for Unconstrained Optimization 3..............................................6 Formulation of Gradient and Hessian .................................................5 Coupling Atomistic and Continuum...............1 Nonlinear Conjugate Gradient Method ..........................................1 4.........2 4........................................................................................................7 2...............1...2................................................1 2..............................16 Simulation of Nanoindentation ............ Numerical Experiments and Results 4..........................................................................8 9 3.................................................................................5.....................................................................4 Continuum Framework.......................1 Line Search Algorithm ....................................11 3...................2 Low-Rank Update Method...........10 NR Methods .............................................................................................2..........10 3..............2 Backtracking......18 20 5............................................ Conclusions iii ...5 v v vi 1 2 Energy Framework ...3 2...............9 3.......................................2.........2 Shearing Grain Boundary............12 3...............2...................... Introduction Quasicontinuum Method 2..........14 15 3..1.......3 Convergence of the NR Method Based on Iterative Solver ......................5...........1 Analytic Formulation of Gradient .........................................................2 2..............................4 2.....................................................2 Analytic Formulation of Hessian ........................ 2...................7 2......................................................................................4 Implementation of the NR Method .....2 Atomistic Framework........................................11 3.............1 Numerical Formulation of Hessian ..........................Contents List of Figures List of Tables Acknowledgments 1...........................

6. References 22 25 28 Appendix. Supporting Derivations Distribution List iv .

.............................. Approximate Hessian computed from previous iteration gradient information.....12 Figure 4.. NR–PCG = TNR.................... the NR. (b) stage 1 of 1st load step............................... and (e) 4th load step......................................................4 Figure 3............ CPU time for NCG.... ..................................... Rank-2 update approximation of Hessian inverse............ (3) stage 2 of 1st load step.......16 Figure 5... and (d) 30th load step............ gray-filled circles are atoms selected as elemental nodes in local region........ and (6) 5th load step.. (2) stage 1 of 1st load step....... Nanoindentation problem: (a) 15th load step...........................17 Figure 7...... NR–PCG = TNR.........17 Figure 6...........17 Figure 8..List of Figures Figure 1..................................... (b) 4th load step............. (b) 20th load step....................... NR.......... (c) 7th load step........... Performance comparison of the NCG............................................. CB rule which transforms the reference lattice into homogeneously deformed (unified deformation gradient tensor F) lattice........................................... NR....................................... Convergence by nonlinear iteration for nanoindentation: (a) initial load step....... ....................13 v .......................... ............. (a) Iterative process of quasicontinuum formulation and (b) local and nonlocal framework: black-filled circles are interface atoms shared by local and nonlocal regions............................ ... (d) 3rd load step............................... Time cost per minimizing iterations for TNR and NCG and number of iteration of linear conjugate gradient for the TNR for shear grain boundary problem............................ Convergence by timing cost (s) for nanoindentation: (1) initial load step... .. . Numerical methods to formulate Hessian matrix....... and QNR based rank-2 update Hessian for 10th load step of nanoindentation problem........................... (4) stage 1 of 10th load step... and TNR for shear grain boundary problem.........3 Figure 2.20 List of Tables Table 1.......... and (d) 11th load step........ Number of minimization iterations for NCG....... ................. .........13 Table 2.. Sliding of grain boundary: (a) 1st load step.19 Figure 11......................... white-filled circles in the local region are formulated using kinematic constraints... ... (c) 2nd load step.......18 Figure 9............ (c) 25th load step...... (5) stage 2 of 10th load step. and TNR for shear grain boundary problem for 10th load step..............................19 Figure 10....

DAAD19-01-20014. MD. Army Research Laboratory (ARL). is also gratefully acknowledged. Special thanks are due for additional support to the Computational and Information Sciences Directorate’s High Performance Computing Division at ARL. Minneapolis. MN. Army High Performance Computing Research Center (AHPCRC) under the auspices of the U. Other related support in the form of computer grants from the Minnesota Supercomputer Institute (MSI). vi .S.Acknowledgments Yu Liang and Ramdev Kanapady are very pleased to acknowledge support in part by the U.S. contract no. Aberdeen Proving Ground.

and Newton-Raphson (NR) methods. The fully nonlocal QC method with variable clusters (12) will be employed in a companion report for three-dimensional (3-D) situations. nonlinear conjugate gradient (NCG). in each iteration. 14. Due to the implicit solution nature of the QC method and the predominant nonconvex nature of the potential energy surface with vast numbers of metastable configurations. problems of interest are those involving a small number of defects such as dislocations. Alternatively. NCG methods (8.. or grain boundary interactions (7). Three popularly used unconstrained optimizing methods are steepest descent (SD). Hessian matrix). Compared to the NCG method. Unconstrained optimization constitutes the key computational kernel of this method (8–11). the Hessian is not computed from previous gradient information. In this report. Second. First. First. In this research. 15) and the NR method (16) converge considerably faster. thereby avoiding a Hessian calculation. the NR method converges even faster as it employs the curvature of the objective function. Typically. in addition to the gradient information. the search direction is obtained by solving the Newton equations. Introduction The quasicontinuum (QC) method is commonly employed to solve a wide variety of multiscale problems (1–6). 1 . an iterative method is employed for the solution of the line search direction. The TNR method differs from the QNR method in two aspects. Secondly. The basic idea behind the QNR method follows from the NCG method by employing the gradients information of previous minimization iterations within the Newton framework. variants of the Newton method are available such as the quasi-Newton (QNR) and the truncated Newton methods (TNR). to predict a search direction during the minimization iterations. a truncated strategy is used such that only an approximate solution to the Newton equations (16–25) is provided. The efficiency of the techniques depends on the time needed to evaluate the energy expression and the number of iterations needed to converge to the minimum. In TNR. In this context. the NR method suffers from two drawbacks. Iterations should not be confused with function evaluations nor the iterations involved with the linear equations solvers within the minimization framework.1. Compared to the SD method (13). a crack. due to the use of the second derivative of an objective function (i.e. which have computational complexity when employing factorization-based direct solvers. the Hessian is a dense matrix where the storage complexity is . the objective is to study minimization solvers in order to optimize computational performance. However. iterative solution techniques are explored for the local/nonlocal variant of the QC method (4) for the two-dimensional (2-D) situations. a minimization iteration is defined as an iteration when the direction vector is updated. one needs to use an effective and efficient iterative minimization solution technique.

The ground state (or state of static equilibrium) at zero temperature is found by minimizing the total potential energy or.5. As this report is to be self-contained. In this section.com for 2-D situations. Through these solvers. Quasicontinuum Method The two key components required for the minimization process are the gradient and Hessian computations. the gradient and Hessian specific to this function is presented. the minimum can be sought by following a combination of the local direction of the topology (i. equivalently. In section 2. Concluding remarks in section 5 follow. 2. we review the basic equations and concepts related to the local/nonlocal variant of the QC formulation for the calculation of gradients and Hessians. The gradient and Hessian are directly determined from the respective first and second gradients of an energy function. the local/nonlocal QC formulation is overviewed as it provides the energy function for our work. 2 (1) . the objective is to determine the stable equilibrium configurations of a deforming crystalline material. only those elements of the QC formulation vital to our discussion are presented.Among these four minimization methods. finding the zero out-of-balance force values of the following: .1 Energy Framework Figure 1a shows the computational framework of a typical QC simulation. We report the effectiveness of the TNR method with the conjugate gradient method for truncating the search direction and QNR method with low-rank Hessian update strategies used for computing the Hessian that are evaluated against the NR and the NCG methods. In section 2. the gradient and its local curvature. In section 4. the minima are identified as the lowest local point. a Lagrangian description is used to formulate the kinematic behavior of the crystal. 2. the four iterative solution techniques considered in this report are overviewed. both the load-increment and resultant deformation are considered finite but small. In section 3. For further details.e.. Results of illustrative examples mainly focus on the number of iterations and CPU time for the 2-D nanoindentation and shearing grain boundary problems. The remainder of this report is organized as follows. in a linear strategy. In each load step. Therefore. the NCG and the NR methods are currently implemented for the local/nonlocal variant of the QC method found at qcmethod. the Hessian). readers are referred to Shames and Dym (4). Over an energy landscape. the result findings of the TNR method and QNR with low-rank Hessian update strategy are evaluated against the NR and the NCG implementation. The heart of the QC method is the formulation of the total potential energy as an ensemble of a function whose independent variables are atom and finite-element (FE) nodal coordinates. When the state of the system is not co-located with a minimum. In this research. we examine the performance differences for two additional solver methods.

gray-filled circles are atoms selected as elemental nodes in local region. . interface nodes. while is derived from by using kinematic constraints. Figure 1b further illustrates that within the nonlocal (atomistic) region. Here. An adaptive partition strategy is required to obtain an accurate description of the ensemble. . are labels for the local region. where the displacement of the representative atoms is incurred by the incremental external force ( ). (2) where E is the strain energy of the crystal.Figure 1. only a small number of atoms (representative -) are selected to describe the local material properties used in the FE method. (a) Iterative process of quasicontinuum formulation and (b) local and nonlocal framework: black-filled circles are interface atoms shared by local and nonlocal regions. Thus. . respectively. and are the associated degrees of freedom of the regions . within the local (continuum) region. white-filled circles in the local region are formulated using kinematic constraints. 3 . nonlocal region. and . and constitute the different components of . respectively. . is computed at the element integration points and at the nodes. The first three. and . In the end. it should be remarked that the partition of the ensemble is not fixed. As illustrated in figure 1b. N is much smaller than the total number of degrees of freedom of the whole ensemble if the ensemble were comprised only of atoms. It is noted that and the transition area nodes provide a seamless coupling between local and nonlocal regions. the target strain energy function may be additively decomposed into local and nonlocal components as follows: (3) Here. The total potential energy can be further represented as: . a fully atomistic description is used to accurately depict the defect of the crystal. respectively. the crystal is divided into two regions—a local (continuum) region and a nonlocal (atomistic) region. and transition area nodes (6). In addition. and and are the reference and deformed atomic coordinates.

there is a one-to-one correspondence between nodes and atoms. . where there is a smaller number of nodes than atoms. By interpolating some of the atoms (i.e. is the site energy of as computed using the EAM potential: . F B3 FB3 FB2 B1 FB1 B2 undeformed lattice deformed lattice Figure 2. i.2. those atoms that are not nodes). In the nonlocal region.. the strain energy is a locally constrained homogeneously deformed system. (4) 4 . CB rule which transforms the reference lattice into homogeneously deformed (unified deformation gradient tensor F) lattice. (4) Here. which postulates that when a monatomic crystal is subjected to a small linear displacement of its boundary. then all atoms will follow this displacement (shown in figure 2).. In the local region.e. which takes the atomic coordinates as direct input (17).2 Atomistic Framework In this work. the energy over all atoms is determined by a kinematic approximation of atom deformation. Like other empirical atomistic models. the energy of the nonlocal region is identical to a fully atomistic region. (7) (5) . Therefore. where (6) and . the energy is formulated from the embedded atom method (EAM). can be approximated as the sum of the energies contributed from individual atomic sites (summation rule). This kinematic approximation is known as the Cauchy-Born (CB) rule.

the deformed relative position of atom i near atom is as follows: . The cutoff is introduced to limit the number of modeled pair-wise interactions. 2. The elemental deformation gradient tensor 5 . Here. Equation 12 is derived in accordance with the CB rule. Implied in equations 10 and 12 is that the present energy measures changes relative to the equilibrium lattice energy of the crystal such that the strain energy at zero deformation (Fe = I) in equilibrium is zero. i (8) If atom i is located in the transition or local region (shown in figure 2). a deformation gradient tensor of element e. M is the number of elements within the local region and is the strain energy density depending on . is the displacement of representative atom . and follows that is the FE shape function. Since a full atomistic strategy is employed to formulate the nonlocal region. and is the interatomic distance. we choose . where .The embedding energy is the energy associated with placing an atom in the electron environment described by local electron density . and is the radius of the cutoff sphere. which effectively truncates longer range interactions and accelerates the convergence of the sum. Assuming that there are neighboring atoms inside the cutoff sphere of atom .3 Continuum Framework The strain energy related to local region is computed by summing the potential energy of all elements within the local region as follows: . It (10) (9) . (12) (11) In these two formulae. where is the number of atoms represented by atom . u β is formulated using the following kinematic constraint: . The pair-potential describes the electrostatic contributions of core ions.

For seamless joining between atomic and finite-element method (FEM) regions. using a specific numerical integration method.4 Coupling Atomistic and Continuum is the total number of atoms inside The interface between local and nonlocal regions is nontrivial. the displacement field of the element is formulated by using the following relation: . According to the principle of the FE method (26. it follows from equation 11 that . and where element e. The link between atomistic and continuum material simulations requires a procedure for determining the nonlinear elastic continuum response of a material with a particular atomistic representation. the work involved is as follows: • • • • Construct deformation gradient for this element. 27). (15) where is the tensor product. Pick a representative atom in the element. and is the displacement of node k. For each element. the FEM region must have an internal atomic structure such that: (1) the assumption that strain in each element is homogeneous is reasonable.(13) describes the local and homogeneous stretching and rotating of element e. Calculate the energy per atom using the atomistic model. Assign energy to the element. . The continuum response should match the results produced by discrete lattice 6 . Consequently. (16) which is evaluated at the quadrature points. (2) elements have a scale. and (3) element corner nodes ( ) are positioned on one of the atoms. (14) where is the number of nodes per element. is the shape function related to elemental node k. Furthermore. 2.

. As a remedy. the deformation cannot be assumed to be uniform at scales approaching lattice dimensions. the out-of-balance nodal forces are computed by the following: 7 . the corresponding gradient and sometimes Hessian have to be computed. 2. even for infinitesimal deformations. In general.5. the energy function is rewritten as follows: . As a result. . where is the first Piola-Kirchhoff stress tensor. while allowing more efficient simulation at larger scales through the application of the nonlinear continuum theory (28). Following equations 10 and 16.simulations at near atomistic length scales. it follows from equation 18 that . (22) (21) (20) (19) (18) (17) where is the Kronecker Delta. Similarly. To this end. where is the element centroid. it is derived that . the interface between local and nonlocal atoms does not preserve force symmetry because atoms in local elements do not feel nearby nonlocal atoms. the so-called ghost force correction is employed. though nonlocal atoms do feel nearby local elements. Therefore.5 Formulation of Gradient and Hessian In order to obtain the equilibrium configuration of the solid. 2. particularly in the presence of lattice defects.1 Analytic Formulation of Gradient Using the chain rule. Using . unconstrained minimization of the potential energy is required. it follows that . In general. where is the weight function related with representative atom . The quasicontinuum procedure introduces degrees of freedom into the stored energy expressions which modify the computed constitutive properties to give better agreement with experimentally determined values.

5. To demonstrate this. . as follows: .(23) . (26) (25) It can be shown that the partial derivative the same is not true for the transpose is zero. Since. It is also useful to note that the general lack of symmetry in forces due to conjoined local/nonlocal domains. we refer to the following equations: . is the atomic level stiffness matrix. and . and is the first Piola-Kirchhoff stress tensor. it can be further obtained that .2 Analytic Formulation of Hessian Based on the gradient in (17). we may simplify equation 18 . also translates to asymmetry in the Hessian. . (24) where is the Lagrangian tangent stiffness tensor. 2. (28) 8 . and . (23) where is the first Piola-Kirchhoff stress tensor and is the element centroid. (27) where the only nonzero term in the summation is when based on the model in figure 2. in practice. However. in the absence of correctively applied ghost forces.

then QNR (29). 3. namely. the search direction at the first iteration is computed employing the steepest descent direction. From the discussions in the previous sections. convergence is said to occur if ||gk|| < ε is achieved. and approximate inverse (B) of the Hessian. Then a general minimization iterative solver framework for unconstrained minimization can be described by algorithm 1. While the NR method has better convergence properties. the computation of an exact H is time-consuming and may require large amounts of storage for large-scale problems. 31) are applicable.1 Nonlinear Conjugate Gradient Method . referred as G before). the Hessian H (equation 24. Normally. and NCG (30. The other search directions can be defined recursively as follows: . The NCG and the NR methods are also briefly reviewed. Solvers for Unconstrained Optimization In this section. steepest descent. •Algorithm 1: General iterative solver framework for minimization. In order to obtain the equilibrium configuration of the solid. (30) 9 .3. The nonlinear conjugate gradient method is of the following form: (29) where αk >0 is the step length and dk is search direction. . It is evident from this algorithm that if only the gradient is available. the TNR and QNR methods are presented. we have gradient g (equation 23). d0 = −g0. In algorithm 1. unconstrained minimization of the total potential energy Π (d) needs to be performed.

which transforms into the minimizing problem . to avoid underapproximating the step length.1. A backtracking line-search method is described in algorithm 1.1. This potential energy can be redefined as a single variable function as . Polak-Ribiere. where . where . Among common implementations.2 Backtracking The step length control in the line search algorithm is mainly governed by the Wolfe conditions. making this a popular class of algorithms for large-scale optimization. which is based on the sufficient decrease condition. such as the FletcherReeves. For nonlinear problems. and Hestenes-Stiefel methods. the Polak-Ribiere nonlinear conjugate method appears to have the best convergence performance (described in algorithm 2).1 Line Search Algorithm Define where d and u are given. Memory requirements are minimal.Nonlinear conjugate gradient methods use search directions that combine the negative gradient direction with another direction. thus performance is always predictable. •Algorithm 3: Backtracking line search: given and 10 . the function minimizer is found exactly within just n iterations. a condition is placed on the curvature and is given by . chosen so that the search will take place along a direction not previously explored by the algorithm. performance is problem dependent. Commonly used line-search methods are the NR method (where is needed) and the backtracking approach (9). For the linear problems. •Algorithm 2: Polak-Ribiere conjugate gradient with given tolerance ε: 3. Secondly. 3. but these methods have the advantage that they require only gradient evaluations. which first requires that to avoid over-approximating the step length .

The advantage of doing this is that fast convergence of the NR method will be obtained as the solution gets closer to the minimum. three point cubic interpolation methods are employed and are described briefly. the computation and storage requirements of the Hessian may become substantial. In large-scale problems.e. In our study. it follows that For efficient and robust implementation of the NR methods. i. where is a positive constant. is tried first.. The current guess should be sufficiently close to so that the truncation error in equation A-1 of the appendix is negligible. will converge to at a quadratic rate. . and the low-rank update method.. If these steps fail to satisfy the criterion for the decrease of the function. gold section.e.2 NR Methods . where (31) and . • • 3. This can be handled by either using the diagonal terms in the Hessian. 3. or avoiding recalculation at each iteration (can be done in instances of slow variation of the second derivative). In this section.. and . the difference method. 11 . we discuss only the low-rank update method.e. The NR method is not necessarily globally convergent.1 Numerical Formulation of Hessian Figure 3 illustrates several strategies in computing the Hessian (or the inverse of Hessian) (16–25). in the full Newton step. Assume that satisfy . Using such that is minimized. The backtracking line-search scheme is recommended in the Newton-Raphson algorithm. i.. the Hessian has to be positive definite to guarantee the search direction a descent direction. the following issues are critical: • To make the method converge toward a local minimizer. In such case. backtrack in a systematic way along the Newton direction. Then.Statement (3) in algorithm 3 is implemented via the bisection method. or polynomial interpolation method (9). i.2. These are the analytic method. i. The formulation of the analytic (exact) Hessian is given by equation 24. . which are categorized into three branches of numerical methods. .e. meaning that it may not converge from any starting point. ignoring the cross terms.

Numerical methods to formulate Hessian matrix. (33) 12 . where . Central Diff. Given as an approximation to the Hessian . an inverse approximate Hessian Table 2 lists four commonly used rank-2 approximate inverse Hessians. which are formulated subject to the following secant condition: . Backward Diff. For example. is more commonly used. Broyden-Fletcher-Goldfarb (BFGS) is generally considered to be the most effective variable metric method. 32. Various conditions are imposed on the approximate Hessian. BFGS PSB Gill−Murry DFP LBFGS Figure 3. super-linear. In many applications.2.2 Low-Rank Update Method A low-rank update method (10. The convergence about the associated gradient is linear.Numerical Strategy to Form Hessian Low Rank Update Analytic Method Difference Symmetric Rank−one Variable Metric Method Forward Diff. (32) .3 Convergence of the NR Method Based on Iterative Solver Let be a sequence converging to . Table 1 lists four commonly used low-rank update methods. and quadratic if there exists constants and such that . and it is usually kept positive definite. the low-rank update method iteratively updates this matrix by incorporating the curvature of the problem measured along the step according to the following secant condition (or Quasi-Newton condition): . 3.2. 3. 33) builds up an approximation to the Hessian by keeping track of the gradient differences along each step taken by the algorithm. its behavior along the step just taken is forced to mimic the behavior of the exact Hessian.

Broyden-Fletcher-Goldfarb (BFGS) .Table 1. 13 . PSB . Powell-Symmetric (PSB) . Rank-2 update approximation of Hessian inverse. BFGS . Davidson-Fletcher-Powell (DFP) where . . Approximate Hessian computed from previous iteration gradient information. SR1 (Symmetric Rank-1 Update ) Table 2. and . DFP . (34) Theorem 1 in the appendix shows that the truncated NR method converges at a super-linear rate. Gill and Murray .

LU factorization. •Algorithm 5: A preconditioned conjugate gradient solver for Gd = g. and (2) a truncated NR method. the NR method is utilized and described by algorithm 2. the optimal search direction can be obtained in the NR method. to obviate the difficulties that accompany its calculation and storage.4 Implementation of the NR Method In our work. update IF ( ENDFOR ) THEN exit. One may also perform a nontrivial port of the NR method onto distributed computer systems using representative-atom-based.. where the Newton equations in statement (8) of algorithm 2 are solved approximately by preconditioned conjugate gradient algorithm (PCG). (2) problems with special sparsity patterns. •Algorithm 4: Newton-Raphson algorithm with given tolerance : (1) (2) (3) (4) (5) (6) (7) (8) (9) Initialize FOR . domaindecomposition methods. /* search direction */ DO /* line search */ Two variants of NR strategies are: (1) a standard NR strategy where the Newton equations are solved using the direct method.. where the initial guess about the solution is given and C is the given preconditioner comprises algorithm 3. Quasi-Newton. The principle of these alternate methods is to obtain an approximation to the inverse of the Hessian matrix. i. However.3. discrete Newton.e.e. and truncated Newton methods arise. i. As a remedy. FOR . an out-of-core 14 .2. till Convergence DO /* Krylov Iteration */ /* update solution */ /* update residual */ /* preconditioning */ Using the Hessian matrix. (1) (2) (3) (4) (5) (6) (7) (8) (9) END FOR . . the memory requirements and associated with solving a linear system involving a Hessian restricts the NR methods only to (1) small-scale problems. or (3) near a solution. In light of these difficulties involving the Hessian.

deformation occurs by processes taking place within the grains. This truncated Newton method is based on the idea that an exact solution of the Newton equation at every step is difficult and unnecessary and can be computationally wasteful in the framework of a basic descent method. Among various methods. NCG method.truncated NR strategy is implemented and tested. While employing conjugate gradients (34) to approximate the search direction. the preconditioning based on incomplete Cholesky factorization (34) has the best convergence performance. Grain boundary sliding and diffusional creep are two major processes of boundary mechanisms. In the boundary mechanisms (7). This is the preconditioning approach employed in this work. only a diagonal-scaling preconditioner is often needed. deformation occurs by processes associated with the presence of grain boundaries. In the lattice mechanisms. Numerical Experiments and Results The deformation of crystalline materials is commonly studied based on lattice mechanisms and boundary mechanisms. the resulting convergence rate of algorithm 6 is strongly dependent on the condition number of . However. considering the large dimension of QC simulations and truncated nature of the search direction. The NR. and TNR methods are employed to simulate these two problems. QNR. The defining feature is that the search direction d in the truncated Newton algorithm is computed using algorithm 6. 15 . •Algorithm 6: Generating search direction d in truncated Newton method: (1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) (12) (13) (14) IF (15) (16) ENDFOR RETURN Initialization: Iteration: FOR ( ) DO /* Section(1): negative curvature test */ IF ( ) RETURN /* Section(2): truncation test */ 4.

Sliding of grain boundary: (a) 1st load step. A convergence tolerance for the PCG is said to be reached once the following condition is met: . available in qcmethod. Figure 4. 10–2. In this report. a diagonal scaling preconditioner is used to accelerate the convergence of PCG. the TNR method reduces to the NR method while.4. and (d) 11th load step. implemented).com) and PCG iterative solver (algorithm 3. NCG. The solutions exactly match with the results of the NCG and NR methods found on qcmethod. In the limit. on the other hand. as the convergence criteria is decreased. Figures 5–7 compare the performance of the NR. 10–1 are employed in our numerical experiments for TNR method and r is the residual in the PCG solution iterations. it will incur more inner 16 . This problem is merely used as a test for convergence properties of the various solution algorithms. (c) 7th load step. the NR method is based on LU factorization (direct solver. (35) where ε =10–7. (b) 4th load step. 10–3. The shearing boundary conditions are applied at the upper and lower boundaries.1 Shearing Grain Boundary The grain boundary is comprised of two twins connected through an angled domain wall. Figure 5 shows that the number of required Newton iterations increases with the increase in the convergence criteria selected for the approximate search direction solver (PCG). Lateral boundaries are connected through periodicity conditions. In our implementation.com. and TNR methods. A grain boundary sliding process (7) employing the TNR method is demonstrated in figure 4.

CPU time for NCG. Figure 6. Number of minimization iterations for NCG. NR. NR. 17 . and TNR for shear grain boundary problem for 10th load step. and TNR for shear grain boundary problem. Figure 7. Time cost per minimizing iterations for TNR and NCG and number of iteration of linear conjugate gradient for the TNR for shear grain boundary problem.Figure 5.

and (d) 30th load step. a nanometer scale indenter is pushed at constant speed from a given height to a given depth into the material (loading). (b) 20th load step. 18 .2 Simulation of Nanoindentation In nanoindentation. and is subsequently retracted to its original position following the same path (unloading). A sequence of deformation of the simulations is shown in figure 8. Figure 6a and b shows the CPU time for the four methods considered. and TNR methods with respect Figure 8.4 s per iterations. In all the cases. the CPU time is less than the NR method. For single crystals the measured load-displacement response. which shows the force required to push the tip a certain distance into the substrate. shows characteristic discontinuities. (c) 25th load step. 4.5–1.PCG iterations. The NR and TNR methods reduce the number of nonlinear iterations by orders of magnitude. Nanoindentation problem: (a) 15th load step. It can be seen that for small PCG convergence tolerance (less than ε = 10–7) and larger PCG convergence tolerances (> ε = 10–1). the CPU time is less than the NCG method. NCG. Figure 10 shows the convergence performance of the NR. and the QNR methods. For tolerance between 10–7 < ε < 10–1. NR. Figure 9 shows performance comparisons of the NCG. From the figure. the CPU time is greater than the NR method. it is clear that the QNR performance is almost the same as the NCG method. Figure 7 shows that the NR method takes about 66 s per iteration and NCG only takes 0.

NR–PCG = TNR. the NR.Figure 9. (b) stage 1 of 1st load step. and QNR based rank-2 update Hessian for 10th load step of nanoindentation problem. and (e) 4th load step. Figure 10. (d) 3rd load step. (c) 2nd load step. Performance comparison of the NCG. 19 . Convergence by nonlinear iteration for nanoindentation: (a) initial load step.

1st load step and 10th load step.to various stages of the indentation process. (4) stage 1 of 10th load step. and (6) 5th load step. 5. with ε = 10–2 and with ε = 10–3 during the initialization stage. Figure 11. This is same performance as in the case of shear grain boundary problem considered previously. the computational cost incurred for a full Newton method. A truncated Newton method may therefore be employed with a careful selection of the convergence tolerance such that the search directions are predicted with reasonable accuracy. Many complexities left unconsidered in this work still remain. figure 11 shows the CPU times of the NR and NCG methods for various load stages. NR–PCG = TNR. 20 . Further. we can reasonably conclude that the NR methods are more efficient than the NCG methods. (3) stage 2 of 1st load step. with ε = 10–2 and with ε = 10–3 during the initialization stage. can be prohibitive in general. in which the Hessian is recomputed and stored at every instance. (5) stage 2 of 10th load step. First. (2) stage 1 of 1st load step. 1st load step and 10th load step. However. Convergence by timing cost (s) for nanoindentation: (1) initial load step. It can be seen that NCG takes the largest number of iterations followed by TNR. Conclusions In the context of the two example problems studied in this report. It can seen again that NCG takes most CPU time followed by the TNR method.

the “region of viability” of linear solvers can be rather small in the overall energy landscape of possible solutions. 21 . Such approaches are therefore intrinsically unsuitable for capturing the “nonconvex” issues involved in atomistic problems. the performance of matrix methods that are founded on linearization principles heavily depend on the condition of the material. Thus. Thus.the nature of atomic forces and interactions are highly nonlinear with respect to atomic coordinates. and methods such as the proposed TNR method may suffer significant limitations in general problems. in implicitly solving for quasistatic configurations of the material. one can identify states of deformation that result in instantaneous negative-definite configurations of atoms that violate the presumptions needed to employ linear solvers. namely its proximity to a minimum. Indeed.

Y. technique reports. G. R.ac. B. M. 1.. Matrix Computations. Yuan. 81 (3). 19. L. 2003. Shenoy.. Quasi-Newton Methods. 1991. 12. 47. E. H. 6. 1995. 501–516. Miller. E.. Beere. PA. 2002 (http://www. 358–372. O’Leary. C. in Linear and Nonlinear Conjugate Gradient-Related Methods. L. Dym. Nazareth. J. Journal of the American Ceramic Society 1998. 4. 22 . et al. M. A. Atomistic Simulations of Materials Fracture and the Link between Atomic and Continuum Length Scales. Shames. SIAM Rev. S. Curtin. F. I. 1996. 2. SIAM Journal on Optimization 1999. 3. J. Full Text Available A Nonlinear Conjugate Gradient Method with a Strong Global Convergence Property.html). PA. Journal of the Mechanics and Physics of Solids 1999. Eds. Lond. M. Solids 2001. S. Adams.. An Adaptive Finite Element Approach to Atomic Scale Mechanics the Quasicontinuum Method. Stresses and Deformation at Grain Boundaries. An Analysis of the Quasicontinuum Method. Nocedal. 9. Soc. R... Phys. et al. An Introduction to Algorithms for Nonlinear Optimization. Leyffer. 1899–1923. D. A Numerical Study of the Limited Memory BFGS Method and the Truncated-Newton Method for Large-scale Optimization. Philosophical Magazine A 1996.. 1977.. SI Unit Edition.. Van Loan. 7. 11. Gould. V. 10 (1). 10. Quasicontinuum Analysis of Defects in Solids. Atomistic/Continuum Coupling in Computational Materials Science.6. Y.. L. W. Ortiz. and Eng. J. 2nd ed. R.. Tadmor. RAL-TR-2002-031. Phil. Energy and Finite Element Methods in Structural Mechanics. H. 73.. Dai. Mor’e. 1993. N. MD. J. Golub. 8. 49. The Johns Hopkins University Press: Baltimore. Jr. Ortiz. J.numerical. A288 1978. J.. E. References 1. 46–89. Dennis. SIAM: Philadelphia. 5. G. 611–642. P.uk/reports /reports. 13.. Conjugate Gradients and Related KMP Algorithms: The Beginnings. pp 1–8. B. 33–88. Modeling Simulation of Material Sci. Taylor & Francis: Philadelphia.. F.. I. Trans. H. W.. Motivation and Theory. 1529–1563. Knap.rl. SIAM J. Mech. C. Cleri. Optim. Nash. J. 11. Phillips. 177–196.

18. M. G.. S. SIAM J. SIAM J. Numerical Methods for Unconstrained Optimization and Nonlinear Equations. A. 1299–1316. Anal. 1996. Opt. 23 . Rose. II: Implementation Examples. Tadmor.. P. Dembo. 1285–1288. 1976. 71–111. 25. 20. 19. 1984. R. 26. M. Langmuir 1996. Statist. 1983. D. Num. A Generalized Conjugate Gradient Method for the Numerical Solution of Elliptic Partial Differential Equations. NY. Trans. et al.. M. 2002. 17. Theory of Algorithms for Unconstrained Optimization. 23. E. Phys. Nonlinear Continuum Mechanics for Finite Element Analysis. 693–726. R. 21. Quantum Mechanical Calculation of Hydrogen Embrittlement in Metals. Bonet.. A. Preconditioning of Truncated-Newton Methods. M. S. R. Mixed Atomistic and Continuum Models of Deformation in Solids. ACM. 400–408. Daw. T. 1985. Rev. Nash. Y. Inexact Newton Methods. 38.: Boston.. Ortiz.. 9. 46–70. Comput. B. Efficient Implementation of the Truncated Newton Method for Large Scale Chemistry Applications. Nocedal.. Prentice-Hall. E. Dennis. Eds. J. R. J. Preconditioned Conjugate-Newton and Secant-Newton Methods for Non-linear Problems. Phillips. Cambridge University Press: New York. Xie. D. D. J. 22. Tadmor. 4529–4534. Steihaug. 27... Journal of Computer-Aided Materials Design 2002. 50. 12. M. Fogelson. International Journal for Numerical Methods in Engineering 1989. C. I: Algorithm and Usage.. Schlick. Software 1992. 203–239. Schlick. R. Schlick. ACM. Bunch. 24. T. 6. S. R. Jr. Miller. Semiempirical.... MA. Powell. 199–242. E. Sparse Matrix Computation. C.. Eisenstat. Schnabel. TNPACK A Truncated Newton Minimization Package for LargeScale Problems. T. 1982. SIAM J.. Inc. Wood.: Englewood Cliffs. Academic Press: New York. B. J.. 28. TNPACK A Truncated Newton Minimization Package for LargeScale Problems. T. D. Baskes. Updating Conjugate Directions by the BFGS Formula. Software 1992. Saad. Papadrakakis. 18. Fogelson. B. Programming 1987. 19. J. Math. Gantes. The Quasicontinuum Method: Overview. Math. Applications and Current Directions. 132–154. Lett. 15. 1999. 10. Math. E. 28. J. 599–616. Concus.. 18. Acta Numerica 1992. Iterative Methods for Sparse Linear Systems. Sci.14. NJ. NY. J. Trans. 16. PWS Publishing Co.

30. 155–175. 31. Comp. 27. 297–330. P. F. 368–381. Broyden. SIAM J. T. 1984. C. 33. G. Horwood Publishing Ltd. Math. 34. C. L. QN-like Variable Storage Conjugate Gradients. 1998. On Diagonally Preconditioning the Truncated Newton Method for Super Scale Linearly Constrained Nonlinear Programming. 1967. 2001. 24 . Brown. N. Leonard. E. M. 21. Res. 12 (1).29. A. Reduced-Hessian Quasi-Newton Method for Unconstrained Optimization. A. SIAM J.. Gill. Ross. 401–414. W. Advanced Applied Finite Element Methods. Saad. Quasi-Newton Methods and Their Application to Function Minimization. European J. Escudero. Mathematical Programming 1983. Optim.: Chichester. 4.. Opt. 32. LeNir. Oper. P. Convergence Theory of Nonlinear Newton-Krylov Algorithms. F. 17. Y. 1994. London. 209–237. Buckley..

Supporting Derivations A. which leads to the following Newton equation: .Appendix. Mathematically. where both gradient and Hessian error. A line search is still required to find an appropriate . and the step length is set to be 1. it is known that is a symmetric positive definite (SPD) matrix. By ignoring the truncation . Taylor’s theorem implies the following: 25 . (A-2) .0 by default due to the existence of truncation error in equation A-1. which approximates the elements of the Hessian using a finite differencing technique. (A-3) (A-4) (A-5) As a result. resulting in is always symmetric if is twice continuously differentiable around . According to the necessary condition of minimization.2 Difference Hessian The difference Hessian method is evaluated for a large scale three-dimensional situation in the companion report.1 Derivation of Newton-Raphson (NR) Method The NR method is based on a quadratic approximation to the given function minimization iteration. A. and is motivated by Taylor’s theorem. Assuming that is the minimizer. When the second derivatives of objective function exists and are Lipschitz continuous. then . NR is obtained by defining the search direction as . it follows from equation A-1 that are evaluated at in each (A-1) . The Taylor expansion yields the following: .

e. a local minimizer 3. has a super-linear convergence rate. is nonsingular and that is Lipschitz continuous at . . based on an iterative solver. the search direction is obtained (A-12) with the following convergence criterion: . it only requires the numerical evaluation of the gradient g. let where be a small increment such that we obtain the following: .. an approximate Hessian can be formulated using the following central difference method: (A-10) Let B be the approximation of G by ignoring the truncation error. 26 (A-13) is continuously differentiable in a neighborhood of .e. it follows that (A-6) (A-7) (A-8) Similarly. According to equations A-8 and A-9. The truncated NR method is employed to find by solving the following Newton equation: . By substituting . ...” is scalar product. Theorem 1: Convergence Rate of TNR: Assume that 1. where “. Then the truncated NR method (see references 16–25 in section 6 of this report). (A-11) (A-9) Then the difference Hessian method is easily formulated and suitable for parallel computing (each element of the Hessian is formulated independently). of . In addition. i. i. 2.

if is sufficiently close to . then equation A-14 holds. (A-14) for all that is sufficiently Proof: Let close to be SPD and assume there exists a constant . (A-17) 27 . it follows that . As a result. using equation A-13. (A-16) According to the Taylor’s expansion about the gradient Thus. Thus. then . it follows from equations A-12 and A-13 that .where is the accuracy tolerance. it follows that . (A-15) .

NO. OF COPIES ORGANIZATION 1 DEFENSE TECHNICAL (PDF INFORMATION CTR ONLY) DTIC OCA 8725 JOHN J KINGMAN RD STE 0944 FORT BELVOIR VA 22060-6218 1 US ARMY RSRCH DEV & ENGRG CMD SYSTEMS OF SYSTEMS INTEGRATION AMSRD SS T 6000 6TH ST STE 100 FORT BELVOIR VA 22060-5608 INST FOR ADVNCD TCHNLGY THE UNIV OF TEXAS AT AUSTIN 3925 W BRAKER LN AUSTIN TX 78759-5316 DIRECTOR US ARMY RESEARCH LAB IMNE ALC IMS 2800 POWDER MILL RD ADELPHI MD 20783-1197 DIRECTOR US ARMY RESEARCH LAB AMSRD ARL CI OK TL 2800 POWDER MILL RD ADELPHI MD 20783-1197 1 1 3 ABERDEEN PROVING GROUND 1 DIR USARL AMSRD ARL CI OK TP (BLDG 4600) 28 .

OF COPIES ORGANIZATION 1 DIR USARL AMSRD ARL SE E N FELL 2800 POWDER MILL RD ADELPHI MD 20783-1197 DIR USARL AMSRD ARL SE EI N DHAR J LITTLE 2800 POWDER MILL RD ADELPHI MD 20783-1197 DIR USARL AMSRD ARL SE RL P AMIRTHARAJ L CURRANO J PULSKAMP A WICKENDEN 2800 POWDER MILL RD ADELPHI MD 20783-1197 ABERDEEN PROVING GROUND 22 DIR USARL AMSRD ARL CI H C NIETUBICZ AMSRD ARL CI HC J BLAUDEAU J CLARKE C CORNWELL B HENZ R NAMBURU D SHIRES A YAU AMSRD ARL WM M GREENFIELD J MCCAULEY J SMITH T WRIGHT AMSRD ARL WM BD B FORCH M HURLEY AMSRD ARL WM MA J ANDZELM M VAN LANDINGHAM AMSRD ARL WM TA S SCHOENFELD AMSRD ARL WM TD T BJERKE J CLAYTON K IYER B LOVE S SEGLETES 2 4 29 .NO.

30 .INTENTIONALLY LEFT BLANK.