
ROVIBRATIONAL SPECTROSCOPY CALCULATIONS USING A WEYL-HEISENBERG WAVELET BASIS AND CLASSICAL PHASE SPACE TRUNCATION

by

RICHARD LUZI LOMBARDINI, B.S., M.S.

A DISSERTATION IN PHYSICS

Submitted to the Graduate Faculty of Texas Tech University in Partial Fulfillment of the Requirements for the Degree of DOCTOR OF PHILOSOPHY

Approved

Bill Poirier
Chairperson of the Committee

Greg Gellene

Wallace Glab

Thomas Gibson

Accepted

John Borrelli
Dean of the Graduate School

August, 2006

Copyright 2006, Richard Luzi Lombardini

ACKNOWLEDGMENTS

This was my first attempt at exploring the unknowns of theoretical physics and chemistry. Any successes that I achieved were only possible through the patience and guidance of my advisor, Bill Poirier. My parents, Barry and Leila Lombardini, gave me unconditional encouragement that was vital to the completion of this degree. Peers and colleagues in the chemistry department, Jason McAfee, Justin Rajesh Rajian, and Buddhadev Maiti, and especially those who were my "cellmates" in the Poirier lab, Sean Xiao, Junkai Xie, Wenwu Chen, Jason Montgomery, Akbar Salam, and Corey Trahan, made the experience enjoyable and less frustrating, since we were all in the same boat. Some of these calculations (Chapter IV) were done on "Jazz," a 350-node computing cluster operated by the Mathematics and Computer Science Division at Argonne National Laboratory, and I am grateful to the staff at Argonne for their technical help. Some of the staff (Srirangam Addepalli and David Chan) at the Texas Tech High Performance Computing Center were very helpful with issues involving parallel programming. Last of all, I would like to recognize and thank my committee members (the three Gs), Greg Gellene, Wallace Glab, and Thomas Gibson, for trying to make sense of it all.


TABLE OF CONTENTS

ACKNOWLEDGMENTS
ABSTRACT
LIST OF TABLES
LIST OF FIGURES
I. MOTIVATION AND BACKGROUND
    1.1 Basis Sets
    1.2 Organization of Dissertation
II. ROVIBRATIONAL SPECTROSCOPY CALCULATIONS OF Ne$_2$
    2.1 Introduction
    2.2 Theoretical Background
        2.2.1 Phase Space Truncation
        2.2.2 Weylet Basis Set
    2.3 Matrix Representations and Numerical Implementation
        2.3.1 Momentum-Symmetrized Gaussian Expansion
        2.3.2 Matrix Representation: 1 DOF Case (Radial Ne$_2$)
        2.3.3 Numerical Implementation: 1 DOF Case (Radial Ne$_2$)
        2.3.4 Matrix Representation: 3 DOF Case (Cartesian Ne$_2$)
        2.3.5 Numerical Implementation: 3 DOF Case (Cartesian Ne$_2$)
    2.4 Results and Discussion
        2.4.1 Results for Radial Ne$_2$ (1 DOF Case)
        2.4.2 Results for Cartesian Ne$_2$ (3 DOF Case)
        2.4.3 Discussion
III. CUSTOMIZED PHASE SPACE REGION OPERATORS APPLIED TO BASIS SETS
    3.1 Introduction
    3.2 Theoretical Background
        3.2.1 Weylets and Momentum-Symmetrized Gaussians (SGs)
        3.2.2 Phase Space Region Operator (PSRO)
        3.2.3 PSRO for the Harmonic Oscillator (HO)
    3.3 Numerical Implementation
        3.3.1 Morse Oscillator (1 DOF)
        3.3.2 Morse/Harmonic Oscillator (2 DOF)
        3.3.3 Harmonic Oscillator (HO)
    3.4 Results and Discussion
        3.4.1 Results for Morse Oscillator (1 DOF)
        3.4.2 Results for Morse/Harmonic Oscillator (2 DOF)
        3.4.3 Results for Harmonic Oscillator (HO)
        3.4.4 Discussion
IV. PARALLEL PREPROCESSED SUBSPACE ITERATION METHOD
    4.1 Introduction
    4.2 Theoretical Background
    4.3 Parallel and Numerical Implementation
    4.4 Results and Discussion
REFERENCES
APPENDICES
    A. JUSTIFICATION OF SPHERICAL TRUNCATION CONDITION
    B. EIGENFUNCTIONS OF HARMONIC OSCILLATOR PSRO
    C. EIGENVALUES OF HARMONIC OSCILLATOR PSRO
    D. JUSTIFICATION OF PREPROCESSING AND SUBSPACE ITERATION METHOD

ABSTRACT

New basis set methods are examined regarding quantum mechanical calculations of energy levels and wave functions of bound systems. The first method (I) involves compact orthogonal wavelets as the basis set, which is subsequently truncated using the guidance of a classical phase space picture of the system. In this dissertation, the first application of this technique to a real molecular system (neon dimer) is presented, and many of the technical details are developed for its use on any arbitrary system. Although in many respects neon dimer represents a worst-case scenario for the method, it is still competitive with another state-of-the-art scheme applied to the same system. The second method (II) greatly improves the computed accuracies of the first through the introduction of phase space region operators, which increase the efficiency K/N of the basis set, where N is the number of basis functions needed to calculate K energy eigenvalues to a given level of accuracy. For one model system, the absolute error of the computed energy levels is reduced by nearly 4 orders of magnitude, as compared to method I. Finally, a new parallel algorithm for matrix diagonalization (method III) is introduced, which uses a modified subspace iteration method. The new method exhibits great parallel scalability, making it possible to determine many thousands of accurate eigenvalues for sparse matrices of order $N \approx 10^6$ or larger.

LIST OF TABLES

2.1 Gaussian Cap Parameters
2.2 Ground and First Excited Vibrational (J = 0) Level Energies for 1 DOF Radial Ne$_2$
2.3 Ground and First Excited Vibrational (J = 0) Level Energies for 3 DOF Cartesian Ne$_2$
2.4 Rovibrational Bound States of Ne$_2$
3.1 Vibrational Bound Energy Levels of 1 DOF Morse Oscillator
3.2 2 DOF Morse/Harmonic Oscillator
3.3 1 DOF Harmonic Oscillator
3.4 2 DOF Harmonic Oscillator
3.5 3 DOF Harmonic Oscillator
3.6 4 DOF Harmonic Oscillator
3.7 2 DOF Harmonic Oscillator (Multiple Application of PSRO)
4.1 Parallel Algorithm Versus Nonparallel Direct Diagonalization

LIST OF FIGURES

2.1 Various Weylet Basis Sets in 1 DOF
2.2 Plots of Symmetrized 1 DOF Weylets in Configuration Space
2.3 Lennard-Jones Potential for Ne$_2$ System
2.4 Modified Lennard-Jones Potentials (Gaussian Cap)
2.5 Phase Space Truncations for 1 DOF Radial Ne$_2$ System (J = 0)
2.6 Phase Space Truncations for 3 DOF Ne$_2$ System
3.1 Classically Allowed Region of Weylets and 1 DOF Harmonic Oscillator
3.2 Wigner-Weyl Representations of Weylets and 1 DOF Harmonic Oscillator
3.3 Classically Allowed Regions of Weylets and 1 DOF Morse Oscillator
3.4 Classically Allowed Region of Weylets and 2 DOF Morse/Harmonic Oscillator
3.5 Efficiency Versus N
3.6 Efficiency Versus DOFs
4.1 Node Communication Setup
4.2 Node Communication Setup for the Last Stage
4.3 Number of Eigenvalues at a Relative Accuracy Versus N for the 3 DOF Isotropic Harmonic Oscillator
4.4 Number of Eigenvalues at a Relative Accuracy Versus N for the 3 DOF Anisotropic Harmonic Oscillator
4.5 Number of Eigenvalues at a Relative Accuracy Versus N for the 6 DOF Isotropic Harmonic Oscillator

CHAPTER I
MOTIVATION AND BACKGROUND

By themselves, weakly bound or "floppy" molecular systems are very interesting in that they defy notions of typical molecular behavior. However, the particular study of clusters, i.e., 2 to 1000 monomers (either atoms or molecules) bound together by van der Waals forces or hydrogen bonding, opens new doors of exploration of physical chemistry at the fundamental level, with respect to the ability to control and vary the cluster size. With the advancement of experimental techniques involving the synthesis of clusters at variable sizes [1], in conjunction with high-resolution spectroscopy at frequencies ranging from the microwave to the ultraviolet realm, experimentalists have been able to work closely with theorists, whose struggles are obviously not size control per se but, instead, severe size limitations for accurate calculations due to the inherent anharmonicities coming from the weak bonds (to be addressed in more detail later).

Within the realm of cluster research, one field of study involves vibrationally induced dynamical phenomena (in hydrogen-bonded molecular complexes), which is important for gaining a better understanding of energy transfer from reactants to products in chemical reactions. Such investigations allow one to gain insight into the fine line that separates time-independent spectroscopy from time-dependent chemical dynamics. The hydrogen fluoride dimer (HF)$_2$ is a good example; here, the main focus lies on vibrational predissociation, more specifically, the transfer of energy from the excited HF monomer vibration to the weak intermolecular bond, which eventually breaks. To gain the full picture of the process, both theoretical [2] and experimental [3] methods traditionally used for bound states are needed in conjunction with scattering calculation techniques [4] and experimental setups [5,6] conducive to measuring dissociation probabilities.

Another area of interest, for which clusters are ideally suited, is the study of how microscopic properties evolve into macroscopic behavior, since clusters serve as a type of transitional form of matter, intermediate between independent units and bulk matter [7,8]. One example is the study of first order phase transitions. At the cluster level, systems go through phase changes rather than phase transitions. One key difference is that melting and freezing points are not identical [9]. For the range of temperatures in between, the system can hop back and forth between phases, similar to the dynamics of coexisting isomers. For rare gas clusters, especially those comprised of lighter atoms (such as Ne), quantum effects can have a great influence on these unusual phases, sparking the development of efficient theoretical quantum statistical techniques for many-body systems [10]. In principle, as one increases the size of the cluster, the phase changes should approach the familiar first order phase transitions of statistical mechanics, although this bridge has never been crossed theoretically or experimentally, since typical bulk sizes (or sizes at which these macroscopic physical properties start to appear) are so much larger than the cluster sizes that one can directly manipulate. The same can be said for ion solvation, which is examined at the molecular level with cluster-size solvent shells in the gas phase (very different from the situation in solution). Stace [11], however, argues that one can extract useful thermodynamical information for the bulk counterpart from studying individual ion-solvent intermolecular interactions [12].

Strictly on the theoretical front, the anharmonic behavior of the large amplitude motion of cluster systems, due to multiple shallow minima separated by low isomerization and dissociation barriers on the potential surface, renders accurate spectroscopic calculations of energy levels extremely challenging.
Despite the simplification offered by the Born-Oppenheimer approximation, in which the electronic degrees of freedom (DOFs) can be effectively removed, the numerical solution of the nuclear time-independent Schrödinger equation becomes computationally expensive very quickly with respect to increasing cluster size, since, for the most part, the nuclear DOFs must be treated as being coupled. The traditional picture of treating the system approximately as decoupled single-DOF vibrational oscillators, and subsequent normal mode analysis, breaks down, due to the large anharmonicity, which cannot be regarded as a small perturbation.

Using basis set methods, which are the method of choice throughout this dissertation, the coupling of all DOFs translates to the need for a rather large matrix representation (or equivalently, a basis set of large size N) of the nuclear Hamiltonian operator, $\hat{H}$, in order to numerically calculate accurate energy values and wave functions of $\hat{H}$. Diagonalization of the corresponding $N \times N$ matrix representation $\mathbf{H}$ (ordinarily, the computational bottleneck) directly provides the linear expansion coefficients of the eigenfunctions of $\hat{H}$, in terms of the N basis functions. With sufficiently large N, these numerical solutions approximate the true wave functions closely, at least towards the bottom of the spectrum; the corresponding eigenvalues of $\mathbf{H}$, in the same sense, should be close to, yet never below [13], the actual eigenvalues of $\hat{H}$. If there is any separability in the Hamiltonian or decoupling of DOFs, the Hamiltonian can be broken up into smaller parts, each of which can be represented by smaller matrices (since there are fewer DOFs) and solved separately. The worst, but most typical, case of no decoupling may require one to be very innovative when choosing an appropriate coordinate system and basis set, with the purpose of making the computations tractable. For optimal coordinate systems, Bačić and Light [14] suggest four crucial criteria to consider: the coordinates should (1) have boundary conditions that cover all of the configuration space that the system can possibly span, (2) exploit the highest symmetry of the system, (3) be orthogonal, such that the kinetic energy operator has the least possible number of cross terms, and (4) if possible, be carefully chosen such that there is the least amount of coupling between vibrational modes (or DOFs).
Throughout this dissertation, Cartesian coordinates are used (with a slight exception in the 1 DOF case of Chapter II), which fully satisfy conditions (1) and (3); the requirements of (2) and (4) are not dealt with, although these are worth exploring in future works. Although curvilinear coordinates could successfully address (4), they are, in general, more difficult to use and lack the universality of Cartesian coordinates. As discussed in Ref. [15], one does not have to deal with variable coordinate limits and boundary conditions, cross terms in the kinetic energy operator, and other system and/or coordinate specific adjustments when using Cartesian coordinates. Despite the benefits, the foremost reason these are used here is that the Weyl-Heisenberg wavelets or "weylets," which are the basis functions of choice throughout this dissertation, are at present only defined in Cartesian coordinates.

This dissertation mainly focuses on basis sets and the computational techniques that accompany them, with the expectation that the combined effort can lead to successful calculations of weakly bound cluster systems that would otherwise not be possible using conventional methodologies. A key metric in any discussion of basis set methods is the basis set efficiency K/N, where N is the number of basis functions needed to compute K eigenvalues to a desired level of accuracy. The next section provides a discussion of the basis set efficiency for the present basis set ideas, leading up to a justification for the use of phase space truncated weylets.

1.1 Basis Sets

Despite their nonorthogonality, one popular choice of basis is the real Gaussian functions in configuration (position) space. Gaussians of 1 DOF are simple in that each function depends upon only two parameters, width and center. Also contributing to their popularity is the convenience they provide in calculating Hamiltonian matrix elements: the kinetic and overlap matrix elements have analytical representations, and the potential matrix elements may always be obtained using Gauss-Hermite quadrature [16], typically requiring very few quadrature points, due to the good localization and lack of oscillations of the Gaussian function. The localization property allows for two further benefits. First, in 1 DOF at least, one can effectively target eigenstates of an arbitrary Hamiltonian below a chosen energy value. Second, the localization results in effectively sparse (most matrix elements nearly zero) Hamiltonian and overlap matrices, allowing for the application of fast computational techniques.
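The matrix element recipe described above can be sketched in a few lines of Python. This is a minimal illustration rather than code from the dissertation: it assumes units with $\hbar = m = 1$, evenly spaced Gaussians sharing a single exponent `a`, and a harmonic test potential, and the function name `gaussian_basis_levels` is hypothetical. The overlap and kinetic matrices use the analytic formulas for equal-width normalized Gaussians, the potential matrix uses Gauss-Hermite quadrature, and the nonorthogonality is handled by orthogonalizing with the eigenvectors of the overlap matrix.

```python
import numpy as np

def gaussian_basis_levels(centers, a, potential, n_quad=12):
    """Energy levels of H = p^2/2 + V(x) (hbar = m = 1) in a basis of
    normalized real Gaussians g_i(x) = (2a/pi)^(1/4) exp(-a (x - x_i)^2)."""
    x = np.asarray(centers, dtype=float)
    d = x[:, None] - x[None, :]            # center separations x_i - x_j
    mid = 0.5 * (x[:, None] + x[None, :])  # center of the product g_i * g_j

    # Analytic overlap and kinetic energy matrices.
    S = np.exp(-0.5 * a * d**2)
    T = 0.5 * a * (1.0 - a * d**2) * S

    # Potential matrix by Gauss-Hermite quadrature: g_i * g_j equals S_ij
    # times a normalized Gaussian of exponent 2a centered at the midpoint.
    t, w = np.polynomial.hermite.hermgauss(n_quad)
    pts = mid[..., None] + t / np.sqrt(2.0 * a)
    V = S * (potential(pts) @ w) / np.sqrt(np.pi)

    # Nonorthogonal basis: solve H c = E S c by orthogonalizing with S.
    s, U = np.linalg.eigh(S)
    X = U / np.sqrt(s)                     # X^T S X = identity
    return np.linalg.eigvalsh(X.T @ (T + V) @ X)

# Assumed test case: harmonic oscillator, exact levels 0.5, 1.5, 2.5, ...
centers = np.arange(-6.0, 6.0 + 1e-9, 0.5)   # 25 evenly spaced centers
levels = gaussian_basis_levels(centers, a=2.0, potential=lambda x: 0.5 * x**2)
print(levels[:4])
```

The exponent and spacing are illustrative choices that keep neighboring Gaussians strongly overlapping without making the overlap matrix numerically singular; the semiclassical placement schemes cited in the next paragraph choose these parameters far more carefully.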

In addition to the above inherent characteristics, the literature discusses improvements upon the efficiency of the Gaussian basis set, i.e., basis optimization schemes for selectively choosing the center and width of each Gaussian function, sometimes guided by physical arguments. Earlier works [17,18] developed accurate algorithms for choosing the parameters of a single DOF Gaussian basis, in accordance with a balance of semiclassical (SC) arguments and basis overlap criteria. The ultimate goal was to achieve optimal efficiency while avoiding linear dependency issues. The SC methods produce a basis of narrow Gaussian functions densely distributed in regions of high momentum (low potential) and wide functions sparsely distributed in regions of low momentum (high potential) in configuration space. Unfortunately, the strict SC-Gaussian methods do not work well in more than 1 DOF, due to a lack of either straightforward applicability [17] or optimal convergence [18]. Refs. [17] and [19]-[22] address this issue and propose new approximate SC methods with [19,20,22] or without [17,21] additional non-SC techniques. Other sources [23,24] report alternatives to SC thought in Gaussian basis development. The latter techniques are critical for achieving respectable efficiencies on systems involving high energy, heavy particle dynamics.

Other methods have been developed using different basis functions, such as nonlocal plane waves [25,26], configuration space grid functions [27-29], or wave functions of solvable Hamiltonian systems [30]. The corresponding methodologies involve either point transformations of the coordinates of the basis functions [27,28,30], the variational principle [29], or a combination of both [25,26]. Although impressive efficiencies have been achieved at low DOFs, all of these approaches have compromising issues at high DOFs. The methods of Refs. [25]-[29], for instance, exhibit exponential decay of the efficiency as the number of DOFs increases, which is an inherent problem of direct product basis sets (DPBs) [14,24,29,31]. Cargo and Littlejohn show that their proposed canonical transformation does not produce efficiencies at higher DOFs tantamount to the results obtained in their 1 DOF Morse oscillator example [30].

In this dissertation, a different basis methodology is used, the only one currently known that formally defeats exponential scaling. The method hinges upon a phase space picture that is applied to both the representational basis set and the desired eigenstates of the target application system. It exploits the fact that the simple phase space picture used becomes exact in the Wigner-Weyl (WW) sense [32,33], in the large basis limit. More specifically, in order to decide upon an optimal basis set, one must have a clear picture of how the wave functions, $|\psi_i\rangle$, of $\hat{H}$ are represented on phase space. The set of K orthonormal $|\psi_i\rangle$'s that lie below some energy $E_{\max}$ span a subspace of the Hilbert space that can be represented by the projection operator

$$\hat{\rho}_K = \sum_{i=1}^{K} |\psi_i\rangle\langle\psi_i| . \tag{1.1}$$

The WW phase space representation of this operator (labeled $\rho_K$) is a probability distribution function that is well-contained within the classically allowed region, $R$, i.e., the region enclosed by the energy surface $H = E_{\max}$, where $H$ is the classical phase space Hamiltonian function corresponding to $\hat{H}$ under the WW mapping. The phase space picture presumes that $E_{\max}$ is quantized such that the possible volumes of $R$ are $K(2\pi\hbar)^f$, where $f$ is the number of DOFs. In other words, each $|\psi_i\rangle$ corresponds to a non-overlapping region of volume equal to that of a Planck cell. This picture becomes more accurate as K increases, in the limit of which $\rho_K$ approaches a uniform value of one within $R$ and zero outside [34].

The phase space picture, as discussed above, can be applied to any type of representational basis set, provided the basis is orthogonal. It can also be generalized for nonorthogonal basis sets, although in this case the picture must be modified in subtle ways. For instance, complex-valued or phase space Gaussians (PSGs), at first glance, seem to be very good candidates for basis functions, especially under the guidance of the phase space picture. First, their average momentum and position values, which are also the parameters that distinguish one PSG from another, are simply the centers of their real-valued WW Gaussian representations. Each PSG can be mapped to another by Weyl-Heisenberg phase space translation operators, hence they are also referred to as Weyl-Heisenberg coherent states. Second, the WW representation of each function is well-localized within a Planck cell region around the center. However, this property does not necessarily extend to a finite collection of PSGs on phase space. This can be best explained by introducing the projection operator $\hat{\rho}_N$ which represents the finite set of PSGs, $|\phi_i\rangle$, i.e.,

$$\hat{\rho}_N = \sum_{i,j=1}^{N} [S^{-1}]_{ij}\, |\phi_i\rangle\langle\phi_j| \tag{1.2}$$

where each element of the overlap matrix, $S$, is given by $S_{ij} = \langle\phi_i|\phi_j\rangle$. Unlike Eq. (1.1), Eq. (1.2) has cross terms due to the nonorthogonality of the PSGs. The magnitudes of these terms are directly related to the degree of nonorthogonality of the PSGs and also signify the degree of collective delocalization. Consequently, $\rho_N$ can have significant probability far from the centers of all N individual PSGs, depending upon how closely these are bunched together. Thus, although individual PSGs are well-localized, collectively they need not be. This has important ramifications for the phase space truncation scheme [35-37], i.e., the method used to restrict the representational basis in order to achieve the highest possible efficiency. This method is extremely simple: retain only those basis functions whose centers lie within or near the region occupied by $\rho_K$. In order to be effective, however, this requires that $\rho_N$ be well-localized about the basis phase space centers, which in turn requires that the basis $|\phi_i\rangle$ be orthogonal. The ideal basis should therefore be both localized and orthogonal, precisely the two defining properties of weylets [35-37], as will be discussed later.

First though, we find it useful to continue our discussion of PSGs. The set of all PSGs, a.k.a. coherent states, comprises an infinitely overcomplete family of vectors in the Hilbert space that nevertheless satisfies a certain resolution of the identity [38]. The completeness aspect reinforces the idea of using PSGs for the representation of arbitrary quantum states; however, overcompleteness leads to a new drawback, i.e., linear dependence, quite distinct from the issue of nonorthogonality and collective nonlocality. To remedy this, one work [39] suggests the use of subsets of PSGs (in the 1 DOF case) with centers lying on a line or circle in phase space. The goal is to limit the basis expansion by choosing from a pool of functions grouped together by some simple criterion. Other ideas, within the confines of single dimensional manifolds, have ranged from the random placement of PSG centers forming an irregular polygon [40], to their selective placement along the classical energy surface $H = E$ of the system in question, in order to get efficient representations of wave functions at energies close to E [41].

On the other hand, a discrete rectilinear lattice of PSGs [41] would be easier to handle than the constructs mentioned above, particularly for multi-DOF applications. The most popular lattice arrangement is the von Neumann lattice [42], where there is one PSG per rectangular (or hypercubical) Planck cell. This density is critical with respect to providing completeness without linear dependence, and is denoted d = 1. Use of the von Neumann lattice has spread to the fields of condensed matter [43,44], quantum optics [45-47], and molecular physics [41] regarding the representation of arbitrary quantum states. It has also gained notice outside of quantum mechanics, in the realm of communication theory involving signal decomposition and transmission [48]. The completeness of these functions has been well-established [49-51], and minimal expansions using truncated sets of lattice functions have been shown to be robust [45] and extremely efficient [46] in representing harmonic oscillator and squeezed states. These findings support the use of von Neumann lattice functions as a basis for the calculation of energy eigenvalues and wave functions. Unfortunately, Davis and Heller [41] found that the convergence of the eigenvalues was slow with respect to the number of basis functions for arbitrary systems not resembling the abovementioned harmonic oscillator or squeezed state (which are special cases), in effect because of the collective nonlocality problem discussed earlier. They showed that improved performance could be achieved by increasing the lattice density, i.e., d > 1, but this introduces near linear dependencies into the basis, and further delocalizes $\rho_N$.
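The onset of near linear dependence at d > 1 can be seen numerically from the overlap matrix alone. The following Python sketch is an illustration under stated assumptions, not a calculation from the dissertation: it takes $\hbar = 1$, unit-width PSGs of the form $\pi^{-1/4}\exp[-(x-q)^2/2 + ip(x-q)]$, and a small square 5 by 5 patch of lattice points; the function name `psg_overlap_matrix` is hypothetical. The overlap formula follows from a direct Gaussian integral for this particular phase convention.

```python
import numpy as np

def psg_overlap_matrix(n_side, density):
    """Overlap (Gram) matrix of n_side x n_side phase space Gaussians on a
    square lattice holding `density` functions per Planck cell (hbar = 1)."""
    spacing = np.sqrt(2.0 * np.pi / density)      # cell area = 2*pi*hbar / d
    grid = spacing * (np.arange(n_side) - 0.5 * (n_side - 1))
    q, p = (g.ravel() for g in np.meshgrid(grid, grid))
    dq = q[:, None] - q[None, :]
    dp = p[:, None] - p[None, :]
    psum = p[:, None] + p[None, :]
    # <q1,p1|q2,p2> for psi(x) = pi^(-1/4) exp(-(x-q)^2/2 + i p (x-q)).
    return np.exp(-0.25 * (dq**2 + dp**2) + 0.5j * psum * dq)

for d in (1.0, 2.0, 4.0):
    S = psg_overlap_matrix(5, d)
    print(f"d = {d}: condition number of S = {np.linalg.cond(S):.3e}")
```

A growing condition number means that the inverse overlap matrix entering Eq. (1.2) acquires large entries, which is one way to view both the numerical instability and the collective delocalization of a dense PSG set.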

Most importantly, however, d > 1 implies exponential scaling of basis set efficiency with respect to the number of DOFs. Poirier [35,36] applied the Löwdin canonical orthogonalization procedure [52] to the set of single DOF (d = 1) von Neumann lattice PSGs in the hopes of improving efficiency, by introducing orthogonality into the basis. Unfortunately, the resultant orthogonalized basis functions are no longer individually well-localized, even when the localization is optimized. In fact, Balian [53] and Low [54] proved that no critically dense lattice of states supported by the Weyl-Heisenberg group can satisfy the properties of orthogonality and good phase space localization simultaneously. Wilson [55] developed a simple trick for eluding the Balian-Low "no go" theorem. By using bimodal basis functions (in the single DOF case), consisting of both positive and negative momentum components in a symmetric fashion, and simultaneously working on a doubly dense lattice, d = 2, he was able to construct a complete, orthonormal lattice representation for which all basis functions decay exponentially in phase space. Daubechies et al. [56] applied Wilson's idea to a set of tight frame functions, each composed of an expansion of doubly dense PSGs. In this dissertation, we label these types of functions as Weyl-Heisenberg wavelets, or "weylets": "Weyl-Heisenberg" because they are transformed into each other via the operators of the Weyl-Heisenberg group, and "wavelet" because, in the 1 DOF case, there are two parameters or quantum numbers (signifying the center of the weylet on 2-dimensional phase space) needed to label each function in the set. Using the ideas of Wilson and Daubechies, Poirier [35-37] then derived the optimally localized weylet basis in 1 DOF, as well as an efficient numerical scheme for their construction, rendering them computationally practical for multi-DOF bound state calculations.
The Poirier weylets are extremely well-suited to the phase space truncation scheme [35-37] mentioned previously. The weylets are easily extended to $f$ DOFs, where, approximately, one can think of the weylet $\rho_N$ as a group of $2f$-dimensional blocks (or Planck cells) that are not overlapping and are concentric with the N individual weylets. This uniform region, $R'$ [with a volume of $N(2\pi\hbar)^f$], becomes a more accurate picture of $\rho_N$ as N increases, following the same principle as that described in Ref. [34]. In practice, the truncation scheme involves keeping those N blocks whose centers on phase space have classical energies less than some chosen parameter $E_{\rm cut}$, which is usually chosen to be slightly larger than, if not the same as, the energy $E_{\max}$, the maximum cutoff of the K wave functions of interest. Wasted space/inefficiency (N > K) is only manifested by those lattice blocks on the periphery which only partially overlap $R$. Overall, the efficiency does not exponentially decay as DOFs increase, unlike DPB methods, and approaches perfection ($K/N \to 1$) in the large K and N limit, since $R$ and $R'$ begin to resemble each other on phase space.

Phase space truncation and maximal phase space localization are what distinguish weylets from the very popular DPBs, which warrants further discussion. Like DPBs, the multi-DOF weylets are products of single DOF functions in each of the coordinates, i.e.,

$$\phi_{\mathbf{i}}(\mathbf{q}) = \prod_{j=1}^{f} \phi_{i_j}(q_j) \tag{1.3}$$

where $\mathbf{i} = (i_1, i_2, \ldots, i_f)$ and $\mathbf{q} = (q_1, q_2, \ldots, q_f)$ (although in general, DPBs do not necessarily use identical functions for each DOF). For the DPB case, because basis truncation is applied to each DOF independently, the corresponding $R'$ (approximate WW region on phase space representative of the DPB set) adopts a cylindrical shape, i.e.,

$$R' = R'^{(1)} \times R'^{(2)} \times \cdots \times R'^{(f)} \tag{1.4}$$

where each $R'^{(j)}$ corresponds to a 2-dimensional phase space region representing a set of $N_j$ single-DOF basis functions, i.e., $N = \prod_{j=1}^{f} N_j$. If one were to attempt to mold $R'$ to resemble the $\rho_K$ region $R$, which is not cylindrical, then there would be significant wasted space in the "corners," corresponding to extra basis functions in $\rho_N$. The result is exponential scaling of K/N with $f$ [24,29], even if one optimally determines the individual $R'^{(j)}$ to produce a product region $R'$ that most efficiently covers $R$ [29]. On the other hand, the phase space truncation scheme [35-37] precisely removes those problematic regions.
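The wasted "corners" of a direct product region can be made concrete with a simple counting exercise. The sketch below is an idealized illustration, not a result from the dissertation: it assumes an isotropic $f$-DOF harmonic oscillator represented in a DPB of its own single-DOF eigenfunctions, with `n_max` functions kept per DOF, and it compares the DPB size $N$ with the number $K$ of eigenstates whose total quantum number is at most `n_max - 1` (the states lying below the corresponding energy cutoff). All function names are hypothetical.

```python
from itertools import product
from math import comb

def states_below_cutoff(n_max, f):
    """Number of f-DOF oscillator states (n_1, ..., n_f) whose total quantum
    number n_1 + ... + n_f is at most n_max - 1 (a correlated truncation)."""
    return comb(n_max - 1 + f, f)

def direct_product_size(n_max, f):
    """Size of the direct product basis that keeps n_max functions per DOF."""
    return n_max ** f

def efficiency(n_max, f):
    """Ratio K/N of targeted states to direct product basis functions."""
    return states_below_cutoff(n_max, f) / direct_product_size(n_max, f)

# Brute-force check of the closed-form count for a small case.
assert states_below_cutoff(4, 3) == sum(
    1 for n in product(range(4), repeat=3) if sum(n) <= 3
)

for f in (1, 2, 3, 6):
    print(f, states_below_cutoff(10, f), direct_product_size(10, f), efficiency(10, f))
```

Even in this best case, where each single-DOF basis is ideal, the ratio K/N falls off rapidly with $f$ because the energy cutoff couples the DOFs while the direct product truncation does not. A truncation that is correlated across DOFs, such as the phase space truncation of weylets, avoids carrying the corner functions.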

Despite the poor efficiency at many DOFs, DPBs are still very popular because they are convenient, and in some cases necessary, e.g., for the discrete variable representation method (DVR) [57-62]. The DVR method uses a configuration space grid representation, which has the tremendous advantage that potential matrix elements can be computed without the need for costly integrations. For DPBs that give sparse matrices $\mathbf{H}$, iterative eigensolvers such as the Lanczos method can be used, although, as pointed out by Dawes and Carrington [63], nobody has been able to advance beyond four-atom molecules using DPB iterative methods, which have been an ongoing area of study for over ten years [64].

The "build and prune" approach [63], an idea that has been around for over 20 years [65-67], represents a substantial improvement to DPBs, vis-à-vis advancing theoretical spectroscopic analysis beyond DPB limits. Functions within the DPB set that negligibly contribute to the target states are pruned away, resulting in a correlated truncation of DPB functions (i.e., truncations for individual DOFs are no longer independent of each other). The present weylet approach with phase space truncation is thus another build and prune method. However, unlike all other strategies, the weylet version is the only one currently known that defeats exponential scaling. Proving their worth, the weylet calculations have already been applied successfully to model systems up to 15 DOFs and beyond [37], which is a record using direct matrix diagonalization techniques.

1.2 Organization of Dissertation

There are three parts to the body of this dissertation, each of which is complete in that it contains all the explanations and background needed for it to stand as an independent work. The first part, i.e., Chapter II, describes the first application of the phase space truncated weylets to a real molecular system, the weakly bound neon dimer in its ground electronic state.
The majority of the chapter addresses the technical details needed to efficiently calculate the matrix representation of an arbitrary potential up to 3 DOFs in the weylet basis; the development of the kinetic matrix is comparatively minor because the matrix is sparse and the elements can be found via analytical expressions. The subsequent part (Chapter III) uses projection operators that reflect the system to customize the individual weylet functions themselves (as opposed to just their truncation), resulting in a new nonorthogonal basis representation with large improvements in accuracy and efficiency. The mathematical ideas for the implementation of the operators are inspired by works of Bracken, Doebner, and Wood [68,69]. As a preliminary work, the new method is applied only to model systems up to 4 DOFs, in the hope that, with future developmental efforts, it could be used for real systems. Chapter IV, the last part of this dissertation, presents a new iterative method for diagonalizing large sparse symmetric matrices. We test the approach by applying it to harmonic oscillator systems up to 6 DOFs represented in the truncated weylet basis. The method uses a subspace iteration scheme that is very suitable for parallelization (more so than Lanczos) if a simple preprocessing procedure is performed on the matrix beforehand. The ideas for the preprocessing come from single-particle density matrix purification schemes used in ground-state electronic-structure calculations [70,71]. In addition to addressing the computational performance and scalability of the method, this chapter also reports the tremendous efficiency of the truncated weylet basis at very large $N$.


CHAPTER II

ROVIBRATIONAL SPECTROSCOPY CALCULATIONS OF Ne$_2$

2.1 Introduction

Weakly bound molecular systems exhibiting large amplitude motion or floppy behavior have been of longstanding interest amongst theorists and experimentalists, despite technical challenges caused by low dissociation energies, large bond lengths, and substantial anharmonicities. For theoreticians interested in computing exact rovibrational spectra for such systems, these challenges manifest as extremely costly direct matrix diagonalizations [16]. Two basic strategies have been developed to deal with this situation: (1) optimize the representational basis set for the particular system, in effect directly reducing the matrix size, $N$; (2) apply iterative methods which, due to sparsity or other reasons, need not store the complete matrix. Both of these approaches are directly amenable to the most convenient and commonly used choice of basis representation, i.e., direct product basis sets (DPBs), where the basis functions are separable products in the coordinates [14,31]. In particular, the discrete variable representation (DVR) method [57-62] is a popular configuration space grid representation based on DPBs. The potential-optimized DVR methods [72,73], including the maximally efficient variety, the phase space optimized DVR [29,74], are, as implied by the name, examples of (1) above, whereas the sparsity of the multidimensional DVR matrix representation of the Hamiltonian implies that these are also ideally suited to (2) above [64,75,76]. Despite such advances, all DPB and associated DVR methods are still characterized by exponential scaling with respect to the number of degrees of freedom

Reproduced with permission of the American Institute of Physics from "Rovibrational spectroscopy calculations of neon dimer using a phase space truncated Weyl-Heisenberg wavelet basis" by R. Lombardini and B. Poirier, Journal of Chemical Physics, Vol. 124, p. 144107 (with minor alterations and additions). Copyright 2006 by the American Institute of Physics.


(DOFs) [24,29]. In other words,

$$N \approx e^{cf} K, \tag{2.1}$$

where $K$ is the number of eigenvalues computed to a given accuracy level, and $f$ is the number of DOFs. The positive exponent $c$ can be minimized via DPB optimization, but never reduced to 0. Eliminating exponential scaling requires a non-DPB method. In a recent series of papers [35-37], one of the authors has introduced a promising new non-DPB method based on symmetrized orthogonal Weyl-Heisenberg wavelets, or weylets. Though offering many of the advantages of a DPB, when combined with a phase space truncation scheme (Sec. 2.2.1), the weylet representation can be shown not only to defeat exponential scaling, but also to approach perfect efficiency ($K/N \to 1$) in the large $N$ limit, regardless of dimensionality. It has already been used to extend direct matrix eigenvalue calculations for model systems to 15 DOFs and beyond [37]. Borrowing from the basis truncation scheme used here, another exact quantum method has recently been developed and applied to similar model systems [63]; however, it does not achieve perfect asymptotic efficiency. Neither of the two methods described above has previously been applied to real molecular systems, due primarily to difficulties in representing an arbitrary potential energy operator in the truncated basis representation. In this chapter, we present an efficient numerical method for achieving this important goal in the case of a weylet representation. Although formally the resultant weylet potential matrix is in general dense, in practice the optimized phase space locality of the weylet basis ensures that many matrix elements are essentially zero. Since the multidimensional weylet kinetic energy matrix is also formally sparse, this enables the use of sparse iterative matrix techniques [77,78], thus greatly increasing the matrix sizes $N$ that may be considered. This is especially important for higher dimensional calculations, for which it has been observed that $K$ increases much faster than linearly with $N$ [37]. For simplicity, however, only direct matrix diagonalization methods are employed here.


In this chapter, we apply the weylet method for the first time to a real molecular application, using the new potential matrix element evaluation technique (discussed in more detail below). In particular, the bound rovibrational energy levels of Ne$_2$ in its ground electronic state ($^1\Sigma_g^+$), the simplest neon cluster system, are computed in the full 3 DOF Cartesian coordinate representation. The dimensionality is not especially large; however, the goal is to demonstrate feasibility of the weylet approach, in anticipation of future, more challenging applications. The calculation is, at any rate, of non-trivial difficulty, owing to the weakly bound and long-range character of the interaction. Van der Waals complex systems such as rare gas clusters have gained much attention in recent years. Statistically, clusters (even with as few as seven atoms) [79] exhibit a coexistence of phases over a range of system temperatures and energies, serving as a prototype for bona fide bulk matter phase transitions, and also solvation. Dynamically, the long-range but small-magnitude van der Waals and dispersion forces involved result in very interesting behavior totally unlike that of traditional covalently bonded molecules, although they are now known to play an important ancillary role even for covalent systems. In particular, serving as the ultimate floppy/anharmonic molecules, clusters are not well described by the conventional equilibrium geometry/normal mode analysis, and thus require exact quantum treatment for their elucidation. This has motivated the development of accurate ab initio [80,81] and semi-empirical [82-87] potentials, and a number of experimental studies [82,88,89]. In particular, the Aziz potentials (semi-empirical) have had much success in reproducing macroscopic and microscopic properties close to experiment for dilute neon gas [85,86], as well as highly compressed neon solid [87]. From a dimensional scaling standpoint, neon clusters are also useful as a purely computational benchmark.
In particular, the near-pairwise nature of the interaction renders it quite convenient to expand the dimensionality simply by adding more neon atoms without the need to develop a full-blown potential energy surface (PES) beyond Ne$_2$, or perhaps Ne$_3$. However, as the main focus of the present work is to establish feasibility of the weylet method for real molecular systems, only the dimer will be considered here, using a simple Lennard-Jones (LJ) model. The resultant computed rovibrational energy levels may be directly compared with those of a previous Cartesian calculation [15]. Though not quite as accurate as the Aziz potential, the simple LJ model does provide semi-quantitative accuracy for rare gas cluster systems, as amply demonstrated by previous theoretical investigations [90-93]. The Cartesian coordinate aspect of the present study bears discussion. Despite the reduction in dimensionality obtained by separating vibrational and rotational motions, the Cartesian rovibrational approach offers many computational advantages [15], particularly with respect to dimensional scaling investigations relevant to clusters. However, the primary motivation for the present weylet study is simply that weylet basis sets have not yet been defined for non-Cartesian coordinates [35-37] (although previous work by Johnson and coworkers [94] strongly suggests that such a generalization can be achieved). Note that the weylet approach is not limited to cluster systems; indeed, in many respects, such systems constitute a worst-case scenario for a weylet representation, owing to concave phase space regions that favor a more traditional affine wavelet approach [94-96], and to shallow potential wells that present relatively small regions of available phase space. The new potential matrix element evaluation method is essentially a Gauss-Hermite quadrature scheme [16], exploiting the fact that each weylet basis function can be explicitly decomposed as a sum of Gaussians. Since the product of two Gaussians is also a Gaussian, one can use standard Gauss-Hermite quadrature techniques to evaluate the requisite potential energy integrations.
Since a potentially large number of operations is involved, especially for the multidimensional case, a number of tricks are introduced to reduce to a bare minimum the computational (CPU) effort required to set up the matrices. For simplicity, and because the dimensionality considered here is only three, one trick that is not implemented is the sequential summation and truncation idea described in Ref. [37].
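The Gaussian-product argument can be sketched numerically as follows. The helper names, the width parameter `a`, and the test potential $V(x) = x^2$ are illustrative assumptions; only `numpy.polynomial.hermite.hermgauss` is an existing library routine. The product of two normalized Gaussians centered at $q_1$ and $q_2$ is a single Gaussian centered at the midpoint, so the potential integral reduces to the weight $e^{-\xi^2}$ times a smooth function, which is the form Gauss-Hermite quadrature handles.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

def gaussian(x, q, a=1.0):
    """Normalized Gaussian (a^2/pi)^(1/4) exp(-a^2 (x - q)^2 / 2)."""
    return (a * a / np.pi) ** 0.25 * np.exp(-0.5 * a * a * (x - q) ** 2)

def potential_element_gh(potential, q1, q2, a=1.0, n_quad=20):
    """<g_q1|V|g_q2> by Gauss-Hermite quadrature in xi = a * (x - midpoint)."""
    nodes, weights = hermgauss(n_quad)
    mid = 0.5 * (q1 + q2)
    # Prefactor left over after merging the two Gaussians into one.
    prefactor = np.exp(-0.25 * a * a * (q1 - q2) ** 2) / np.sqrt(np.pi)
    return prefactor * np.sum(weights * potential(mid + nodes / a))

def potential_element_grid(potential, q1, q2, a=1.0):
    """Same integral by brute-force summation on a fine uniform grid."""
    x = np.linspace(-30.0, 30.0, 60001)
    dx = x[1] - x[0]
    return np.sum(gaussian(x, q1, a) * potential(x) * gaussian(x, q2, a)) * dx
```

For a polynomial potential the quadrature is exact once `n_quad` exceeds half the polynomial degree, which is why so few points suffice in this toy case.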


The remainder of this chapter is organized as follows. Section 2.2 briefly summarizes the development of the phase space truncated weylet approach, as documented more thoroughly in Refs. [36] and [37]. Section 2.3 discusses in detail the application of the method to both a 1 DOF radial implementation and a full 3 DOF Cartesian version for dimer systems. The majority of the description involves the explicit recipe used to generate Hamiltonian matrices in the weylet representation. Section 2.4 presents results, followed by discussion.

2.2 Theoretical Background

2.2.1 Phase Space Truncation

The starting point of the phase space truncation scheme is the uniformly mixed ensemble, the projection operator $\hat{\rho}_K$ spanned by the $K$ lowest energy eigenstates, $|\psi_i\rangle$, of the system Hamiltonian, i.e.,

$$\hat{\rho}_K = \sum_{i=1}^{K} |\psi_i\rangle\langle\psi_i| . \tag{2.2}$$

It can be shown [29,34] that the Wigner-Weyl phase space representation [32,33,97,98] of $\hat{\rho}_K$ is approximately given by a Heaviside step function,

$$\rho_K(q_1, \ldots, q_f; p_1, \ldots, p_f) \approx \Theta\left[E_{\max} - H(q_1, \ldots, q_f; p_1, \ldots, p_f)\right] \tag{2.3}$$

(where $H$ is the classical Hamiltonian), with the accuracy increasing in the large $K$ limit [34]. The parameter $E_{\max}$ is chosen such that the associated phase space region, $R$ (defined by $\rho_K \neq 0$), has volume $K(2\pi)^f$ in $\hbar = 1$ units (as will be presumed throughout this chapter). For the present purpose, the $|\psi_i\rangle$ are taken to be the set of all rovibrational bound states of Ne$_2$, i.e., $E_{\max} = 0$ is the dissociation threshold [Fig. 2.5]. The basis set chosen to represent the system Hamiltonian for a given calculation also corresponds to its own projection operator and associated phase space region, $R'$, with volume $N(2\pi)^f$. The challenge is to modify the representational basis such that the region $R'$ is as small as possible, but still completely encloses the desired region $R$.
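As a quick illustration of the volume rule, the sketch below estimates $K$ for a 1 DOF harmonic oscillator, $H = (p^2 + \omega^2 q^2)/2$, by measuring the phase space area of the region $H \le E_{\max}$ on a grid and dividing by $2\pi$. The oscillator model, the grid settings, and the function names are assumptions chosen for illustration; the Ne$_2$ calculation itself is not reproduced here.

```python
import numpy as np

def estimate_state_count(hamiltonian, e_max, q_lim, p_lim, n_grid=2001):
    """Phase space area of {H <= e_max} divided by 2*pi (hbar = 1 units)."""
    q = np.linspace(-q_lim, q_lim, n_grid)
    p = np.linspace(-p_lim, p_lim, n_grid)
    cell = (q[1] - q[0]) * (p[1] - p[0])
    qq, pp = np.meshgrid(q, p, indexing="ij")
    inside = hamiltonian(qq, pp) <= e_max
    return inside.sum() * cell / (2.0 * np.pi)

def harmonic(q, p, omega=1.0):
    """Classical 1 DOF harmonic oscillator Hamiltonian with unit mass."""
    return 0.5 * (p * p + omega * omega * q * q)

if __name__ == "__main__":
    print(estimate_state_count(harmonic, e_max=10.0, q_lim=6.0, p_lim=6.0))
```

For this model the enclosed area is $2\pi E_{\max}/\omega$, so the estimate approaches $K = E_{\max}/\omega$, in agreement with the number of exact levels $(n + 1/2)\omega$ lying below $E_{\max}$.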

One way to achieve this is to discard individual basis functions whose contributions to $R'$ do not overlap $R$. This is the essence of the phase space truncation scheme. In this capacity, the Weyl-Heisenberg wavelets of Wilson and Daubechies [55,56], as modified by one of the authors [35-37], constitute an ideal choice of representational basis. In essence, each wavelet basis function corresponds to a single $2f$-dimensional phase space cube of volume $(2\pi)^f$; collectively, these cubes comprise a rectilinear lattice. The underlying weylet basis set is the same for all applications; however, the resultant phase space truncation is system-dependent, as it is determined by $R$. In effect, one attempts to sculpt the region $R$ out of the cubical blocks with which it intersects [Figs. 2.1 and 2.5]. Only those blocks on the periphery, which overlap only partially with $R$, lead to wasted space/inefficiency, i.e., $N > K$. In practice, the above picture is complicated somewhat by technical details. For instance, the parameter $E_{\rm cut} > E_{\max}$ is used to effect the truncation, rather than $E_{\max}$ itself, to incorporate tunneling. Moreover, the region overlap condition might be difficult to apply in practice, and is therefore replaced with the simpler truncation criterion $H_{\rm mid} < E_{\rm cut}$, where $H_{\rm mid}$ is the value of the classical Hamiltonian at the center of the weylet cube.

2.2.2 Weylet Basis Set

The weylet basis derives from Weyl-Heisenberg coherent states (or phase space Gaussians), which in 1 DOF are given by

$$g_{qp}(x) = e^{-iqp/2}\, e^{ipx}\, g(x - q). \tag{2.4}$$

The values $(q, p)$ denote the phase space center of the coherent state $g_{qp}$, with the origin-centered $g_{00} = g$ representing the single fiducial or mother state, from which all others are generated via phase space translation. Of particular interest is the subset of coherent states that form a lattice on phase space,

$$g^{(d)}_{mn}(x) = e^{-iq_m p_n/2}\, e^{ip_n x}\, g_{00}(x - q_m), \tag{2.5}$$

where $q_m = (m/a)\sqrt{2\pi/d}$ and $p_n = na\sqrt{2\pi/d}$ comprise the lattice sites, and the indices $m$ and $n$ are for the moment taken to be integers. The unit-dependent quantity $a$ is related to the aspect ratio of the lattice, and $d$ is the density of phase space lattice sites measured in units of $(2\pi)^{-1}$. The value $d = 1$ denotes critical density, as in Sec. 2.2.1. At critical density, one can construct Eq. (2.5) lattices that comprise complete orthonormal basis sets [41,42,55,56]. From among these, the optimally phase space localized basis, denoted $f^{(1)}_{mn}(x)$, has been identified [36]. Good phase space localization of the weylet basis is essential for efficient truncation. Unfortunately, even $f^{(1)}_{mn}(x)$ is insufficiently localized, due to the Balian-Low no-go theorem [53,54]. A solution to this dilemma has, however, been discovered by Wilson [55,56], involving the doubly-dense ($d = 2$) tight frame lattices, whose optimally phase space localized representative $f^{(2)}_{mn}$ has also been identified [36], and found to decay exponentially quickly outside of the corresponding square phase space region. The $f^{(2)}_{mn}$ functions themselves are overcomplete, but a particular momentum-symmetrized linear combination of the $f^{(2)}_{m(\pm n)}$ pair yields a complete orthonormal basis with exponential decay. The Wilson construction requires integer values of $n$, and an awkward special procedure for the $n = 0$ case [Fig. 2.1(b)]. In addition, the construction of a suitable doubly-dense tight frame lattice is numerically unstable, despite improvements by Daubechies [56]. Recent developments by one of the authors [35,36] alleviate both of these major difficulties. In particular, the $n = 0$ problem is addressed by shifting the lattice by one-half unit in the momentum direction, so that $n$ is now restricted to half-integer values. A Wilson-type construction is then applied, resulting in a complete orthonormal basis for which all basis functions are treated on an equal footing [Fig. 2.1(c)]. The second problem has been resolved via a new algorithm [36] that has been used to obtain an extremely accurate Gaussian expansion of $f^{(2)}_{00}$, i.e.,

$$f^{(2)}_{00}(x) = \sum_{m,n=-m_{\max}}^{m_{\max}} c_{mn}\, g^{(2)}_{mn}(x), \tag{2.6}$$

with

$$g^{(2)}_{00}(x) = g(x) = \left(\frac{a^2}{\pi}\right)^{1/4} e^{-(a^2 x^2)/2}. \tag{2.7}$$
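The lattice of Eq. (2.5) can be sketched numerically as below, assuming the phase convention $e^{-iq_m p_n/2}$ and the Gaussian of Eq. (2.7); the function names and grid settings are illustrative choices. The check confirms that lattice neighbours overlap with modulus $\exp[-\pi(m^2 + n^2)/(2d)]$, which follows directly from the Gaussian form, so members of the doubly-dense lattice ($d = 2$) overlap more strongly than those of the critical lattice ($d = 1$).

```python
import numpy as np

def lattice_state(x, m, n, a=1.0, d=2.0):
    """Coherent state g^(d)_mn(x) centered at the lattice site (q_m, p_n)."""
    step = np.sqrt(2.0 * np.pi / d)
    q_m, p_n = (m / a) * step, n * a * step
    envelope = (a * a / np.pi) ** 0.25 * np.exp(-0.5 * a * a * (x - q_m) ** 2)
    return np.exp(-0.5j * q_m * p_n) * np.exp(1j * p_n * x) * envelope

def overlap_modulus(m, n, a=1.0, d=2.0):
    """|<g_00|g_mn>| by summation on a fine uniform grid."""
    x = np.linspace(-40.0, 40.0, 80001)
    dx = x[1] - x[0]
    integrand = np.conj(lattice_state(x, 0, 0, a, d)) * lattice_state(x, m, n, a, d)
    return abs(np.sum(integrand) * dx)

def overlap_modulus_exact(m, n, d=2.0):
    """Closed form for the same quantity; independent of the aspect ratio a."""
    return np.exp(-np.pi * (m * m + n * n) / (2.0 * d))
```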

Several comments on Eq. (2.6) are in order. First, the meaning of the indices $(m, n)$ has now changed; from here on out these are regarded as summation indices. Second, the $(m, n)$ values are integers, and in fact, both must be even integers (the odd-valued coefficients all vanish [36]). Third, although the summation is in principle infinite over all even $m$ and $n$, in practice the expansion coefficients $c_{mn}$ (presented in Refs. [35] and [36]) decay exponentially in $|m| + |n|$, due to exponential phase space localization. Consequently, in practice, only the truncated summation given in Eq. (2.6) need be considered. Finally, it should be noted that the $c_{mn}$, and $f^{(2)}_{00}(x)$ itself, are real-valued and symmetric, even for finite expansions [35-37].

2.3 Matrix Representations and Numerical Implementation

2.3.1 Momentum-Symmetrized Gaussian Expansion

Note that the unsymmetrized weylet fiducial state, $f(x) = f^{(2)}_{00}(x)$, as determined from Eq. (2.6) above, does not itself belong to the phase space lattice, because the present scheme requires a half-integer-valued momentum index. However, we can shift both the fiducial weylet itself, and the underlying lattice of expansion Gaussians, using Eqs. (2.4) and (2.5), and the analogs for $f$. This gives rise to the doubly-dense unsymmetrized weylet lattice,

$$f^{(2)}_{st}(x) = e^{-i\pi st/2}\, e^{ita\sqrt{\pi}x}\, f(x - s\sqrt{\pi}/a), \tag{2.8}$$

for which individual lattice functions are now labeled with new indices, $(s, t)$. Clearly, $t$ must be a half-integer. In contrast, $s \in \{\ldots, \Delta - 1, \Delta, \Delta + 1, \ldots\}$, where $\Delta$ can be any real value. Note that the underlying lattice of expansion Gaussians is coincident with the weylet lattice, i.e., the Gaussian centers are located at the same locations in phase space as the weylet centers. Although the allowed index values are thus the same for the weylet and Gaussian lattice functions, it is convenient to use a different set of indices, $(u, v)$, to label the latter. This is particularly useful for expansion summations, in which context $u = s + m$ and $v = t + n$. In practice, the symmetrized weylet basis is most relevant, given explicitly as [37]

$$\phi_{st}(x) = \sqrt{2}\, \sin\left[\sqrt{\pi}\,ta\,x - \pi t\left(s + \tfrac{1}{2}\right)\right] f(x - s\sqrt{\pi}/a). \tag{2.9}$$

Note from Eq. (2.9) that $\phi_{st}(x)$ is manifestly real-valued, although, unlike $f^{(2)}_{st}(x)$, asymmetric about $x = (s/a)\sqrt{\pi}$. Plots of several $\phi_{st}(x)$ are presented in Fig. 2.2. Although in principle one can obtain $\phi_{st}(x)$ by first summing Gaussians [to obtain $f^{(2)}_{st}(x)$] and then symmetrizing, it is in practice more convenient to perform these operations in the opposite order, as the intermediate functions (and matrix elements) are then real-valued, and there are far fewer of them (half as many in 1 DOF). We thus introduce the momentum-symmetrized Gaussian (SG),

$$\chi_{uv}(x) = \sqrt{2}\, \sin\left[\sqrt{\pi}\,va\,x - \pi v\left(u + \tfrac{1}{2}\right)\right] g(x - u\sqrt{\pi}/a) \tag{2.10}$$

(also referred to as a modulated Gaussian), in terms of which the symmetrized weylets are obtained directly from the following summation:

$$\phi_{st}(x) = \sum_{m,n=-m_{\max}}^{m_{\max}} (-1)^{\left(\frac{n}{2} + mt\right)} c_{mn}\, \chi_{uv}(x). \tag{2.11}$$
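The equivalence of the two routes (sum Gaussians then symmetrize, or symmetrize then sum) can be checked with a short sketch. The coefficients used here are hypothetical placeholders that merely share the stated properties (even indices, real, symmetric in $n$, exponentially decaying); they are not the published $c_{mn}$ of Refs. [35] and [36]. Function names and the symbols $\phi_{st}$, $\chi_{uv}$ for the weylet and SG functions are likewise notational assumptions.

```python
import numpy as np

SQRT_PI = np.sqrt(np.pi)

def gaussian(x, a):
    """Fiducial Gaussian of Eq. (2.7)."""
    return (a * a / np.pi) ** 0.25 * np.exp(-0.5 * a * a * x * x)

def sg(x, u, v, a):
    """Momentum-symmetrized Gaussian chi_uv(x), Eq. (2.10)."""
    phase = SQRT_PI * v * a * x - np.pi * v * (u + 0.5)
    return np.sqrt(2.0) * np.sin(phase) * gaussian(x - u * SQRT_PI / a, a)

def placeholder_coeffs(m_max):
    """Hypothetical c_mn on even (m, n); NOT the published coefficients."""
    evens = range(-m_max, m_max + 1, 2)
    return {(m, n): np.exp(-0.8 * (abs(m) + abs(n))) for m in evens for n in evens}

def weylet_from_sgs(x, s, t, a, coeffs):
    """Eq. (2.11): signed sum of SGs with u = s + m and v = t + n."""
    total = np.zeros_like(x)
    for (m, n), c in coeffs.items():
        sign = (-1.0) ** int(round(n / 2 + m * t))
        total += sign * c * sg(x, s + m, t + n, a)
    return total

def weylet_direct(x, s, t, a, coeffs):
    """Eq. (2.9), with f expanded in Gaussians as in Eq. (2.6)."""
    y = x - s * SQRT_PI / a
    # For c_mn symmetric in n, the complex phases pair up into cosines.
    f = sum(c * np.cos(n * a * SQRT_PI * y) * gaussian(y - m * SQRT_PI / a, a)
            for (m, n), c in coeffs.items())
    phase = SQRT_PI * t * a * x - np.pi * t * (s + 0.5)
    return np.sqrt(2.0) * np.sin(phase) * f
```

The agreement holds for any real offset in $s$, provided $m$ and $n$ are even and $t$ is a half-integer, which is the condition that makes the exponent of $-1$ an integer.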

Note that Eq. (2.11) (and all subsequent equations) assumes that $m$ and $n$ are even integers, and $t$ is a half-integer (always positive for momentum-symmetrized functions).

2.3.2 Matrix Representation: 1 DOF Case (Radial Ne$_2$)

We first consider a simple 1 DOF calculation, i.e., the radial Ne$_2$ problem, in anticipation of the extension to the higher dimensional case. The van der Waals interaction of Ne$_2$ is modeled by the LJ potential,

$$V(r) = 4\epsilon\left[\left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6}\right], \tag{2.12}$$

where $\epsilon = 24.743267$ cm$^{-1}$ is the well depth, $r$ is the separation between the two atoms, and $\sigma = 5.1950000$ a.u. yields an equilibrium separation of $r_e = 2^{1/6}\sigma = 5.8311903$ a.u. (Fig. 2.3). The only coordinate, $r > 0$, is radial rather than Cartesian. Although the present weylet formulation applies only to Cartesian systems, we convert the radial Ne$_2$ problem into a Cartesian one by extending $r$ to $-\infty$, making $V(r)$ a symmetric function, and modifying it in the vicinity of $r = 0$ to obviate the singularity. Physically, negative values of $r$ can be thought of as the two neon atoms having exchanged places relative to the situation with positive $r$. Obviously, the physics is identical for $r$ on either side of 0, and one can think of the extended $V(r)$ as a symmetric double-well potential. The singularity is replaced with a symmetric Gaussian cap in the region $|r| < r_{\rm cut}$. When the centrifugal contribution is also included, the resultant effective potential becomes

$$V_{\rm eff}(r) = \begin{cases} 4\epsilon\left[\left(\dfrac{\sigma}{r}\right)^{12} - \left(\dfrac{\sigma}{r}\right)^{6}\right] + \dfrac{J(J+1)}{2\mu r^2}, & |r| \geq r_{\rm cut}, \\[2ex] \alpha_J\, e^{-\beta_J r^2} + \gamma_J, & |r| < r_{\rm cut}, \end{cases} \tag{2.13}$$

where $J$ is the rotational quantum number, and $\mu = m_{\rm Ne}/2$ is the reduced mass ($m_{\rm Ne} = 20.180000$ a.u. is taken as the atomic mass of Ne). The parameters $\alpha_J$, $\beta_J$, and $\gamma_J$ are chosen so that $V_{\rm eff}(r)$ and its first two derivatives are continuous at $r = r_{\rm cut}$ [Fig. 2.4]. The only adjustable parameter is therefore $r_{\rm cut}$, which directly affects the height of the cap's peak (the smaller $r_{\rm cut}$, the higher the peak). As the peak height increases, the eigenvalues converge to the correct energy values, so that the artificial cap has an insignificant effect on the computed bound states. The parameter $r_{\rm cut}$ needs to be close enough to 0 that enough of the repulsive region of the LJ potential remains to considerably damp the amplitudes of the bound state wave functions, leaving their penetration into the Gaussian portion negligible. If it is too close, then the potential matrix elements become infinite. Fortunately, a balance can be achieved, as demonstrated in Sec. 2.4. The cap parameters are listed in Table 2.1 for the resultant $r_{\rm cut} = 4.6$ a.u. value. The 1 DOF Hamiltonian, $\hat{H}$, with the Cartesian modification described above, takes on the usual kinetic-plus-potential form, $\hat{T} + \hat{V}_{\rm eff}$. In the symmetrized weylet representation, therefore, the matrix elements become

$$\langle\phi_{st}|\hat{H}|\phi_{s't'}\rangle = \langle\phi_{st}|\hat{T}|\phi_{s't'}\rangle + \langle\phi_{st}|\hat{V}_{\rm eff}|\phi_{s't'}\rangle \quad\text{or}\quad H_{s,t,s',t'} = T_{s,t,s',t'} + \left[V_{\rm eff}\right]_{s,t,s',t'}. \tag{2.14}$$
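The matching conditions that fix the Gaussian cap can be sketched as follows, assuming the cap has the form $\alpha e^{-\beta r^2} + \gamma$ and that value, slope, and curvature are matched at $r_{\rm cut}$. The function names and the closed-form solution order (first $\beta$ from the ratio of second to first derivative, then $\alpha$, then $\gamma$) are choices made here for illustration; the values in Table 2.1 are not reproduced. Units are whatever the caller supplies consistently; for $J > 0$ the reduced mass `mu` must be given in units compatible with `eps` and `sigma`.

```python
import numpy as np

def lj_with_derivatives(r, eps, sigma, j=0, mu=1.0):
    """LJ plus centrifugal term, with first and second derivatives (r > 0)."""
    s6 = (sigma / r) ** 6
    jj = j * (j + 1)
    v = 4.0 * eps * (s6 * s6 - s6) + jj / (2.0 * mu * r ** 2)
    dv = 4.0 * eps * (-12.0 * s6 * s6 + 6.0 * s6) / r - jj / (mu * r ** 3)
    d2v = 4.0 * eps * (156.0 * s6 * s6 - 42.0 * s6) / r ** 2 + 3.0 * jj / (mu * r ** 4)
    return v, dv, d2v

def cap_parameters(r_cut, eps, sigma, j=0, mu=1.0):
    """Solve for (alpha, beta, gamma) in alpha * exp(-beta r^2) + gamma."""
    v, dv, d2v = lj_with_derivatives(r_cut, eps, sigma, j, mu)
    # cap''/cap' = (1 - 2 beta r^2) / r fixes beta independently of alpha.
    beta = (1.0 - r_cut * d2v / dv) / (2.0 * r_cut ** 2)
    alpha = -dv / (2.0 * beta * r_cut * np.exp(-beta * r_cut ** 2))
    gamma = v - alpha * np.exp(-beta * r_cut ** 2)
    return alpha, beta, gamma

def v_eff(r, r_cut, eps, sigma, j=0, mu=1.0):
    """Symmetric effective potential of the Eq. (2.13) form on the real line."""
    r = np.abs(np.asarray(r, dtype=float))
    alpha, beta, gamma = cap_parameters(r_cut, eps, sigma, j, mu)
    outer = lj_with_derivatives(np.maximum(r, r_cut), eps, sigma, j, mu)[0]
    return np.where(r >= r_cut, outer, alpha * np.exp(-beta * r ** 2) + gamma)
```

With the $J = 0$ parameters quoted in the text ($\epsilon$ in cm$^{-1}$, lengths in a.u.), the call would be `cap_parameters(4.6, 24.743267, 5.195)`.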

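Using Eq. (2.11), the weylet matrix elements for any observable $\hat{A}$ can be obtained from the corresponding SG matrix elements, $\tilde{A}_{u,v,u',v'} = \langle\chi_{uv}|\hat{A}|\chi_{u'v'}\rangle$ (with $u' = s' + m'$ and $v' = t' + n'$), as follows:

$$A_{s,t,s',t'} = \sum_{m,n=-m_{\max}}^{m_{\max}}\ \sum_{m',n'=-m_{\max}}^{m_{\max}} (-1)^{\left(\frac{n}{2} + mt + \frac{n'}{2} + m't'\right)} c_{mn}\, c_{m'n'}\, \tilde{A}_{u,v,u',v'}. \tag{2.15}$$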

For the kinetic energy contribution, the SG matrix elements are analytical:

$$\tilde{T}_{u,v,u',v'} = \frac{a^2}{4\mu}\left[h(u,v,u',v') - h(u,v,u',-v')\right], \tag{2.16}$$

with

$$h(u,v,u',v') = e^{-\frac{\pi}{4}\left(v_-^2 + u_-^2\right)} \left\{\left[1 + \frac{\pi}{2}\left(v_+^2 - u_-^2\right)\right]\cos\theta - \pi u_- v_+ \sin\theta\right\}. \tag{2.17}$$

The $-$ subscript indicates the difference between the bra and ket indices, e.g., $u_- = u - u'$. The $+$ subscript denotes the addition of indices, $v_+ = v + v'$. Note that under a change of sign of $v'$ [i.e., for the last term of Eq. (2.16)], $v_-$ becomes $v_+$, and vice versa. The phase quantity $\theta$ is given by

$$\theta(u,v,u',v') = \frac{\pi}{2}\left(u_- v_+ + v_-\right). \tag{2.18}$$
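A hedged sketch of the analytic kinetic energy elements follows, written for the sign convention in which the SG phase is $\sqrt{\pi}vax - \pi v(u + 1/2)$ and $\theta = (\pi/2)(u_- v_+ + v_-)$; under that assumption the $\sin\theta$ term carries a minus sign. The function names are illustrative. A brute-force grid integral of the product of first derivatives is included as an independent check, since $\langle\chi|-\partial_x^2|\chi'\rangle = \int \chi_x \chi'_x\,dx$ for these decaying functions.

```python
import numpy as np

SQRT_PI = np.sqrt(np.pi)

def h_term(u, v, up, vp):
    """Eq. (2.17) with theta from Eq. (2.18); (up, vp) are the ket indices."""
    um, vm, vplus = u - up, v - vp, v + vp
    theta = 0.5 * np.pi * (um * vplus + vm)
    envelope = np.exp(-0.25 * np.pi * (vm * vm + um * um))
    real = 1.0 + 0.5 * np.pi * (vplus * vplus - um * um)
    return envelope * (real * np.cos(theta) - np.pi * um * vplus * np.sin(theta))

def kinetic_sg(u, v, up, vp, a=1.0, mu=1.0):
    """Analytic SG kinetic matrix element, Eq. (2.16)."""
    return a * a / (4.0 * mu) * (h_term(u, v, up, vp) - h_term(u, v, up, -vp))

def kinetic_sg_numeric(u, v, up, vp, a=1.0, mu=1.0):
    """Reference value: (1 / 2 mu) * integral of chi'_uv * chi'_u'v' on a grid."""
    x = np.linspace(-40.0, 40.0, 160001)
    dx = x[1] - x[0]

    def d_sg(uu, vv):
        # Analytic x-derivative of the SG function of Eq. (2.10).
        y = x - uu * SQRT_PI / a
        g = (a * a / np.pi) ** 0.25 * np.exp(-0.5 * a * a * y * y)
        phase = SQRT_PI * vv * a * x - np.pi * vv * (uu + 0.5)
        return np.sqrt(2.0) * g * (SQRT_PI * vv * a * np.cos(phase)
                                   - a * a * y * np.sin(phase))

    return np.sum(d_sg(u, v) * d_sg(up, vp)) * dx / (2.0 * mu)
```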

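Note that since $u_-$, $v_-$, and $v_+$ are always integer-valued, the trigonometric quantities in Eq. (2.17) are always $\pm 1$ or 0. The SG matrix elements for the potential energy can be written as

$$\left[\tilde{V}_{\rm eff}\right]_{u,v,u',v'} = \frac{1}{\sqrt{\pi}}\, e^{-\frac{\pi}{4}(u_-)^2} \left[b(u,v,u',v') - b(u,v,u',-v')\right], \tag{2.19}$$

where

$$b(u,v,u',v') = \int_{-\infty}^{\infty} e^{-\xi^2} \cos\left(\sqrt{\pi}\,v_-\,\xi - \theta\right) V_{\rm eff}\!\left(\frac{\xi + \frac{\sqrt{\pi}}{2}u_+}{a}\right) d\xi, \tag{2.20}$$

and $\xi = ar - (\sqrt{\pi}/2)u_+$.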
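The potential energy elements can be sketched in the same spirit, again assuming the phase convention $\cos(\sqrt{\pi}v_-\xi - \theta)$ with $\theta = (\pi/2)(u_- v_+ + v_-)$. The function names, the number of quadrature points, and the smooth test potentials used for checking are illustrative assumptions; the singular LJ potential with its cap is deliberately not used here. Only `numpy.polynomial.hermite.hermgauss` is a library routine.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

SQRT_PI = np.sqrt(np.pi)

def b_integral(potential, u, v, up, vp, a, nodes, weights):
    """Eq. (2.20) by Gauss-Hermite quadrature over xi."""
    um, vm, uplus, vplus = u - up, v - vp, u + up, v + vp
    theta = 0.5 * np.pi * (um * vplus + vm)
    r = (nodes + 0.5 * SQRT_PI * uplus) / a
    return np.sum(weights * np.cos(SQRT_PI * vm * nodes - theta) * potential(r))

def potential_sg(potential, u, v, up, vp, a=1.0, n_quad=60):
    """SG potential matrix element, Eq. (2.19)."""
    nodes, weights = hermgauss(n_quad)
    b_same = b_integral(potential, u, v, up, vp, a, nodes, weights)
    b_flip = b_integral(potential, u, v, up, -vp, a, nodes, weights)
    return np.exp(-0.25 * np.pi * (u - up) ** 2) / SQRT_PI * (b_same - b_flip)

def potential_sg_grid(potential, u, v, up, vp, a=1.0):
    """Reference value by direct summation of chi_uv * V * chi_u'v' on a grid."""
    x = np.linspace(-40.0, 40.0, 160001)
    dx = x[1] - x[0]

    def sg(uu, vv):
        y = x - uu * SQRT_PI / a
        g = (a * a / np.pi) ** 0.25 * np.exp(-0.5 * a * a * y * y)
        phase = SQRT_PI * vv * a * x - np.pi * vv * (uu + 0.5)
        return np.sqrt(2.0) * np.sin(phase) * g

    return np.sum(sg(u, v) * potential(x) * sg(up, vp)) * dx
```

A call such as `potential_sg(lambda r: 0.5 * r**2, 0.0, 0.5, 1.0, 1.5, a=1.3)` evaluates one off-diagonal element for a harmonic test potential.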

2.3.3 Numerical Implementation: 1 DOF Case (Radial Ne$_2$)

In this section, we present a numerical recipe that can be followed to implement the weylet scheme in 1 DOF. Various convergence parameters, related to basis set expansions and quadrature integrations, are also introduced and explained. The numerical implementation is divided into four main steps, each of which will be addressed in turn:

1. Truncate the representational $\phi_{st}$ basis, using the phase space approximation or related means.
2. Compute all necessary Eq. (2.20) potential energy integrations.
3. Construct $\mathbf{H}$, the Hamiltonian matrix in the symmetrized weylet representation.
4. Diagonalize $\mathbf{H}$ to compute eigenstates.

Step 1 involves the determination of which $\phi_{st}$ basis functions will actually be used to represent $\hat{H}$. For most applications, the simple phase space truncation criterion described in Sec. 2.2.1 would be utilized (i.e., $\phi_{st}$ is discarded if $H(q_s, p_t) > E_{\rm cut}$), although a slightly different scheme is used in Sec. 2.4.1. To increase the basis size $N$ and improve the eigenvalue accuracy, one can simply increase $E_{\rm cut}$. For the Ne$_2$ system, however, two problems arise from this scheme. First, the surface $H = E_{\max}$ has a tail extending out to infinity in the position direction [Fig. 2.5]. If one raises the energy parameter past a certain value, i.e., sets $H = E_{\rm cut}$ where $E_{\rm cut} > (\pi a^2)/(8\mu)$, the tail of the new surface converges to a height just above the centers of the first row of weylet blocks along the positive position axis. Thus, the $H_{\rm mid} < E_{\rm cut}$ criterion includes an infinite number of blocks. Second, as $E_{\rm cut}$ increases, the signature hump, coming from the well of $V_{\rm eff}(r)$, flattens along the new surface, thus not retaining the shape of the original $H = E_{\max}$ surface. In addition to the truncation convergence parameter $E_{\rm cut}$, the determination of the weylet basis functions themselves also presents two adjustable parameters, albeit of lesser importance. These are the position offset, $\Delta$, and the aspect ratio parameter, $a$. Again, the phase space picture may be employed to select nearly optimal, or at least suitable, values.

In Step 2, tables are constructed of all necessary integrations of the Eq. (2.20) form. Note that apart from the phase shift quantity $\theta$, the integrals depend only on the index combinations $v_-$ and $u_+$. Since $\theta$ is a multiple of $\pi/2$, it serves only to determine the sign of the Eq. (2.20) integral, and whether a sine or cosine function is used. Only two integration tables are therefore needed, $\mathbf{B}^s$ for the sine case, and $\mathbf{B}^c$ for the cosine case, both of which are two-dimensional arrays. The corresponding table elements are as follows:

$$\left[\mathbf{B}^c\right]_{u_+,v_-} = \int_{-\infty}^{\infty} e^{-\xi^2} \cos\left(\sqrt{\pi}\,v_-\,\xi\right) V_{\rm eff}\!\left(\frac{\xi + \frac{\sqrt{\pi}}{2}u_+}{a}\right) d\xi, \tag{2.21}$$

$$\left[\mathbf{B}^s\right]_{u_+,v_-} = \int_{-\infty}^{\infty} e^{-\xi^2} \sin\left(\sqrt{\pi}\,v_-\,\xi\right) V_{\rm eff}\!\left(\frac{\xi + \frac{\sqrt{\pi}}{2}u_+}{a}\right) d\xi. \tag{2.22}$$

The new variable $\xi$ greatly facilitates numerical evaluation of the above integrals, for which it is natural to employ Gauss-Hermite quadrature [16]. This introduces a new convergence parameter, $Q$, representing the number of quadrature points used to evaluate each integration. The basis set truncation of Step 1, together with the expansion truncation coefficient $m_{\max}$, imposes limits on the ranges of the $u_+$ and $v_-$ values needed for the $\mathbf{B}^c$ and $\mathbf{B}^s$ tables. Further limits may also be imposed, however, due to the fact that many of the integrations are nearly zero, or redundant, and may thus be disregarded. In particular, one need only consider

$$0 \leq u_+ \leq u_+^{\max}; \qquad 0 \leq v_- \leq v_-^{\max}. \tag{2.23}$$
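The table-lookup idea of Step 2 can be sketched as follows. Because the phase is a multiple of $\pi/2$, the identity $\cos(k\xi - \theta) = \cos\theta\cos k\xi + \sin\theta\sin k\xi$ selects a single table entry with a sign. The array layout, the function names, the `quarter_turns` argument (the integer $u_- v_+ + v_-$), and the phase sign convention are assumptions made for this illustration.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

SQRT_PI = np.sqrt(np.pi)

def build_tables(potential, u_plus_values, v_minus_values, a=1.0, n_quad=60):
    """Return (B_c, B_s) arrays indexed as [i_u_plus, i_v_minus]."""
    nodes, weights = hermgauss(n_quad)
    b_cos = np.empty((len(u_plus_values), len(v_minus_values)))
    b_sin = np.empty_like(b_cos)
    for i, u_plus in enumerate(u_plus_values):
        # The potential is sampled once per u_plus and reused for every v_minus.
        pot = weights * potential((nodes + 0.5 * SQRT_PI * u_plus) / a)
        for k, v_minus in enumerate(v_minus_values):
            b_cos[i, k] = np.sum(pot * np.cos(SQRT_PI * v_minus * nodes))
            b_sin[i, k] = np.sum(pot * np.sin(SQRT_PI * v_minus * nodes))
    return b_cos, b_sin

def b_from_tables(b_cos_entry, b_sin_entry, quarter_turns):
    """b = cos(theta) B_c + sin(theta) B_s for theta = quarter_turns * pi / 2."""
    cos_theta, sin_theta = [(1, 0), (0, 1), (-1, 0), (0, -1)][quarter_turns % 4]
    return cos_theta * b_cos_entry + sin_theta * b_sin_entry
```

Sampling the potential once per $u_+$ row means that the cost of the potential evaluations scales with the number of distinct $u_+$ values times $Q$, rather than with the number of matrix elements.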

Although in general $u_+$ and $v_-$ may be positive, negative, or zero, the even/oddness of the integrands in Eqs. (2.21) and (2.22) implies that the negative case is identical to a corresponding positive integral (apart from a possible sign change). For $\mathbf{B}^s$, the zero case may also be ignored (either $u_+ = 0$, $v_- = 0$, or both), as Eq. (2.22) clearly results in a vanishing integral. It can be shown that the integration absolute values decrease with increasing $v_-$ and $u_+$, thus justifying the table truncation parameters $v_-^{\max}$ and $u_+^{\max}$, quite independently of basis set truncation issues. In particular, increasing $v_-$ leads to a more oscillatory integrand in Eqs. (2.21) and (2.22), whereas increasing $u_+$ places the Gaussian center further towards the asymptotic region where $V_{\rm eff} \to 0$.

Step 3 involves the construction of $\mathbf{T}$ and $\mathbf{V}_{\rm eff}$. First, however, a determination must be made of the set of $\chi_{uv}$ SG functions collectively required to expand the truncated $\phi_{st}$ basis functions from Step 1. The required $(u, v)$ values will include all of the truncated $(s, t)$ values, plus all those within a band of width $m_{\max}$ around the original phase space region. For all calculations performed here, $m_{\max} \leq 6$ is used, which for the equality contributes a relative error on the order of $10^{-6}$. In practice, the exponential decay of $c_{mn}$ with respect to $|m| + |n|$ can be fully exploited by replacing the square summation $\sum_{m,n=-m_{\max}}^{m_{\max}}$ in Eq. (2.11) and subsequent expressions with the correlated diamond summation, $\sum_{|m|+|n| \leq m_{\max}}$ [35,36]. This reduces the computational effort by a factor of two for each such summation encountered, and also reduces the set of required $(u, v)$ values. Once the appropriate set of expansion SGs, $\chi_{uv}$, has been established, construction of $\mathbf{T}$ is straightforward, i.e., one merely applies Eqs. (2.16) and (2.17), and the Eq. (2.15) summation (with diamond truncation applied to both primed and unprimed indices). The $\mathbf{V}_{\rm eff}$ matrix construction proceeds similarly, except that Eqs. (2.19) and (2.20) are used, and the $b(u, v, u', v')$ values of the latter are obtained using tables $\mathbf{B}^c$ and $\mathbf{B}^s$, with appropriate sign factors. Note that $[\tilde{V}_{\rm eff}]_{u,v,u',v'}$ depends on $u_-$, in addition to $u_+$ and $v_-$. Moreover, the $u_-$ dependence drops off as a Gaussian, meaning that $[\tilde{V}_{\rm eff}]_{u,v,u',v'}$ can be simply set to zero except when

$$|u_-| \leq u_-^{\max}, \tag{2.24}$$

where $u_-^{\max}$ is a new convergence parameter. This results in reduced computation, and increased sparsity for the final matrix $\mathbf{H}$, obtained by adding together $\mathbf{T}$ and $\mathbf{V}_{\rm eff}$. The final Step 4 is the most straightforward, and for large systems, also the most computationally expensive: the real symmetric Hamiltonian matrix $\mathbf{H}$ is diagonalized numerically, to compute energy eigenvalues, and possibly wave functions. For purposes of this chapter, only direct diagonalization routines are considered; however, in principle, sparse eigensolvers may also be used, as will be explored in future publications.

For the Ne$_2$ dimer application, collectively there are nine parameters involved: $r_{\rm cut}$, $a$, $\Delta$, $E_{\rm cut}$, $Q$, $m_{\max}$, $u_+^{\max}$, $v_-^{\max}$, and $u_-^{\max}$. Convergence with respect to most of these is fairly decoupled, although there are certain correlated combinations such as $(a, E_{\rm cut})$ and $(Q, r_{\rm cut}, a)$, with any of the last three of the nine listed above. These correlations become increasingly important in higher dimensions. For example, $a < 1$ (depending on the units chosen) corresponds to flat, wide weylet blocks. A larger number of these would be required to span the requisite region of momentum space than for narrower weylet blocks. This requires the use of SGs that have high indices $v$ or $v'$, which reflect high frequencies of oscillation in position space. Ultimately, this produces the need for more quadrature points $Q$ in the integration. Conversely, if $Q$ is chosen to be low, then one needs $a > 1$, or tall blocks, in order to avoid the need for SGs of high frequency. There is also a mutual dependency between $Q$ and $r_{\rm cut}$. The choice of $Q$ affects the positioning of the quadrature points along the position axis. At high $Q$, there can be too many quadrature points bunched inside the region of the artificial potential cap, negatively affecting the overall accuracy of the eigenvalues. To counter this, choosing a small $r_{\rm cut}$ would limit the extent of the artificial cap along the position axis. On the other hand, a large $r_{\rm cut}$ would require $Q$ to be reduced. Thus, a balance between the two parameters needs to be achieved for convergence of the eigenvalues.

2.3.4 Matrix Representation: 3 DOF Case (Cartesian Ne$_2$)

For 3 DOF Cartesian calculations of the neon dimer, the LJ potential is the same as Eq. (2.12), except that $r$ is taken to be $r = \sqrt{x^2 + y^2 + z^2}$, with $(x, y, z)$ the three Cartesian coordinates describing nuclear separation. As in the 1 DOF case, however, an artificial Gaussian cap is introduced in the singular region, so that the resultant $V(x, y, z)$ corresponds to Eq. (2.13) with $J = 0$. The 3 DOF Cartesian Hamiltonian operator is therefore

$$\hat{H} = -\frac{1}{2\mu}\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right) + V(x, y, z). \tag{2.25}$$

The 3 DOF orthonormal symmetrized weylet basis functions are simply products of the corresponding 1 DOF weylets, i.e.,

$$\phi_{s_1 t_1 s_2 t_2 s_3 t_3}(x_1, x_2, x_3) = \phi_{\mathbf{s},\mathbf{t}}(\mathbf{x}) = \prod_{j=1}^{3} \phi_{s_j t_j}(x_j), \tag{2.26}$$

where $\mathbf{x} = (x, y, z) = (x_1, x_2, x_3)$, etc. As a result, obtaining multidimensional matrix elements is in principle very straightforward. In particular, given the separable form, $\hat{T} = \sum_{j=1}^{3} \hat{T}_j$, of the 3 DOF Cartesian kinetic energy, the corresponding matrix elements are given as

$$T_{\mathbf{s},\mathbf{t},\mathbf{s}',\mathbf{t}'} = \sum_{(i,j,k)=(1,2,3)} \left[T_i\right]_{s_i,t_i,s_i',t_i'}\, \delta_{s_j s_j'}\, \delta_{t_j t_j'}\, \delta_{s_k s_k'}\, \delta_{t_k t_k'}, \tag{2.27}$$

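where the summation is over all cyclic permutations of (1, 2, 3), $\delta$ is the Kronecker delta function, and $[T_i]_{s_i,t_i,s_i',t_i'}$ are 1 DOF matrix elements from Sec. 2.3.2. Note that the sparsity of T is comparable to that of a discrete variable representation [57–62]. As for the potential energy matrix V, the 3 DOF version of Eq. (2.15) applied to V is

\[ V_{\mathbf{s},\mathbf{t},\mathbf{s}',\mathbf{t}'} = \sum_{\|(\mathbf{m},\mathbf{n},\mathbf{m}',\mathbf{n}')\|_1 \le m_{\max}} \left[ \prod_{j=1}^{3} (-1)^{\left(\frac{n_j}{2} + m_j t_j\right) + \left(\frac{n_j'}{2} + m_j' t_j'\right)} c_{m_j n_j}\, c_{m_j' n_j'} \right] V_{\mathbf{u},\mathbf{v},\mathbf{u}',\mathbf{v}'}, \tag{2.28} \]

where $\|(\mathbf{m},\mathbf{n},\mathbf{m}',\mathbf{n}')\|_1 = \sum_{j=1}^{3} \left( |m_j| + |n_j| + |m_j'| + |n_j'| \right)$ is the composite vector 1-norm.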

The form of the single, correlated summation bears comment. A literal application of Eq. (2.15) would result in six square summations of two indices each, i.e., a hypercubical summation in 4f = 12 dimensions, involving $(m_{\max} + 1)^{12}$ summand terms. The multidimensional generalization of diamond truncation would reduce this to two correlated summations [37], one each for the primed and unprimed indices. However, the exponential decay of the $c_{mn}$ coefficients, together with the product form of the summand in Eq. (2.28), implies that a further correlation across primed and unprimed indices may also be applied. This results in the above simplicial summation scheme, for which the number of summand terms is reduced by a factor of something like 12! [or (4f)! in general], thus avoiding exponential scaling. Equation (2.19) generalizes to a sum of eight terms, V
u,v,u ,v 1

1 =

e 4 u u

(1)k1 +k2 +k3 b(u, v, u , (1)k v ),

(2.29)

k1 ,k2 ,k3 =0

where $(-1)^{\mathbf{k}}\mathbf{v}' = \left((-1)^{k_1} v_1', (-1)^{k_2} v_2', (-1)^{k_3} v_3'\right)$, and

b(u, v, u , v ) =

cos vj

+ j j V

u+ 2

j=1

d1 d2 d3 . (2.30)

The other quantities above are dened as follows: u = (u u ); v = (v v ); u+ = (u + u ); = (ax


u+ ); 2 + j = (/2)(u vj + vj ). j

2.3.5 Numerical Implementation: 3 DOF Case (Cartesian Ne2)

For the most part, the 1 DOF numerical recipe provided in Sec. 2.3.3 generalizes in straightforward fashion to the 3 DOF case. However, insofar as there are substantial differences, these will be addressed here. Regarding Step 1, there are in principle not one, but three different aspect ratio parameters, $a_x$, $a_y$, and $a_z$. Given the spherical symmetry of the rovibrational Ne2 system, however, it is clear that all three should have the same value in this case. Similar comments apply to the position offset parameter. Regarding weylet basis truncation, the simple Hmid < Ecut criterion would be suitable for most applications, but does not work especially well in the present case, for reasons described in Sec. 2.4.2, where an alternate prescription is described. For Step 2, all of the time-saving tricks from Sec. 2.3.3 may be applied, as well as additional symmetry considerations. For instance, the Eq. (2.30) integration is invariant with respect to permutations of the three components, (x, y, z), which can be exploited to reduce CPU time and storage. This is particularly important, given that the 3 DOF integrations now involve $Q^3$, rather than Q, quadrature points. Note that each of the three phase quantities in Eq. (2.30), one per component j, independently determines whether its corresponding sinusoidal factor is a sine or a cosine. Consequently, there are in principle not two, but $2^f = 8$ different integration tables analogous to Eqs. (2.21) and (2.22). However, permutation symmetry enables us to reduce these to just (f + 1) = 4 tables, $B^{c^3}$, $B^{c^2 s}$, $B^{c s^2}$, and $B^{s^3}$, with obvious notation, e.g. [B c2 s ]u+ ,v =

u+ 2

e cos v1 1 cos v2 2 sin v3 3 d1 d2 d3 . (2.31)
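The reduction from $2^f = 8$ sine/cosine patterns to f + 1 = 4 distinct tables can be checked by brute-force enumeration. The following is only an illustrative sketch: the function name `table_classes` and the string labels `"c"`/`"s"` are hypothetical and are not taken from the original implementation.

```python
from itertools import product


def table_classes(f):
    """Group the 2**f sine/cosine patterns into permutation-equivalent classes.

    Each pattern is a string such as "csc" (cosine, sine, cosine for j = 1, 2, 3).
    Two patterns belong to the same class if one is a permutation of the other,
    which is the situation when the integrand is invariant under permutations
    of the coordinates.
    """
    classes = {}
    for pattern in product("cs", repeat=f):
        key = "".join(sorted(pattern))  # canonical representative of the class
        classes.setdefault(key, []).append("".join(pattern))
    return classes


classes = table_classes(3)
for key, members in sorted(classes.items()):
    print(key, members)
```

For f = 3 the canonical keys are `ccc`, `ccs`, `css`, and `sss`, in line with the four tables named in the text.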

Permutation symmetry can also be applied to the individual tables themselves, resulting in an f! = 6-fold reduction in the number of $B^{c^3}$ and $B^{s^3}$ table elements that must be computed, and a (f − 1)! 1! = 2-fold reduction for the $B^{c^2 s}$ and $B^{c s^2}$ tables. Symmetry can also be used to restrict the range of relevant table values beyond Eq. (2.23), although in this context, it is rotational rather than permutation symmetry that is responsible. Thus, in addition to

\[ 0 \le u_j^+ \le u^+_{\max}; \qquad 0 \le \Delta v_j \le \Delta v_{\max} \qquad \text{for all } j, \tag{2.32} \]

one may also apply spherical truncation,

\[ |\mathbf{u}^+| \le u^+_{\max}; \qquad |\Delta\mathbf{v}| \le \Delta v_{\max}, \tag{2.33} \]

where the vector lengths are now computed in the usual 2-norm sense. The validity of Eq. (2.33) is established in Appendix A. Note that as in the 1 DOF case, the $u_j^+ = 0$ and/or $\Delta v_j = 0$ table elements are identically zero when j corresponds to a sine function in the Eq. (2.30) integrand. Finally, we comment that as in the 1 DOF case, the Eq. (2.29) integral decreases extremely rapidly with $\Delta\mathbf{u} \cdot \Delta\mathbf{u} = |\Delta\mathbf{u}|^2$. As discussed in Appendix A, the result is that $V_{\mathbf{u},\mathbf{v},\mathbf{u}',\mathbf{v}'}$ may be taken to be zero except when

\[ |\Delta\mathbf{u}| \le \Delta u_{\max}. \tag{2.34} \]

As in the 1 DOF case, this leads to reduced computation and increased sparsity, although the effect is much more pronounced for higher dimensionalities.

2.4 Results and Discussion

2.4.1 Results for Radial Ne2 (1 DOF Case)

As discussed previously, the simple $H(q_s, p_t) \le E_{cut}$ basis truncation criterion works extremely well for most molecular systems [35–37], but is expected to be less efficient for Ne2. In other respects as well, weakly-bound systems with long-range interactions present a worst-case scenario for the present weylet approach, thus providing another solid (if slightly perverse) motivation for the present study. The reasons for this are twofold. First, the weakly-bound aspect implies that K is small even up to the dissociation threshold, thus placing us far from the large K limit where K/N → 1. Indeed, Ne2 has only two vibrational levels. Second, the long-range interaction implies concave, rather than convex, phase space regions for sufficiently high Emax, thus favoring conventional affine wavelets over weylets [35–37,94]. As mentioned before, with Emax at the dissociation threshold as is the case here, any Ecut > Emax results in a phase space region in the continuum, with infinite extent (although in practice, this need not cause a difficulty). To ameliorate the above situation somewhat, and because the 1 DOF calculations are so inexpensive, we have opted for a more labor-intensive, but rigorously optimal basis truncation scheme. Specifically, a weylet is discarded if the resultant computed vibrational eigenvalues for J = 0 agree with those of a more accurate reference calculation [15] to within some desired accuracy. Weylet block pairs are thus whittled away from an initial large rectangular lattice, starting from the top and bottom rows and working inwards towards the q axis. For the particular aspect ratio parameter value a = 1.6 a.u. (choice explained below), this procedure was applied to Ne2 at an accuracy level equal to 2% of the well depth (approximately 0.4949 cm⁻¹), and again for 0.2% and 0.02% of the well depth. The resultant truncated basis sets are indicated schematically in Fig. 2.5. Note that even for the present worst-case application, the resultant pattern of blocks conforms roughly to the H(q, p) ≤ 0 region, thus validating the phase space truncation idea. The resultant computed eigenvalues are presented in Table 2.2. To optimize with respect to the other two weylet basis parameters, a and the position offset, one can repeat the above procedure for many different values, and determine which yields the smallest basis size N. We have performed such an optimization for a, but have simply taken the offset to be 1/2 throughout (resulting in half-integer values for both s and t

indices). For the 2% accuracy calculation, this led to the optimal choice a = 1.6 a.u., resulting in a basis size N = 10. The above studies were performed primarily to assess basis efficiencies associated with a given level of accuracy (the three benchmark values chosen above correspond to those used in previous calculations [35–37]). However, we have also performed a much more accurate reference calculation, using N = 133 weylets with regions lying within the phase space rectangle 0 ≤ x ≤ 21 a.u., −20 ≤ p ≤ 20 a.u. This basis was used for all J values that support bound levels (J ≤ 9), resulting in a determination of all rovibrational bound states to an accuracy of 10⁻⁴ cm⁻¹ or better. The results are presented in Table 2.4, Column 3. The remaining convergence parameters were chosen with a view towards ensuring that these do not contribute appreciably to numerical error, even for the highest accuracy reference calculation. In particular, the parameter values $u^+_{\max} = 30$, $\Delta v_{\max} = 8$, and $\Delta u_{\max} = 5$ were found to be converged to 10⁻⁷ cm⁻¹ or better for both computed eigenvalues. Similarly, Q = 100 led to 10⁻⁶ cm⁻¹ convergence. The dependence on rcut was also found to be very weak, with both eigenvalues changing by only 2 × 10⁻⁴ cm⁻¹ over the range 3.8 ≤ rcut ≤ 5.0. The value rcut = 4.6 a.u. was used for all subsequent calculations.

2.4.2 Results for Cartesian Ne2 (3 DOF Case)

The 3 DOF Cartesian calculations of the Ne2 rovibrational states make no attempt to exploit rotational symmetry or degeneracy, unlike a previous calculation [15]. As a consequence, the required basis size N is much too large for the optimal quantum truncation scheme to be applied. Nevertheless, we can still exploit the results of Sec. 2.4.1 for the 3 DOF calculation, as described below. The first step is to define a trial phase space region in 1 DOF, |p| ≤ pmax(r), which encloses the centers of only the non-discarded weylet pair blocks corresponding to the optimal, 2% accurate, N = 10 basis from Fig. 2.5. The functional form pmax (r) = 1 (r )2 |r| e , 2 (2.35)

where the five parameters allow for flexibility of the shape. They set, respectively, the vertical height, the location of the ellipse center along the position axis, the horizontal axis length, the degree of exponential decay, and the position of the start of the exponential decay. The numerical assignments of 6.2 for the height, 8.6 for the center, 24.6 for the square of the horizontal axis length, 0.186 for the decay rate, and 5.4 for the decay onset (all in a.u.) are found to fill the bill, i.e., the above parameter choice has been somewhat optimized to minimize the volume of the resultant phase space region, which is presented in Fig. 2.6. Once a suitable pmax(r) is constructed as above, radial symmetry is used to extrude this region into the full six-dimensional phase space of the Cartesian system, by replacing p → |p| and r → |x|. Only those multidimensional weylets whose centers lie within the new region are retained for the 3 DOF calculation, i.e., those that satisfy $|\mathbf{p}_t| \le p_{\max}(|\mathbf{r}_s|)$. Using the same a and offset values as in the 1 DOF case, this results in a 3 DOF basis of N = 3480 weylet functions. The above basis was anticipated to yield rovibrational energy levels within the desired 2% accuracy level, but in fact fell somewhat short of this goal. To improve

accuracy, the phase space region was enlarged via simultaneous variation of the height, horizontal axis length, and decay rate parameters. Also, in practice it was found that including the interior weylet functions (with centers in the potential cap region) substantially improves accuracy, at the cost of adding only around 1200–2500 additional basis functions. Table 2.3 indicates the resulting convergence of the two pure vibrational level energies. The largest of these calculations (N = 24 392) computed all 125 rovibrational states to within the desired 2% tolerance (using the fully converged 1 DOF calculations of Sec. 2.4.1 and Ref. [15] as reference). The computed energies are presented in Table 2.4. For the above calculations, all of the remaining parameters were converged to within 10⁻² cm⁻¹, i.e., still substantially smaller than the basis truncation error. The values $\Delta u_{\max} = 4$, $\Delta v_{\max} = 6$, Q = 17, and rcut = 4.9 a.u. were used for all calculations (except for the last row, where $\Delta u_{\max} = 5$ and Q = 18 were each increased by one for better convergence), whereas $u^+_{\max}$ was varied as per Table 2.3, Column 5. Of the four computational steps described in Sec. 2.3.3, Step 3 was generally found to be the bottleneck, due to the Eq. (2.28) summation. The choice mmax = 6 results in 2625 summand terms per potential matrix element. In comparison, mmax = 4 and mmax = 2 require 313 and 25 summand terms, respectively. As the computed eigenvalues were found to differ only by around 0.02 cm⁻¹ between mmax = 6 and mmax = 4, the latter was used for the results presented in Tables 2.3 and 2.4. To improve numerical performance, some effort was made towards finding the leanest (i.e., least time consuming) calculation that computes all vibrational levels to within the 2% tolerance. Using region parameters of 7.5 for the height, 8.6 for the center, 35.0 for the square of the horizontal axis length, 0.160 for the decay rate, and 5.4 for the decay onset, and shaving off some of the high-momentum weylets in the potential cap area, a suitable N = 10 896 basis was obtained. With the additional parameter choices $u^+_{\max} = 19$, $\Delta v_{\max} = 4$, $\Delta u_{\max} = 3$, Q = 10, and mmax = 2, the total time (for all steps) required on a Compaq Alpha 1200 MHz CPU was found to be 1.2 hours, with 99/125 rovibrational levels computed to within the strict 2% tolerance, and the remaining 26 to a comparable level of accuracy.
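The summand counts quoted above for Eq. (2.28) (2625, 313, and 25 terms for mmax = 6, 4, and 2) can be reproduced by counting even-integer index vectors of length 4f = 12 whose 1-norm does not exceed mmax. The sketch below is an independent counting check, not the original code; the function names `count_terms` and `summand_terms` are hypothetical.

```python
def count_terms(dim, radius):
    """Number of integer vectors of length `dim` with 1-norm <= radius."""
    # counts[r] = number of vectors (over the components processed so far)
    # whose 1-norm is exactly r
    counts = [1] + [0] * radius
    for _ in range(dim):
        new = [0] * (radius + 1)
        for r, c in enumerate(counts):
            if c == 0:
                continue
            for k in range(radius - r + 1):
                # a component of magnitude k > 0 can carry either sign
                new[r + k] += c * (1 if k == 0 else 2)
        counts = new
    return sum(counts)


def summand_terms(m_max, f=3):
    """Terms in the simplicial summation, assuming all 4f indices are even.

    Writing each index as 2k turns the constraint (sum of |index|) <= m_max
    into (sum of |k|) <= m_max // 2.
    """
    return count_terms(4 * f, m_max // 2)


for m_max in (2, 4, 6):
    hypercube = (m_max + 1) ** 12  # literal square summations, for comparison
    print(m_max, summand_terms(m_max), hypercube)
```

The comparison column shows how quickly the literal hypercubical summation outgrows the simplicial one as mmax increases.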


2.4.3 Discussion

For reasons discussed in Secs. 2.1 and 2.4.1, the present Ne2 application presents a worst-case scenario (apart from the low dimensionality) for the weylet basis approach. By any measure, even using more established optimized methodologies [15], the 3 DOF Cartesian Ne2 system presents a very challenging numerical calculation. It is therefore reassuring to discover that the weylet approach is nevertheless competitive. In comparing with Ref. [15], for instance, one finds that the basis sizes and especially CPU times required to compute most of the 125 rovibrational states are greatly reduced; however, the computed eigenvalue errors are also substantially larger, to the extent that two-to-three digits of accuracy are lost. For the lean 3 DOF calculation, the resultant basis efficiency K/N = 99/10 896 ≈ 0.01 is not especially large. The corresponding efficiency value for the 3 DOF isotropic harmonic oscillator system, for instance (Table IV of Ref. [37]), is around 0.25, although the comparison is somewhat biased against the Ne2 case because of the way that accurate eigenvalues are counted. In any event, there are two important causes that underlie this efficiency gap: (1) small and concave-shaped phase space region; (2) large singularity hole in the potential cap region, due to use of Cartesian coordinates, and to large Ne2 equilibrium separation. Note that quadrature error can definitely be ruled out as a major cause (Sec. 2.4). In practice, very few molecular applications should exhibit such low weylet efficiencies in 3 DOFs, since any modification to the above (i.e., deeper well depth, shorter-range interaction, smaller equilibrium separation, or use of non-Cartesian coordinates) would greatly improve performance. Efficiencies for typical systems, especially deeply-bound systems at energies substantially below dissociation, are likely to be much closer to harmonic oscillator values [37].

Future efforts will investigate other molecular systems more amenable to the weylet approach, including those at higher dimensionalities. In this regard, even larger neon clusters such as Ne3 are much more favorable than Ne2, apart from the increased dimensionality. The reason is that for purposes of studying solvation, or the liquid-solid phase change, the energy range of interest extends only up to the first isomerization threshold, i.e., the energy of just one bond, which for Ne_N with N > 2 is far below the dissociation threshold. Note that the simple Gaussian quadrature scheme as employed here, though shown to be remarkably effective for the three-dimensional application considered, nevertheless reintroduces exponential scaling, in that the total number of quadrature points grows as $Q^f$. The present procedure, however, is unnecessarily wasteful, in that no attempt is made to remove the corner region quadrature points. This could easily be achieved via spherical truncation, as can be justified using an argument similar to that of Appendix A. We did not bother to do so here because the integral evaluations comprised only a small fraction of the total CPU time. At high dimensionalities, quadrature integral evaluation could in principle become the computational bottleneck, although the spherical truncation remedy described above (which, incidentally, is applicable even for non-spherically-symmetric potentials) reduces CPU effort exponentially with increasing f. Alternatively, if the desired accuracy level is not too high, Monte Carlo integration techniques could be used, applied in phase space so as to replace oscillatory modulated Gaussians with ordinary Gaussians. Thus, the CPU cost associated with matrix initialization need not become the bottleneck at large dimensionalities. Further improvements and modifications to the weylet method will also be explored, e.g. non-Cartesian coordinates, sparse matrix methods, and those discussed in Sec. IV C of Ref. [37]. Clearly, however, the most important area for improvement is increasing the level of accuracy that can be obtained with the weylet approach. Using projection operator methods (to customize weylet basis functions for given applications), we have recently taken large strides in this area, as will be reported in future publications.
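The claim that spherical truncation of the quadrature grid reduces effort exponentially with f can be illustrated with the ratio of the volume of an f-dimensional ball to that of its bounding hypercube. This is only a rough sketch: it assumes, for illustration, that the retained fraction of the $Q^f$ points is well approximated by this volume ratio, which ignores the nonuniform spacing of Gaussian quadrature points. The function name is hypothetical.

```python
import math


def ball_to_cube_ratio(f):
    """Volume of the unit-radius f-ball divided by that of the cube [-1, 1]^f."""
    ball = math.pi ** (f / 2) / math.gamma(f / 2 + 1)
    cube = 2.0 ** f
    return ball / cube


for f in (1, 2, 3, 6, 9, 12):
    print(f, ball_to_cube_ratio(f))
```

For f = 3 the ratio is π/6, so roughly half of the corner-inclusive grid would be retained under this assumption, and the retained fraction keeps shrinking as f grows.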


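Table 2.1. Parameter values for the Gaussian cap of Eq. (2.13), with rcut = 4.6 a.u. The three J-dependent cap parameters are listed in the column order of the original table.

J | cap parameter 1 (a.u.) | cap parameter 2 (a.u.) | cap parameter 3 (10⁻⁴ a.u.)
0 | 3.1600 | 0.3758 | -1.0607
1 | 3.1562 | 0.3757 | -1.0406
2 | 3.1485 | 0.3756 | -1.0005
3 | 3.1370 | 0.3753 | -0.9404
4 | 3.1218 | 0.3750 | -0.8603
5 | 3.1029 | 0.3746 | -0.7601
6 | 3.0805 | 0.3741 | -0.6400
7 | 3.0547 | 0.3736 | -0.4998
8 | 3.0257 | 0.3729 | -0.3397
9 | 2.9935 | 0.3722 | -0.1597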

Table 2.2. Ground and first excited vibrational (J = 0) level energies for 1 DOF radial Ne2, computed from quantum truncated weylet basis (Fig. 2.5) for three different target accuracy levels, measured in units of the well depth. The literature energy values (Ref. [15]) are −14.0245 cm⁻¹ and −2.6834 cm⁻¹, respectively.

% well depth | basis size N | ground (cm⁻¹) | excited (cm⁻¹)
2%    | 10 | -13.8710 | -2.3519
0.2%  | 26 | -13.9789 | -2.6346
0.02% | 36 | -14.0200 | -2.6786
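As a quick consistency check, the entries of Table 2.2 can be compared against the stated tolerances. The sketch below uses only numbers quoted in the text: the literature energies from Ref. [15] and the value 0.4949 cm⁻¹ for 2% of the well depth (Sec. 2.4.1), with the 0.2% and 0.02% tolerances obtained by scaling that value. The variable and function names are hypothetical.

```python
# Literature (Ref. [15]) ground and first excited J = 0 energies, in cm^-1
REFERENCE = (-14.0245, -2.6834)

# 2% of the well depth, in cm^-1, as quoted in Sec. 2.4.1
TOL_2_PERCENT = 0.4949

# percent of well depth -> (basis size N, ground energy, excited energy)
TABLE_2_2 = {
    2.0: (10, -13.8710, -2.3519),
    0.2: (26, -13.9789, -2.6346),
    0.02: (36, -14.0200, -2.6786),
}


def max_error(percent):
    """Largest absolute deviation from the literature values for one row."""
    _, ground, excited = TABLE_2_2[percent]
    return max(abs(ground - REFERENCE[0]), abs(excited - REFERENCE[1]))


def within_tolerance(percent):
    """True if both levels meet the target accuracy for that row."""
    tolerance = TOL_2_PERCENT * percent / 2.0
    return max_error(percent) <= tolerance


for percent, (n_basis, _, _) in TABLE_2_2.items():
    print(percent, n_basis, round(max_error(percent), 4), within_tolerance(percent))
```

In each row the excited level carries the larger error, which is consistent with its sitting closer to the dissociation threshold.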


Table 2.3. Convergence of computed ground and first excited vibrational (J = 0) level energies for 3 DOF Cartesian Ne2, with respect to increasing phase space region volume. Columns 1–3: parameter values used to specify phase space region in Eq. (2.35), namely the height, the square of the horizontal axis length, and the decay rate (the center, 8.6, and the decay onset, 5.4, are held constant). Column 4: resultant 3 DOF phase-space-truncated basis size, N (including weylets in potential cap region). Column 5: $u^+_{\max}$ value required for convergence to within 10⁻² cm⁻¹. Columns 6 and 7: computed level energies; * indicates those lying within desired 2% error tolerance.

height | (axis length)² | decay rate | N | u⁺max | ground (cm⁻¹) | excited (cm⁻¹)
6.2 | 24.6 | 0.186 | 5200   | 18 | -13.0457 | -0.1094
6.8 | 29.4 | 0.174 | 7464   | 19 | -13.6977 | -1.4897
7.4 | 34.2 | 0.162 | 10 992 | 20 | -13.8204 | -1.8764
7.7 | 36.6 | 0.156 | 12 768 | 20 | -13.8581 | -2.0180
8.0 | 39.0 | 0.150 | 15 120 | 20 | -13.8718 | -2.1568
8.9 | 46.2 | 0.132 | 24 392 | 22 | -14.0288 | -2.3599


Table 2.4. All 125 rovibrational bound states of Ne2, as computed using the 3 DOF weylet basis of Table 2.3, Row 6. Parentheses indicate numerical degeneracies. Last two columns: corresponding reference 1 DOF results from Sec. 2.4.1 (Column 3) and Ref. [15] (Column 4). All energies are in cm⁻¹.

J | E (3 DOF weylet basis) | E (Sec. 2.4.1) | E (Ref. [15])
0 | -14.029(1) | -14.0245 | -14.0245
0 | -2.360(1) | -2.6834 | -2.6834
1 | -13.695(2), -13.674(1) | -13.7213 | -13.7214
1 | -2.152(2), -2.145(1) | -2.4922 | -2.4922
2 | -13.042(2), -13.035(2), -13.026(1) | -13.1165 | -13.1165
2 | -1.746(2), -1.740(2), -1.736(1) | -2.1143 | -2.1143
3 | -12.091(1), -12.089(1), -12.086(2), -12.080(3) | -12.2129 | -12.2129
3 | -1.149(1), -1.147(1), -1.146(2), -1.142(2), -1.137(1) | -1.5595 | -1.5595
4 | -10.884(1), -10.854(2), -10.853(1), -10.849(3), -10.846(2) | -11.0148 | -11.0148
4 | -0.378(1), -0.377(2), -0.370(1), -0.367(2), -0.362(2), -0.359(1) | -0.8455 | -0.8452
5 | -9.400(2), -9.398(1), -9.348(2), -9.347(1), -9.346(2), -9.335(3) | -9.5286 | -9.5286
6 | -7.664(2), -7.628(2), -7.626(1), -7.584(1), -7.579(1), -7.574(2), -7.561(1), -7.559(2), -7.556(1) | -7.7628 | -7.7628
7 | -5.616(2), -5.611(1), -5.601(1), -5.595(2), -5.566(1), -5.544(2), -5.540(1), -5.530(2), -5.528(1), -5.519(2) | -5.7290 | -5.7290
8 | -3.302(1), -3.300(2), -3.294(1), -3.270(2), -3.267(2), -3.264(1), -3.255(2), -3.242(1), -3.239(3), -3.237(2) | -3.4427 | -3.4427
9 | -0.739(2), -0.736(1), -0.733(1), -0.726(2), -0.725(2), -0.720(1), -0.719(1), -0.718(2), -0.713(2), -0.709(1), -0.707(1), -0.698(2), -0.697(1) | -0.9264 | -0.9264


Figure 2.1. Schematic indicating phase space partitionings associated with various weylet basis sets in 1 DOF. Dots represent phase space centers (q, p) for unsymmetrized [i.e., f (x)-type] weylet functions. A single symmetrized [(x)-type] weylet is indicated by the : (a) critically dense weylets, (1) = f22 (x); (b) doubly-dense Wilson-Daubechies weylets, = 22 (x); (c) doubly-dense weylets of present work, = 3 3 (x).
22


Figure 2.2. Plots of six different symmetrized 1 DOF weylets, $\varphi_{st}(x)$ vs. x, for a = 0.5 a.u. Each plot corresponds to a different t value as indicated, with both s = 1/2 (left) and s = 11/2 (right) weylets represented. Larger t values are associated with increasingly oscillatory behavior.


Figure 2.3. The Lennard-Jones potential used for the neon dimer.


Figure 2.4. Lennard-Jones potential (solid) and Eq. (2.13) modified potentials for rcut = 4.9 a.u. (dashed) and rcut = 4.7 a.u. (dotted), emphasizing the singular region.


Figure 2.5. Schematic indicating optimal quantum truncation of a = 1.6 a.u. weylet basis functions used to compute bound vibrational (J = 0) level energies for 1 DOF radial Ne2 system. Thick solid/dashed/dotted lines enclose basis used to achieve 2%/0.2%/0.02% well-depth error tolerance. The thin solid line encloses the phase space region corresponding to all vibrational states up to the dissociation threshold, i.e., H(q, p) ≤ Emax = 0.


Figure 2.6. Schematic indicating phase space regions used to truncate 3 DOF a = 1.6 a.u. weylet basis functions to compute rovibrational bound states of Ne2 . Innermost solid line encloses the phase space region corresponding to the dissociation threshold (Fig. 2.5). Next concentric solid line corresponds to Eq. (2.35) with parameters from Table 2.3, Row 1, and outermost concentric solid line to Table 2.3, Row 6. Inclusion of the cap region weylets corresponds to the dashed line extensions.


CHAPTER III
CUSTOMIZED PHASE SPACE REGION OPERATORS APPLIED TO BASIS SETS

3.1 Introduction

For molecular bound state calculations, the choice of basis directly determines the computational effort in solving the quantum Hamiltonian. More specifically, the efficiency, K/N, where N represents the number of basis functions needed to calculate K eigenvalues at a desired accuracy, has a strong dependence on the degree of correlation between the basis functions and the target system. This relationship has prompted research in basis optimization which focuses on the maximization of the efficiency and thus, ultimately, the reduction of CPU effort and memory usage. Symmetrized Weyl-Heisenberg wavelets, or weylets, comprise a type of universal orthonormal basis that can be effectively used to represent any bound molecular system with an efficiency approaching perfection (K/N → 1) in the large N and K limit, regardless of system dimensionality. Although the weylet basis functions themselves are universal, the truncation of the basis set, achieved using a phase space truncation scheme [35–37], is what tailors the method to individual systems. In contrast, all direct product basis sets (DPBs) [14,31], even those for which individual basis functions are optimized via self-consistent field or other techniques [29,99–101], exhibit exponential reduction in efficiency with f, the number of degrees of freedom (DOFs) [24,29]. Consequently, the weylet approach has been applied to direct matrix eigenvalue calculations for model systems of 15 DOFs and beyond [37], far beyond what would be feasible using a DPB. Recently, the method was successfully applied to a real molecular system, Ne2 (in Cartesian coordinates) [102]. Although Ne2 presents a worst-case scenario for the weylet method, in that f and K are small and the calculation includes states near the dissociation threshold, the method is still competitive with other state-of-the-art exact quantum dynamics methods [15].


Phase space truncation of the weylet basis is effective because individual weylet functions have good phase space localization, and are orthogonal. Achieving both properties together is nontrivial [53,54], but can be achieved using a momentum-symmetrization modification first introduced by Wilson [55] and Daubechies et al. [56]. In 1 DOF, one starts with a 2× overcomplete set of coherent states (CSs), which are derived from phase space Gaussians and arranged on a doubly-dense lattice (i.e., two CSs per Planck cell) on phase space. Provided the lattice of CSs constitutes a tight frame, a particular linear combination of positive and negative momentum CS pairs then yields a complete 1 DOF orthonormal weylet basis, $|\varphi_i\rangle$ (the general f DOF case is addressed in Sec. 3.2.1). Poirier [35–37] later refined the approach by constructing maximally phase space localized weylets, in a computationally tractable manner for quantum dynamics calculations. To understand phase space truncation, we must first introduce two projection operators:

\[ \hat{\rho}_N = \sum_{i=1}^{N} |\varphi_i\rangle \langle \varphi_i| \tag{3.1} \]

and

\[ \hat{\rho}_K = \Theta\left[E_{\max} - \hat{H}(\hat{q}, \hat{p})\right]. \tag{3.2} \]

The Wigner-Weyl (WW) phase space representation [32,33,97,98] of $\hat{\rho}_K$ (simply labeled as $\rho_K$) [see Fig. 3.2(a)] is a function that oscillates about unity in the classically allowed region (QC) of phase space where H ≤ Emax, and is damped exponentially outside this region. It can be shown that

\[ \rho_K \approx \rho_K^{QC} = \Theta\left[E_{\max} - H(q, p)\right] \tag{3.3} \]

in the large K limit, i.e., $\hat{\rho}_K$ can be associated with the classically allowed region of phase space, R [34]. Similarly, each single DOF weylet $|\varphi_i\rangle$ can be associated with a (momentum-symmetrized) pair of blocks (corresponding to the CS pairs mentioned earlier), centered on the lattice sites, which on truncation comprise a region R′ [see Fig. 3.1(a)].
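The association between retained weylets and a phase space region can be made concrete with a small counting sketch for the 1 DOF harmonic oscillator, H = (p² + q²)/2. This is an illustration under stated assumptions, not the original code: it assumes lattice sites at $q_s = s\sqrt{\pi}/a$ and $p_t = t\sqrt{\pi}\,a$ (so that each block has area π in ħ = 1 units), positive half-integer t, half-integer s, and a simple "center inside the region" criterion. The function name and arguments are hypothetical.

```python
import math


def truncated_lattice(e_cut, a=1.0, offset=0.5, half_width=200):
    """Return the (s, t) weylet indices whose lattice centers satisfy H <= e_cut.

    Assumed lattice: q_s = s*sqrt(pi)/a and p_t = t*sqrt(pi)*a, with
    s = integer + offset and t = 0.5, 1.5, 2.5, ...
    The Hamiltonian is that of the unit-mass, unit-frequency oscillator.
    """
    spacing = math.sqrt(math.pi)
    sites = []
    for s_int in range(-half_width, half_width + 1):
        s = s_int + offset
        q = s * spacing / a
        for t_int in range(half_width):
            t = t_int + 0.5
            p = t * spacing * a
            if 0.5 * (p * p + q * q) <= e_cut:
                sites.append((s, t))
    return sites


e_cut = 50.0
n_basis = len(truncated_lattice(e_cut))
k_levels = sum(1 for n in range(1000) if n + 0.5 <= e_cut)  # exact HO levels below e_cut
print(n_basis, k_levels)
```

Since each momentum-symmetrized pair of blocks occupies a phase space volume of 2π, the retained basis size N tracks the number of oscillator levels K below the cutoff, up to boundary effects.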

If the block size is small (i.e., N and K large), then R′ closely resembles R and N ≈ K, as the region volumes are proportional to basis size. In any event, it is the blocks in the boundary of R′ that are the leading cause of inefficiency, since they overlap R only partially. This effect is more pronounced at larger dimensionalities for a given K value, so that in practice, the limiting difficulty of the weylet method is the level of computed accuracy that can be achieved, rather than dimensionality per se. The main purpose of this chapter is to address this limitation by customizing the individual weylet functions (i.e., not just their truncation) for particular applications. Consider that the projected basis $\hat{\rho}_K |\varphi_i\rangle$, rather than $|\varphi_i\rangle$ itself, can in principle result in an exact calculation of the lowest K eigenvalues. In the phase space picture, this projection effectively transforms Fig. 3.1(a) to Fig. 3.1(b), which is seen to yield R′ = R even when N and K are not large. Note that the peripheral basis functions are most affected by the projection transformation. In practice, the above picture is complicated by additional concerns. First, the projected functions are not orthogonal, though this is easily remedied via orthogonalization [52] or direct solution of the generalized eigenvalue problem

\[ \mathbf{H} \mathbf{v} = E\, \mathbf{S} \mathbf{v}, \tag{3.4} \]

where H and S are the Hamiltonian and overlap matrices, respectively, in the nonorthogonal basis representation, and (E, v) is the (eigenvalue, eigenvector) pair. More importantly, the projected functions might in principle be linearly dependent or nearly so, and in fact this is almost certain to cause numerical instabilities if the starting basis is a random basis or even a DPB, at sufficiently large f or N. Use of phase space truncated weylets as the starting basis, $|\varphi_i\rangle$, almost completely alleviates this difficulty, however. Finally, $\hat{\rho}_K$ is not known a priori, though $\hat{\rho}_K^{QC}$ is known, and is likely to constitute a worthy substitute. Fortunately, the mathematical development of $\hat{\rho}_K^{QC}$, or what we call the phase space region operator (PSRO), and its action on arbitrary functions, has been extensively studied by Bracken, Doebner, and Wood [68,69]. With their insights, we have developed very efficient projected basis sets, $\hat{\rho}_K^{QC} |\varphi_i\rangle = |\varphi_i^{(1)}\rangle$ (the (1) superscript will be explained later), which are shown to greatly increase the accuracy levels that can be obtained in weylet calculations.
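The generalized eigenvalue problem of Eq. (3.4) can be reduced to an ordinary symmetric one by a Cholesky factorization of the overlap matrix, which is one standard form of orthogonalization. The sketch below is a generic NumPy illustration, not the method used in this work: the function name is hypothetical, and the test matrices are built from a known diagonal Hamiltonian and an arbitrary nonorthogonal change of basis so that the expected eigenvalues are known in advance.

```python
import numpy as np


def solve_generalized(H, S):
    """Solve H v = E S v for symmetric H and symmetric positive definite S.

    With S = L L^T, the substitution y = L^T v gives the ordinary symmetric
    problem (L^-1 H L^-T) y = E y.
    """
    L = np.linalg.cholesky(S)
    L_inv = np.linalg.inv(L)
    H_orth = L_inv @ H @ L_inv.T
    evals, y = np.linalg.eigh(H_orth)
    vecs = L_inv.T @ y  # back-transform to the nonorthogonal representation
    return evals, vecs


# Toy example: harmonic oscillator levels n + 1/2 expressed in a
# nonorthogonal basis obtained from an arbitrary invertible matrix A.
rng = np.random.default_rng(0)
n = 6
H0 = np.diag(np.arange(n) + 0.5)
A = np.eye(n) + 0.3 * rng.standard_normal((n, n))
H = A.T @ H0 @ A  # Hamiltonian matrix in the nonorthogonal basis
S = A.T @ A       # overlap matrix in the nonorthogonal basis

evals, vecs = solve_generalized(H, S)
print(evals)
```

If the basis is nearly linearly dependent, S becomes ill-conditioned and the Cholesky step loses accuracy or fails, which is the numerical instability referred to in the text.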

A second idea explained in this chapter involves the use of momentum-symmetrized Gaussians (SGs), $|\Phi_i\rangle$, rather than weylets, $|\varphi_i\rangle$, as the initial basis. SGs centered on the same lattice sites as their weylet counterparts span nearly the same subspace, and individual SGs are nearly orthogonal due to the momentum symmetrization [35–37,41]. Moreover, the PSRO projected subspaces of the SGs and weylets are much closer still, as compared to the unprojected case. Although $|\varphi_i^{(1)}\rangle$ is still anticipated to be more efficient than $|\Phi_i^{(1)}\rangle$, the latter are far more convenient to work with numerically.

Our results indicate impressive improvements in efficiency for PSRO-modified weylets, $|\varphi_i^{(1)}\rangle$, and SGs, $|\Phi_i^{(1)}\rangle$, on a wide range of model systems and dimensionalities. The most noticeable improvements are in cases where N and K are small, as expected. Moreover, the efficiencies of the two projected basis sets are nearly the same, with SGs actually more efficient than weylets in some cases, making them a competitive basis if one can develop inexpensive techniques for the PSRO modification. We have also found that multiple applications of the PSRO result in further increases in efficiency for both basis sets, up to a point. The rationale here is that higher powers of $\hat{\rho}_K^{QC}$ are more nearly idempotent, and therefore presumably closer to the exact projection operator $\hat{\rho}_K$.

The remainder of this chapter is organized as follows. The next section presents a brief description of the weylets and SGs in the single and general f DOF cases (3.2.1). Also, the theoretical development of the PSRO is discussed (3.2.2) and applied to the special case of the harmonic oscillator (3.2.3). Section 3.3 provides the details of the numerical application of the PSRO to both basis sets. Section 3.4 presents the results, followed by a discussion (3.4.4) of all the data presented, and possible future developments.


3.2 Theoretical Background

3.2.1 Weylets and Momentum-Symmetrized Gaussians (SGs)

A complete analysis of the construction of the weylets and SGs is documented in Refs. [35]–[37] and Ref. [102], respectively; thus, only the mathematical form of the basis functions will be presented in this chapter, along with a brief description. First, the SGs for the single DOF case in ħ = 1 units (as will be presumed throughout this chapter) have the form:

\[ \Phi_{st}(q) = \left(\frac{4a^2}{\pi}\right)^{1/4} \cos\left\{ t\sqrt{\pi}\,a \left[ q - (s + 1/2)\sqrt{\pi}/a \right] \right\} e^{-a^2 \left(q - s\sqrt{\pi}/a\right)^2/2}. \tag{3.5} \]

The previous index i is replaced with the two indices, s and t, signifying lattice sites $(q_s, p_t)$ where the unsymmetrized pair of phase space Gaussians are centered. The lattice sites are specified by $q_s = s\sqrt{\pi}/a$ and $p_t = t\sqrt{\pi}\,a$, with the parameter a related to the aspect ratio of the lattice. Momentum symmetrization requires t to be restricted to positive half integers, i.e., t = {0.5, 1.5, 2.5, . . .}, whereas s ranges over the integers shifted by an arbitrary real offset. The 1 DOF weylet functions can themselves be expanded into SGs,

\[ \varphi_{st}(q) = \sum_{m,n=-m_{\max}}^{m_{\max}} (-1)^{\left(\frac{n}{2} + mt\right)} c_{mn}\, \Phi_{s+m,\,t+n}(q), \tag{3.6} \]

where m and n are even integers and $c_{mn}$ are coefficients listed in Refs. [35] and [36]. The $c_{mn}$ decay exponentially with respect to |m| + |n|; thus, in practice, one can apply a correlated diamond summation, $|m| + |n| \le m_{\max}$ [35,36]. In this chapter, the bound on the summation is chosen to be $m_{\max} \le 6$, outside of which all $c_{mn}$ have magnitudes less than 10⁻⁶. For the general f DOF case, the SGs and weylets are products of the 1 DOF functions:

\[ \Phi_{\mathbf{s},\mathbf{t}}(\mathbf{q}) = \prod_{j=1}^{f} \Phi_{s_j t_j}(q_j) \tag{3.7} \]

\[ \varphi_{\mathbf{s},\mathbf{t}}(\mathbf{q}) = \prod_{j=1}^{f} \varphi_{s_j t_j}(q_j) \tag{3.8} \]

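where $\mathbf{q} = (q_1, q_2, \ldots, q_f)$, $\mathbf{s} = (s_1, s_2, \ldots, s_f)$, and $\mathbf{t} = (t_1, t_2, \ldots, t_f)$. Each weylet of Eq. (3.8) is approximately represented by groups of $2^f$ blocks, with centers at $(q_{s_1}, \pm p_{t_1}, q_{s_2}, \pm p_{t_2}, \ldots, q_{s_f}, \pm p_{t_f})$, each of volume $\pi^f$ [see Fig. 3.1(a)]. Thus, the set of $2^f N$ blocks, $R'$, has a total volume of $N(2\pi)^f$, and similarly, the set of $K$ target eigenstates, $R$, has a total volume of $K(2\pi)^f$. The SGs of Eq. (3.7) follow the same design, except that individual functions correspond to phase space spheres, rather than blocks. The spheres overlap slightly, reflecting nonorthogonality of the SG basis, and also leading to a somewhat lower efficiency of the SGs compared to the weylets.

3.2.2 Phase Space Region Operator (PSRO)

In the $f$ DOF case, the action of an arbitrary smooth operator $\hat{A}$ on the basis function $|\Phi_{\mathbf{s},\mathbf{t}}\rangle$ is

$$(\hat{A}\Phi_{\mathbf{s},\mathbf{t}})(\mathbf{q}) = \int \langle \mathbf{q}|\hat{A}|\mathbf{q}'\rangle \langle \mathbf{q}'|\Phi_{\mathbf{s},\mathbf{t}}\rangle\, d^f q' \qquad (3.9)$$

where $\Phi_{\mathbf{s},\mathbf{t}}$ represents either the weylets, $\phi_{\mathbf{s},\mathbf{t}}$, or SGs, $g_{\mathbf{s},\mathbf{t}}$. The term $\langle \mathbf{q}|\hat{A}|\mathbf{q}'\rangle$ in Eq. (3.9) is known as the configuration kernel of $\hat{A}$, and can be represented by the expression:

$$\langle \mathbf{q}|\hat{A}|\mathbf{q}'\rangle = \frac{1}{(2\pi)^f} \int A\left[(\mathbf{q}+\mathbf{q}')/2, \mathbf{p}\right] e^{i\mathbf{p}\cdot(\mathbf{q}-\mathbf{q}')}\, d^f p \qquad (3.10)$$

where $A$ is the result of the WW mapping of $\hat{A}$ [32,33,97,98]. The observable of interest is the PSRO, i.e.,

$$A(\mathbf{q},\mathbf{p}) = \rho^{QC}_K(\mathbf{q},\mathbf{p}) = \Theta\left[E_{\max} - H(\mathbf{q},\mathbf{p})\right] = \Theta\Big[E_{\max} - \sum_{j=1}^{f} p_j^2/(2m_j) - V(\mathbf{q})\Big] \qquad (3.11)$$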

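for a system with a quantum Hamiltonian in the kinetic-plus-potential form ($\hat{H} = \hat{T} + \hat{V}$) in Cartesian coordinates. Using Eqs. (3.9) and (3.10), the PSRO-modified basis functions become

$$(\hat{\rho}^{QC}_K \Phi_{\mathbf{s},\mathbf{t}})(\mathbf{q}) = \frac{1}{(2\pi)^f} \iint \rho^{QC}_K\left[(\mathbf{q}+\mathbf{q}')/2, \mathbf{p}\right] e^{i\mathbf{p}\cdot(\mathbf{q}-\mathbf{q}')} \langle \mathbf{q}'|\Phi_{\mathbf{s},\mathbf{t}}\rangle\, d^f p\, d^f q' . \qquad (3.12)$$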

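Further simplification is possible for $f = 1$ [68,69], as shown below. First, Eq. (3.11) can be rewritten as

$$\rho^{QC}_K(q,p) = \Theta\left[E_{\max} - p^2/(2m) - V(q)\right] = \begin{cases} 1 & x_{\min} \le q \le x_{\max} \ \text{and}\ -p_{\max}(q) \le p \le p_{\max}(q) \\ 0 & \text{otherwise} \end{cases} \qquad (3.13)$$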

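where $p_{\max}(q) = \sqrt{2m\left[E_{\max} - V(q)\right]}$. The parameters $x_{\min}$ and $x_{\max}$ define the boundaries where $p_{\max}(q)$ is real and $p_{\max}(x_{\min}) = p_{\max}(x_{\max}) = 0$. At $E_{\max}$ equal to the dissociation energy of the bound system, one or both of the parameters extend to infinity, and, in practice, finite bounds need to be chosen when using $\hat{\rho}^{QC}_K$ in computations. Plugging Eq. (3.13) into (3.10), the 1 DOF configuration kernel of the PSRO can be reduced to:

$$\langle q|\hat{\rho}^{QC}_K|q'\rangle = \begin{cases} \dfrac{1}{2\pi}\displaystyle\int_{-p_{\max}[(q+q')/2]}^{p_{\max}[(q+q')/2]} e^{ip(q-q')}\, dp & x_{\min} \le \dfrac{q+q'}{2} \le x_{\max} \\ 0 & \text{otherwise} \end{cases}$$

$$\phantom{\langle q|\hat{\rho}^{QC}_K|q'\rangle} = \begin{cases} \dfrac{\sin\left[(q-q')\, p_{\max}\!\left(\frac{q+q'}{2}\right)\right]}{\pi (q-q')} & 2x_{\min} \le (q+q') \le 2x_{\max} \\ 0 & \text{otherwise} \end{cases} \qquad (3.14)$$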

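Finally, placing Eq. (3.14) into Eq. (3.9), the 1 DOF version of Eq. (3.12) simplifies to

$$(\hat{\rho}^{QC}_K \Phi_{st})(q) = \int_{2x_{\min}-q}^{2x_{\max}-q} \frac{\sin\left[(q-q')\, p_{\max}\!\left(\frac{q+q'}{2}\right)\right]}{\pi (q-q')}\, \Phi_{st}(q')\, dq' . \qquad (3.15)$$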
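The integral in Eq. (3.15) can be sketched numerically as follows. This is a minimal illustration only, not the Mathematica implementation used in this work: the function names, the trapezoid rule, and the grid spacing are assumptions made for the example. It applies the kernel to the ground state of the unit-mass, unit-frequency HO, for which the PSRO eigenvalue $1 - e^{-2E_{\max}}$ follows from Eq. (3.24) with $f = 1$.

```python
import math

def p_max_ho(q, e_max):
    """p_max(q) = sqrt(2 [E_max - V(q)]) for V(q) = q^2 / 2 (m = 1); zero outside the allowed region."""
    arg = 2.0 * (e_max - 0.5 * q * q)
    return math.sqrt(arg) if arg > 0.0 else 0.0

def apply_psro_1dof(func, q, p_max, x_min, x_max, h=0.01):
    """Trapezoid-rule estimate of Eq. (3.15) at a single point q (hypothetical helper)."""
    lo, hi = 2.0 * x_min - q, 2.0 * x_max - q
    n = max(2, int(round((hi - lo) / h)))
    step = (hi - lo) / n
    total = 0.0
    for k in range(n + 1):
        qp = lo + k * step
        d = q - qp
        pm = p_max(0.5 * (q + qp))
        # The kernel sin(d * pm) / (pi * d) tends to pm / pi as d -> 0.
        kern = pm / math.pi if abs(d) < 1e-12 else math.sin(d * pm) / (math.pi * d)
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * kern * func(qp)
    return total * step

# Example: HO ground state with E_max = 3 a.u.
e_max = 3.0
x_max = math.sqrt(2.0 * e_max)
psi0 = lambda q: math.pi ** -0.25 * math.exp(-0.5 * q * q)
ratio = apply_psro_1dof(psi0, 0.0, lambda q: p_max_ho(q, e_max), -x_max, x_max) / psi0(0.0)
```

Because the HO ground state is an eigenfunction of the HO PSRO, `ratio` should be close to $1 - e^{-2E_{\max}}$ at any point $q$; for a Morse potential only `p_max`, `x_min`, and `x_max` would change.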

3.2.3 PSRO for the Harmonic Oscillator (HO)

Consider the multidimensional isotropic harmonic oscillator (HO), where the masses and frequencies are all equal to unity in atomic units, i.e., $m_j = \omega_j = 1$ a.u. for $j = 1, \ldots, f$, so that $H(\hat{\mathbf{q}}, \hat{\mathbf{p}}) = (1/2)\sum_{j=1}^{f} (\hat{p}_j^2 + \hat{q}_j^2)$. The spherical symmetry of this system renders an exact analytical solution of Eq. (3.12) possible. The QC phase space region $R = \{(\mathbf{q},\mathbf{p}) \,|\, 0 \le \sum_{j=1}^{f} (p_j^2 + q_j^2) \le 2E_{\max}\}$ [where $\rho^{QC}_K(\mathbf{q},\mathbf{p}) = 1$] is a $2f$-dimensional hypersphere centered at the phase space origin. The operator $\hat{\rho}^{QC}_K$ can always be written in the form

$$\hat{\rho}^{QC}_K = \sum_i w_i\, |\psi_i\rangle\langle\psi_i| \qquad (3.16)$$

where $w_i$ are the eigenvalues of $\hat{\rho}^{QC}_K$ and $|\psi_i\rangle$ the corresponding eigenfunctions. We then have

$$(\hat{\rho}^{QC}_K \Phi_{\mathbf{s},\mathbf{t}})(\mathbf{q}) = \sum_i w_i\, \langle\psi_i|\Phi_{\mathbf{s},\mathbf{t}}\rangle\, \psi_i(\mathbf{q}) . \qquad (3.17)$$

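For the isotropic HO system, the $w_i$'s and corresponding $|\psi_i\rangle$'s are known analytically, as are the overlaps $\langle\psi_i|\Phi_{\mathbf{s},\mathbf{t}}\rangle$. As shown in Refs. [68] and [69], the eigenfunctions of $\hat{\rho}^{QC}_K$ are also those of $\hat{\rho}_K$, i.e., the HO eigenstates (see Appendix B),

$$|\psi_i\rangle = |\mathbf{n}\rangle = |n_1\rangle |n_2\rangle \cdots |n_f\rangle \qquad (3.18)$$

where $n_j$ is a nonnegative integer representing the quantum excitation of the $j$th DOF of the HO. The eigenvalues can be determined by

$$w_{\mathbf{n}}(E_{\max}) = \langle \mathbf{n}|\hat{\rho}^{QC}_K|\mathbf{n}\rangle . \qquad (3.19)$$

Using the WW formalism, Eq. (3.19) becomes

$$w_{\mathbf{n}}(E_{\max}) = \int \rho^{QC}_K(\mathbf{q},\mathbf{p})\, W_{\mathbf{n}}(\mathbf{q},\mathbf{p})\, d^f q\, d^f p = \int_R W_{\mathbf{n}}(\mathbf{q},\mathbf{p})\, d^f q\, d^f p \qquad (3.20)$$

where $W_{\mathbf{n}}(\mathbf{q},\mathbf{p})$ represents the WW phase space representation [32,33,97,98] of the pure state density operator of each HO eigenfunction, i.e., $|\mathbf{n}\rangle\langle\mathbf{n}|$. The analytical expression is

$$W_{\mathbf{n}}(\mathbf{q},\mathbf{p}) = \prod_{j=1}^{f} W_{n_j}(q_j, p_j) \qquad (3.21)$$

where

$$W_{n_j}(q_j, p_j) = \frac{(-1)^{n_j}}{\pi}\, L_{n_j}\!\left[2(q_j^2 + p_j^2)\right] e^{-(q_j^2 + p_j^2)} \qquad (3.22)$$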

and $L_{n_j}$ is a Laguerre polynomial of degree $n_j$. The above equations imply that the eigenvalues $w_{\mathbf{n}}(E_{\max})$ depend only upon $n_S = \sum_{j=1}^{f} n_j$ instead of $\mathbf{n}$ itself, i.e., $w_{\mathbf{n}}(E_{\max}) = w_{n_S}(E_{\max})$, which is proven in Refs. [68] and [69] (see Appendix C). Upon integration of Eq. (3.20), a recurrence relationship can be derived [69]:

$$w_{n_S+1}(E_{\max}) - w_{n_S}(E_{\max}) = \frac{(-1)^{n_S+1}\, n_S!}{2^{f-1}\,(n_S+f)!}\; e^{-2E_{\max}}\, (4E_{\max})^f\, L^{f}_{n_S}(4E_{\max}) \qquad (3.23)$$

and

$$w_0(E_{\max}) = \frac{1}{(f-1)!} \int_0^{2E_{\max}} t^{f-1} e^{-t}\, dt = P(f, 2E_{\max}) \qquad (3.24)$$

where $L^{f}_{n_S}$ is the associated Laguerre polynomial, and $P(f, 2E_{\max})$ is the incomplete gamma function. The closed form of Eq. (3.23) is

$$w_{n_S}(E_{\max}) = P(f, 2E_{\max}) + \sum_{k=0}^{n_S-1} \frac{(-1)^{k+1}\, k!}{2^{f-1}\,(k+f)!}\; e^{-2E_{\max}}\, (4E_{\max})^f\, L^{f}_{k}(4E_{\max}) \qquad (3.25)$$

for $n_S > 0$. As shown previously, the radius of the hyperspherical region $R$ is $\sqrt{2E_{\max}}$. Given that $R$ has both a volume $V_{2f} = K(2\pi)^f$ and

$$V_{2f} = \frac{(2\pi E_{\max})^f}{f!} , \qquad (3.26)$$

one can derive a useful direct relation between $E_{\max}$ and $K$:

$$E_{\max} = (K f!)^{1/f} \qquad (3.27)$$

so that in practice one can specify $K$, which is more convenient than working with $E_{\max}$ directly. Finally, the PSRO used in the calculation of the $K$ HO eigenvalues and eigenfunctions in the decomposed form of Eq. (3.16) is

$$\hat{\rho}^{QC}_K = \sum_{n_S \le n_{\max}} w_{n_S}(E_{\max})\, |\mathbf{n}\rangle\langle\mathbf{n}| \qquad (3.28)$$
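The recurrence of Eqs. (3.23) and (3.24), together with Eq. (3.27), can be sketched as below. This is an illustrative sketch only: the function names are hypothetical, $f$ is assumed to be a positive integer so that $P(f, x)$ reduces to a finite sum, and the associated Laguerre polynomials are generated with their standard three-term recurrence.

```python
import math

def e_max_from_k(k, f):
    """Eq. (3.27): E_max = (K f!)^(1/f)."""
    return (k * math.factorial(f)) ** (1.0 / f)

def reg_lower_gamma(f, x):
    """Regularized incomplete gamma function P(f, x) for a positive integer f."""
    term, total = 1.0, 1.0
    for j in range(1, f):
        term *= x / j
        total += term
    return 1.0 - math.exp(-x) * total

def assoc_laguerre(k, alpha, x):
    """Associated Laguerre polynomial L_k^alpha(x) via the three-term recurrence."""
    if k == 0:
        return 1.0
    prev, cur = 1.0, 1.0 + alpha - x
    for j in range(1, k):
        prev, cur = cur, ((2 * j + 1 + alpha - x) * cur - (j + alpha) * prev) / (j + 1)
    return cur

def psro_eigenvalues(f, e_max, n_max):
    """Return [w_0, ..., w_{n_max}] from Eqs. (3.23) and (3.24)."""
    w = [reg_lower_gamma(f, 2.0 * e_max)]
    pref = math.exp(-2.0 * e_max) * (4.0 * e_max) ** f / 2.0 ** (f - 1)
    for n in range(n_max):
        step = ((-1) ** (n + 1) * math.factorial(n) / math.factorial(n + f)
                * pref * assoc_laguerre(n, f, 4.0 * e_max))
        w.append(w[-1] + step)
    return w

# Example: 1 DOF with K = 6 target states, so E_max = 6 a.u.
weights = psro_eigenvalues(1, e_max_from_k(6, 1), 12)
```

For $f = 1$ the first two entries can be checked by hand against $w_0 = 1 - e^{-2E_{\max}}$ and $w_1 = 1 - e^{-2E_{\max}}(1 + 4E_{\max})$.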

The sum in Eq. (3.28) includes all states $|\mathbf{n}\rangle$ that have $n_S \le n_{\max}$. The nonnegative integer parameter $n_{\max}$ is chosen such that a desired level of convergence is achieved in the final calculation.

3.3 Numerical Implementation

3.3.1 Morse Oscillator (1 DOF)

For 1 DOF realistic potentials, we resort to explicit numerical integration of Eq. (3.15). Although the application to large $K$ and $N$ is computationally expensive, the small problem sizes of this study are appropriate, as the focus is to determine whether this method can achieve significant increases in efficiency. Future projects may involve the development of new time- and memory-saving techniques or approximations (similar to Sec. 3.3.2) to enhance application of this PSRO method to larger problems. We choose to examine the Morse oscillator, for which

$$H(\hat{q}, \hat{p}) = \frac{\hat{p}^2}{2} + D\left(e^{-2\alpha\hat{q}} - 2e^{-\alpha\hat{q}}\right) . \qquad (3.29)$$

The parameters $D = 12.000$ and $\alpha = 0.2041241$ are chosen so that $R$ has a volume of $48\pi$ a.u. at dissociation energy $E_{\max} = 0$, thus signifying that there are 24 bound states ($K = 24$). For comparison, the eigenvalues $w_i$ or energy values of the bound states can be analytically determined [103]:

$$w_i = -D + \alpha\sqrt{2D}\left(i - \frac{1}{2}\right) - \frac{\alpha^2}{2}\left(i - \frac{1}{2}\right)^2 \qquad (3.30)$$

where $i = 1, \ldots, 24$. The PSRO-modified weylets and SGs are numerically computed using Mathematica. A set of points for $q$ spaced at 0.02 a.u. increments between boundaries $X^{(1)}_{\min}$ and $X^{(1)}_{\max}$ are used to define each of the $N$ modified basis functions, $\Phi^{(1)}_{st}(q)$, using Eq. (3.15). The boundaries $X^{(1)}_{\min}$ and $X^{(1)}_{\max}$ span slightly beyond the PSRO region because each projected function, $\Phi^{(1)}_{st}(q)$, extends outside the $[x_{\min}, x_{\max}]$ range, although the extension does decay rapidly. In practice, $x_{\min} - X^{(1)}_{\min}$ and $X^{(1)}_{\max} - x_{\max}$ are chosen to be approximately between 1 and 2 a.u., which allows sufficient convergence of the elements of $\mathbf{H}$ and $\mathbf{S}$.

For functions resulting from the application of the PSRO operator $p > 1$ times, $\Phi^{(p)}_{st}(q) = [(\hat{\rho}^{QC}_K)^p \Phi_{st}](q)$, one needs to determine beforehand all appropriate boundaries for all modified basis functions from $\Phi^{(1)}_{st}$ to $\Phi^{(p)}_{st}$. This is done in a reverse fashion by first choosing sufficient boundaries, $X^{(p)}_{\min}$ and $X^{(p)}_{\max}$, for $\Phi^{(p)}_{st}(q)$, i.e., $x_{\min} - X^{(p)}_{\min}$ and $X^{(p)}_{\max} - x_{\max}$ are between 1 and 2 a.u., and then using the following equations for the $1 \le b < p$ limits:

$$X^{(b)}_{\min} = \begin{cases} X^{(p)}_{\min} + (p-b)(x_{\min} - x_{\max}) & (p-b)\ \text{even} \\ -X^{(p)}_{\max} + (p-b)(x_{\min} - x_{\max}) + x_{\max} + x_{\min} & (p-b)\ \text{odd} \end{cases} \qquad (3.31)$$

and

$$X^{(b)}_{\max} = \begin{cases} X^{(p)}_{\max} + (p-b)(x_{\max} - x_{\min}) & (p-b)\ \text{even} \\ -X^{(p)}_{\min} + (p-b)(x_{\max} - x_{\min}) + x_{\max} + x_{\min} & (p-b)\ \text{odd} . \end{cases} \qquad (3.32)$$

One finds that the functions of lower modification need larger boundaries, i.e., $X^{(b)}_{\min} < X^{(b')}_{\min}$ and $X^{(b)}_{\max} > X^{(b')}_{\max}$ for $b < b'$.

3.3.2 Morse/Harmonic Oscillator (2 DOF)

For systems where $f > 1$, the Eq. (3.12) integrations are rather costly, even for $f = 2$, for which four-dimensional integrals are needed. In such cases, however, one may apply a separable PSRO modification to greatly reduce the computational cost. In this section, we apply this idea to the 2 DOF Morse/HO system,

$$H(\tilde{x}, \tilde{y}, p_{\tilde{x}}, p_{\tilde{y}}) = \frac{1}{2}\left(\tilde{x}^2 + p_{\tilde{x}}^2\right) + \frac{p_{\tilde{y}}^2}{2} + D\left(e^{-2\alpha\tilde{y}} - 2e^{-\alpha\tilde{y}} + 1\right) , \qquad (3.33)$$

which becomes coupled via rotation of the coordinates:

$$\tilde{x} = x\cos 10^\circ + y\sin 10^\circ \qquad (3.34)$$

and

$$\tilde{y} = -x\sin 10^\circ + y\cos 10^\circ . \qquad (3.35)$$

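Instead of using the $\rho^{QC}_K(x, y, p_x, p_y)$ PSRO, which is coupled, one applies the separable approximation $\rho^{QC}_{K_x}(x, p_x)\,\rho^{QC}_{K_y}(y, p_y)$, where $K_x K_y \approx K$. Since the $\Phi_{\mathbf{s},\mathbf{t}}$ basis is also separable, one obtains

$$(\hat{\rho}^{QC}_K \Phi_{\mathbf{s},\mathbf{t}})(x,y) \approx \left[\int_{2x_{\min}-x}^{2x_{\max}-x} \frac{\sin\left[(x-x')\, p_{x\max}\!\left(\frac{x+x'}{2}\right)\right]}{\pi(x-x')}\, \Phi_{s_x t_x}(x')\, dx'\right] \left[\int_{2y_{\min}-y}^{2y_{\max}-y} \frac{\sin\left[(y-y')\, p_{y\max}\!\left(\frac{y+y'}{2}\right)\right]}{\pi(y-y')}\, \Phi_{s_y t_y}(y')\, dy'\right] = \Phi^{(1)}_{s_x t_x}(x)\, \Phi^{(1)}_{s_y t_y}(y) \qquad (3.36)$$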

where $p_{x\max}(x) = \sqrt{2\left[E_{\max} - V_x(x)\right]}$ and $p_{y\max}(y) = \sqrt{2\left[E_{\max} - V_y(y)\right]}$. Any of a number of techniques may be used to obtain suitable 1 DOF marginal potentials ($V_x$ and $V_y$), with the primary criterion being that $\rho^{QC}_{K_x}\rho^{QC}_{K_y}$ resemble $\rho^{QC}_K$ as closely as possible. For this chapter, we use the method of Ref. [74]. The marginal potentials, $V_x$ and $V_y$, resulting from the optimization [74] can be found by simply minimizing the original potential with respect to $y$ and $x$, respectively [19,104], i.e.,

$$V_x(x) = \min_{y}\left[V(x,y)\right] \qquad (3.37)$$

and

$$V_y(y) = \min_{x}\left[V(x,y)\right] . \qquad (3.38)$$
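Equations (3.37) and (3.38) can be illustrated with a simple grid minimization. The sketch below is not the procedure of Ref. [74]; the grid range and spacing, the function names, and the use of the Sec. 3.3.1 Morse parameters ($D = 12$, $\alpha = 0.2041241$) for the 2 DOF system are all assumptions made for the example.

```python
import math

D, ALPHA = 12.0, 0.2041241            # assumed: same Morse parameters as Sec. 3.3.1
COS_T = math.cos(math.radians(10.0))
SIN_T = math.sin(math.radians(10.0))

def potential(x, y):
    """V(x, y) of Eq. (3.33) after the rotation of Eqs. (3.34) and (3.35)."""
    xr = x * COS_T + y * SIN_T
    yr = -x * SIN_T + y * COS_T
    return 0.5 * xr * xr + D * (math.exp(-2.0 * ALPHA * yr) - 2.0 * math.exp(-ALPHA * yr) + 1.0)

def marginal_vx(x, lo=-10.0, hi=20.0, n=6000):
    """V_x(x) = min over y of V(x, y), Eq. (3.37), on a uniform grid in y."""
    step = (hi - lo) / n
    return min(potential(x, lo + k * step) for k in range(n + 1))

def marginal_vy(y, lo=-10.0, hi=10.0, n=4000):
    """V_y(y) = min over x of V(x, y), Eq. (3.38), on a uniform grid in x."""
    step = (hi - lo) / n
    return min(potential(lo + k * step, y) for k in range(n + 1))

# The x range of the separable PSRO at E_max is where V_x(x) <= E_max.
inside = marginal_vx(3.17) <= 5.07
outside = marginal_vx(3.19) <= 5.07
```

With $E_{\max} = 5.07$ a.u., the condition $V_x(x) \le E_{\max}$ changes from true to false between $x = 3.17$ and $x = 3.19$ a.u., consistent with the $x_{\max}$ value quoted in Sec. 3.4.2.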

No adjustments need to be made for the above equations, since $\min[V(x,y)] = 0$. The classically allowed region corresponding to the separable PSRO has a cylindrical shape composed of the product of two two-dimensional phase space regions, $R_x \times R_y$. This region contains corners not present in the nonzero region of $\rho^{QC}_K$. Thus, the separable PSRO is different from, and less effective than, $\hat{\rho}^{QC}_K$, in the sense that it fails to smooth out all of the peripheral lattice states due to the wasted space of the corners.

3.3.3 Harmonic Oscillator (HO)

For the HO system, one does not need to bother with the numerical integrations needed for the realistic cases presented in Secs. 3.3.1 and 3.3.2. Instead, an analytical representation of Eq. (3.12), as presented in Sec. 3.2.3, can be used; thus, one can avoid the numerical errors coming from the computationally expensive integrations. Ultimately, we want to represent the HO Hamiltonian in the PSRO-modified SG $|g^{(1)}_{\mathbf{s},\mathbf{t}}\rangle$ or weylet $|\phi^{(1)}_{\mathbf{s},\mathbf{t}}\rangle$ basis. Using Eq. (3.28), the orthonormality property of the HO eigenfunctions, i.e., $\langle \mathbf{n}|\mathbf{n}'\rangle = \delta_{n_1 n_1'}\delta_{n_2 n_2'} \cdots \delta_{n_f n_f'}$, and the eigenvalue relationship $\hat{H}|\mathbf{n}\rangle = (n_S + f/2)|\mathbf{n}\rangle$, the Hamiltonian matrix elements in the modified SG basis are given by

$$H^{(1)}_{\mathbf{s},\mathbf{t},\mathbf{s}',\mathbf{t}'} = \langle g^{(1)}_{\mathbf{s},\mathbf{t}}|\hat{H}|g^{(1)}_{\mathbf{s}',\mathbf{t}'}\rangle = \sum_{n_S \le n_{\max}} \left[w_{n_S}(E_{\max})\right]^2 \left(n_S + \frac{f}{2}\right) \prod_{j=1}^{f} \langle g_{s_j t_j}|n_j\rangle \langle n_j|g_{s_j' t_j'}\rangle . \qquad (3.39)$$

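To solve Eq. (4.2), we also need the overlap matrix

$$S^{(1)}_{\mathbf{s},\mathbf{t},\mathbf{s}',\mathbf{t}'} = \langle g^{(1)}_{\mathbf{s},\mathbf{t}}|g^{(1)}_{\mathbf{s}',\mathbf{t}'}\rangle = \sum_{n_S \le n_{\max}} \left[w_{n_S}(E_{\max})\right]^2 \prod_{j=1}^{f} \langle g_{s_j t_j}|n_j\rangle \langle n_j|g_{s_j' t_j'}\rangle . \qquad (3.40)$$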

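Similar equations apply for the weylet basis. The overlaps are given explicitly as follows:

$$\langle g_{st}|n\rangle = \frac{\pi^{n/2}}{\sqrt{2^{n-1}\, n!}}\; e^{-\frac{\pi}{4}(s^2+t^2)}\; \mathrm{Re}\left[(s+it)^n\, e^{-i(\pi/2)t(s+1)}\right] \qquad (3.41)$$

and

$$\langle \phi_{st}|n\rangle = \sum_{m,n'=-m_{\max}}^{m_{\max}} (-1)^{(n'/2 + mt)}\, c_{mn'}\, \langle g_{s+m,t+n'}|n\rangle \qquad (3.42)$$

where the summation indices $m$ and $n'$ are those of Eq. (3.6).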

In the above expressions, we have chosen the parameters $a = 1$ a.u. and $\Delta = 1/2$. The generalization for $|\Phi^{(p)}_{\mathbf{s},\mathbf{t}}\rangle$ is obtained by replacing $[w_{n_S}(E_{\max})]^2$ with $[w_{n_S}(E_{\max})]^{2p}$ in Eqs. (3.39) and (3.40).

3.4 Results and Discussion

3.4.1 Results for Morse Oscillator (1 DOF)

For the 1 DOF Morse oscillator system, we optimized our calculations for the lowest $K = 6$ eigenstates. These are sufficiently far below dissociation that the QC eigenstate region $R$ ($E_{\max} = -6.7500$ a.u.) is convex, suitable for the SGs and weylets, yet the last few eigenstates are high enough to clearly exhibit anharmonic behavior (note the shape of $R$ in Fig. 3.3). The $N = 9$ basis functions are chosen to sufficiently cover $R$ and are each modified by the PSRO $\hat{\rho}^{QC}_6$. In practice, a suitable basis size $N$ depends strongly on $K$. If $N$ is insufficiently larger than $K$, then the PSRO is ineffective at computing all $K$ desired states to sufficient accuracy. On the other hand, if $N$ is too large, then the overlap matrix $\mathbf{S}$ is ill-conditioned (eigenvalues of $\mathbf{S}$ are too small), preventing generalized eigenvalue routines from solving Eq. (4.2).
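The Morse quantities quoted above can be cross-checked with a short sketch based on Eqs. (3.29) and (3.30). The function names are hypothetical, and the closed-form expressions for the turning points and the enclosed phase space area are standard results for the Morse potential that are assumed here rather than taken from this chapter.

```python
import math

D, ALPHA = 12.0, 0.2041241

def morse_levels(d, alpha):
    """Bound-state energies of Eq. (3.30), i = 1, 2, ..."""
    n_bound = int(math.floor(math.sqrt(2.0 * d) / alpha + 0.5))
    omega = alpha * math.sqrt(2.0 * d)
    return [-d + omega * (i - 0.5) - 0.5 * alpha ** 2 * (i - 0.5) ** 2
            for i in range(1, n_bound + 1)]

def turning_points(e, d, alpha):
    """Classical turning points (x_min, x_max) where V(q) = e, for -d < e < 0."""
    root = math.sqrt(1.0 + e / d)
    return -math.log(1.0 + root) / alpha, -math.log(1.0 - root) / alpha

def states_enclosed(e, d, alpha):
    """Phase space area enclosed at energy e, in units of 2*pi (one unit per state)."""
    return math.sqrt(2.0 * d) / alpha * (1.0 - math.sqrt(-e / d))

levels = morse_levels(D, ALPHA)                  # 24 bound states
x_min, x_max = turning_points(-6.75, D, ALPHA)   # PSRO bounds for K = 6
k_target = states_enclosed(-6.75, D, ALPHA)      # close to 6
```

Under these assumptions the sketch gives 24 bound levels, turning points near $-2.4871$ and $5.3058$ a.u., and an enclosed area of about $12\pi$ a.u. at $E_{\max} = -6.75$ a.u., with the sixth and seventh levels lying on either side of $E_{\max}$.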

The weylet basis functions were chosen as in Fig. 3.3. The selection is very similar to what would be obtained via the phase space truncation criterion used in Secs. 3.4.2 and 3.4.3. We chose $a = 1.8282$ a.u. such that the heights of the lattice cells equal the maximum extent of $R$ in the positive and negative momentum directions. Also, the position (horizontal) shift of the rectangles is adjusted ($\Delta = -0.0273$ a.u.) so that the right edge of the block furthest along the position axis (in the positive direction) corresponds to the right boundary $x_{\max} = 5.3058$ a.u. of $R$. The left boundary $x_{\min} = -2.4871$ a.u. of $R$ extends slightly further than one basis function; thus, we added a ninth basis function on the negative side for sufficient coverage.

We considered both the SGs and weylets as basis sets, as well as their PSRO-modified versions, up to $p = 3$. The triple modified functions, $\Phi^{(3)}_{st}(q)$, lie between chosen boundaries of $X^{(3)}_{\min} = -4$ and $X^{(3)}_{\max} = 7$ a.u., sufficiently outside of the PSRO bounds, $x_{\min}$ and $x_{\max}$. By Eqs. (3.31) and (3.32), the bounds of the $b < p$ functions are $X^{(2)}_{\min} = -11.9743$, $X^{(1)}_{\min} = -19.5859$, $X^{(2)}_{\max} = 14.6116$, and $X^{(1)}_{\max} = 22.5859$ a.u.
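The boundary values just quoted follow directly from Eqs. (3.31) and (3.32). A minimal sketch (the function name is hypothetical) that evaluates both equations for a given $b$:

```python
def outer_bounds(b, p, big_x_min_p, big_x_max_p, x_min, x_max):
    """Return (X_min^(b), X_max^(b)) from Eqs. (3.31) and (3.32) for 1 <= b < p."""
    k = p - b
    width = x_max - x_min
    if k % 2 == 0:
        lower = big_x_min_p - k * width
        upper = big_x_max_p + k * width
    else:
        lower = -big_x_max_p - k * width + x_max + x_min
        upper = -big_x_min_p + k * width + x_max + x_min
    return lower, upper

# Morse PSRO bounds and the p = 3 boundaries of Sec. 3.4.1
X_MIN, X_MAX = -2.4871, 5.3058
bounds_b2 = outer_bounds(2, 3, -4.0, 7.0, X_MIN, X_MAX)
bounds_b1 = outer_bounds(1, 3, -4.0, 7.0, X_MIN, X_MAX)
```

Each step down in $b$ widens the interval because the integration limits of Eq. (3.15), $2x_{\min} - q$ and $2x_{\max} - q$, reflect the required range about the PSRO bounds.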

We report in Table 3.1 the absolute errors between the calculated and analytical values for all basis set types except for the $p = 3$ case, since these show no improvement over the $p = 2$ case. In both the weylet and SG case, the greatest reduction in the absolute error from the unmodified to the modified case occurs in the lowest eigenstates. For example, a near-four-order-of-magnitude improvement in accuracy is shown for the lowest two eigenvalues of the $p = 2$ weylets, $|\phi^{(2)}_{st}\rangle$, and SGs, $|g^{(2)}_{st}\rangle$, relative to the corresponding unmodified basis sets, $|\phi_{st}\rangle$ and $|g_{st}\rangle$. The $|\phi^{(1)}_{st}\rangle$ and $|g^{(1)}_{st}\rangle$ sets show accuracy improvements ranging from 1 to 3 orders of magnitude for all 6 targeted eigenvalues. In general, both the SGs and weylets follow the same pattern vis-à-vis the PSRO modification, and exhibit comparable accuracies for the same $p$ value, even in the unmodified case, $p = 0$. This advocates strongly in favor of using SG basis functions for practical applications. The main conclusion, though, is that PSRO modification is extremely effective for either basis set, with the largest relative error for the targeted $K = 6$ energies being only around $3 \times 10^{-5}$ for $p = 2$. Table 3.1 also indicates errors for the remaining $N - K = 3$ eigenvalues, which show only small improvements and sometimes loss of accuracy by the PSRO modification, thus demonstrating the ability of the method to single out just the desired $K$ eigenvalues from the others.

3.4.2 Results for Morse/Harmonic Oscillator (2 DOF)

We developed a separable PSRO that corresponds to a product region, $R_x \times R_y$, where both components are projected regions onto the $(x, p_x)$ and $(y, p_y)$ axes [74], respectively, of $R$, the classically allowed region of the lowest $K = 14$ eigenstates ($E_{\max} = 5.0700$ a.u.) of the 2 DOF Morse/HO system. We applied the PSRO to each of the SG basis functions ($N = 49$) selected by the basis truncation criterion $H(\mathbf{q}_s, \mathbf{p}_t) \le E_{cut} = 9.0000$ a.u. [35-37]. In parts (a) and (b) of Fig. 3.4, the inner curves (solid line) denote $R_x$ and $R_y$, respectively, with boundaries of $x_{\min} = -3.1911$ and $x_{\max} = 3.1797$ a.u. for the former and $y_{\min} = -2.4942$ and $y_{\max} = 5.0820$ a.u. for the latter. The projections of $R'$, the classically allowed region of the SG basis set (actually, more reflective of the weylets), are highlighted by the dotted lines and, in a fashion similar to PSRO regions, are labeled as $R'_x$ and $R'_y$. The parameters of the SG basis set were chosen to produce optimal computed eigenvalues for the unmodified basis set and are listed as follows: $a_x = a_y = 0.7979$, $\Delta_x = 0.4064$, and $\Delta_y = 0.0926$ a.u.

The analytical eigenvalues of the Morse/harmonic oscillator (column 2 of Table 3.2) are equal to the sum of the Morse [Eq. (3.30)] and 1 DOF harmonic eigenvalues, exactly like the uncoupled case [Eq. (3.33)], since the coupling is only due to a rotation on the $(x, y)$ plane. Improvements up to two orders of magnitude are shown for the modified versus unmodified case. Note that there is no clear separation between the first 14 eigenvalues and the remaining 6 reported in Table 3.2, as there is in Table 3.1, although the accuracy improvements diminish rapidly beyond the $K = 14$ cutoff. This behavior is due to the artificial separability used in the 2 DOF PSRO. The method is nevertheless quite effective at improving the accuracy of the desired eigenvalues.

3.4.3 Results for Harmonic Oscillator (HO)

The isotropic HO up to 4 DOFs was solved using 6 different basis sets: $|\phi_{\mathbf{s},\mathbf{t}}\rangle$, $|\phi^{(1)}_{\mathbf{s},\mathbf{t}}\rangle$, $|\phi^{(2)}_{\mathbf{s},\mathbf{t}}\rangle$, $|g_{\mathbf{s},\mathbf{t}}\rangle$, $|g^{(1)}_{\mathbf{s},\mathbf{t}}\rangle$, and $|g^{(2)}_{\mathbf{s},\mathbf{t}}\rangle$ (Tables 3.3-3.6). In all cases, the basis truncation criterion $H(\mathbf{q}_s, \mathbf{p}_t) \le E_{cut}$ [35-37] was used to obtain a representational basis of size $N$. We also chose $K = N$, i.e., we set the volume of the PSRO region $R$ to equal that of the basis region $R'$ [Fig. 3.1(a)].

In Fig. 3.2, we can see, pictorially, how the PSRO affects the weylets in the 1 DOF case. Part (a) shows the WW phase space representation of the projection operator containing the lowest 6 HO eigenstates. Part (b) gives the same setup for the 6 weylets chosen as a basis to solve the HO Hamiltonian [the QC region of (b) is the collective region of blocks in Fig. 3.1(a)], and part (c) represents the same weylet set under a single PSRO modification. The modified weylets show concentric ring-like features much like the HO eigenstates, resulting in a considerable improvement in efficiency over the unmodified set (Table 3.3), although one can still observe residual rectilinear weylet features in the slight box-like cornering of the rings.

The analytical eigenvalues of the isotropic HO are $(n_S + f/2)$, with degeneracy

$$\deg(n_S, f) = \binom{n_S + f - 1}{f - 1} \qquad (3.43)$$

for $f > 1$ (the eigenvalues are of course nondegenerate for $f = 1$). They are compared to the actual calculated eigenvalues using the different basis sets, and Tables 3.3-3.7 report how many of the computed values fall within relative accuracies of $2 \times 10^{-2}$, $2 \times 10^{-3}$, $2 \times 10^{-4}$, and so on [which correspond to error tolerances of (0.2)f, (0.02)f, (0.002)f, . . . , respectively [37]]. Table 3.7 stands apart from the other HO tables in that it reports the effects on basis set efficiency as one increases $p$, the power of the PSRO operator.
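The basis truncation criterion $H(\mathbf{q}_s, \mathbf{p}_t) \le E_{cut}$ can be sketched for the isotropic HO as follows, using the lattice of Sec. 3.2.1 with $q_s = \sqrt{\pi}\,(s/a)$ and $p_t = \sqrt{\pi}\,t a$. The function names and the recursive counting strategy are assumptions made for the example; the defaults $a = 1$ and $\Delta = 1/2$ are the HO parameters stated in Sec. 3.3.3.

```python
import math

def site_energies(e_cut, a=1.0, delta=0.5):
    """Classical HO energies (q_s^2 + p_t^2) / 2 of the 1 DOF lattice sites with energy <= e_cut."""
    reach = math.sqrt(2.0 * e_cut / math.pi)
    s_lim = int(reach * a) + 2
    t_lim = int(reach / a) + 2
    energies = []
    for k in range(-s_lim, s_lim + 1):
        s = delta + k                      # s = ..., delta - 1, delta, delta + 1, ...
        for j in range(t_lim + 1):
            t = j + 0.5                    # t = 0.5, 1.5, 2.5, ...
            energy = 0.5 * math.pi * ((s / a) ** 2 + (t * a) ** 2)
            if energy <= e_cut:
                energies.append(energy)
    return sorted(energies)

def count_basis(f, e_cut, a=1.0, delta=0.5):
    """Number N of f DOF product lattice sites whose total classical energy is <= e_cut."""
    energies = site_energies(e_cut, a, delta)

    def count(dof_left, budget):
        if dof_left == 0:
            return 1
        total = 0
        for energy in energies:
            if energy > budget:
                break
            total += count(dof_left - 1, budget - energy)
        return total

    return count(f, e_cut)

n_1dof = count_basis(1, 4.0)    # 1 DOF, E_cut = 4.0 a.u.
n_2dof = count_basis(2, 20.0)   # 2 DOF, E_cut = 20.0 a.u.
```

Under these assumptions the counts agree with the smallest basis sizes listed in Tables 3.3-3.6, e.g., $N = 6$ for 1 DOF at $E_{cut} = 4.0$ a.u. and $N = 176$ for 2 DOF at $E_{cut} = 20.0$ a.u.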

For this analysis, we define the efficiency to be $L/N$, where $L$ is the number of eigenvalues actually computed to a certain relative accuracy, as opposed to the number of targeted eigenstates, $K$, used to define the PSRO ($K = N$ for the HO case). This efficiency for $f = 2$ at $2 \times 10^{-4}$ relative accuracy is plotted against $N$ in Fig. 3.5, which demonstrates similar efficiency curves (apart from a shift) for all six basis types. Although $|\phi^{(p)}_{\mathbf{s},\mathbf{t}}\rangle$ is now substantially more efficient than $|g^{(p)}_{\mathbf{s},\mathbf{t}}\rangle$, the effect is still no greater than that of the PSRO modification itself, in that $|\phi_{\mathbf{s},\mathbf{t}}\rangle$ and $|g^{(1)}_{\mathbf{s},\mathbf{t}}\rangle$ are comparable. Note the large increase in efficiency from $N = 176$ to 2 340; this efficiency cliff [37] shifts to the right with increasing dimensionality, and is the leading cause of accuracy restrictions on weylet calculations at higher dimensionalities. The fact that the PSRO improvements are comparable on and off the cliff is thus highly encouraging.

Fig. 3.6 clearly shows that the efficiencies ($2 \times 10^{-2}$ accuracy) of all basis types do not decay exponentially as the DOFs increase ($N \approx 10\,000$ is held constant). The results of the weylets and the SGs are very similar in these plots: the weylets are slightly better than the SGs in the unmodified case, and for both the single and double PSRO-modified sets, the results are essentially identical (modified weylets are not shown). These similarities are only true at the $2 \times 10^{-2}$ accuracy, whereas higher accuracies display significant improvement of the weylets over SGs, as shown in the previous analysis. The most important result is that the efficiency of the modified functions decreases significantly slower than the unmodified with increasing $f$; thus, the benefits of PSRO modifications are expected to be substantially greater at larger dimensionality.

In general, the PSRO projection operator significantly improves the efficiency of both the SGs and weylets, as demonstrated by all tables and plots. Also, the modifications produce bases that allow for the calculation of a large number of eigenvalues at extremely high accuracies, otherwise not attainable. For example, in the 2 DOF case (Table 3.4), there are $L = 757$ ($N = 20\,864$) eigenvalues that fall within the relative accuracy of $2 \times 10^{-12}$ using the $|\phi^{(2)}_{\mathbf{s},\mathbf{t}}\rangle$ basis. Even the less efficient basis $|g^{(2)}_{\mathbf{s},\mathbf{t}}\rangle$ is impressive at $L = 305$ under similar conditions. A particularly surprising result is to be found in Table 3.7, for which over 1 000 eigenvalues are computed to $2 \times 10^{-7}$ accuracy, using the $|\phi^{(11)}_{\mathbf{s},\mathbf{t}}\rangle$ basis of only $N = 2\,340$ functions. Thus, the PSRO to the 11th power is very close to the actual projection operator $\hat{\rho}_K$.

3.4.4 Discussion

In conclusion, the PSRO modification of the weylets and SGs produces new nonorthogonal bases with efficiencies far better than the unmodified sets, which is especially useful in cases where the latter are limited in accuracy, or when $K$ and $N$ are small. For realistic 1 DOF systems, one can successfully apply the PSRO to either the weylet or SG basis via numerical integration to gain large improvements in accuracy, which, for the most part, is not computationally expensive. However, the same computations do become a serious problem at larger dimensionalities. The favorable symmetry of the model HO system allowed the analyses of higher DOF calculations, which would otherwise not be possible, showing that efficiency and accuracy improvements get even better as one increases $f$. This promising result provides motivation to explore algorithms that would reduce computations involving real systems of multiple DOFs, for example, the separable PSRO of Sec. 3.3.2. For large basis sizes $N$ and number of PSRO modifications $p$, the computational effort becomes expensive, but is clearly worth exploring further. One idea for possible future work involves the application of Monte Carlo integration methods to the phase space integration in Eq. (3.12).

Table 3.1. The absolute differences between computed and analytical [Eq. (3.30)] values (in a.u.) for the lowest 9 energy levels of the 1 DOF Morse potential, using various basis sets of N = 9 functions in each case.

State index   |φ_st⟩    |φ_st(1)⟩  |φ_st(2)⟩  |g_st⟩    |g_st(1)⟩  |g_st(2)⟩
1             0.08778   0.00026    0.00002    0.10436   0.00032    0.00002
2             0.21867   0.00405    0.00004    0.23072   0.00404    0.00003
3             0.34136   0.00313    0.00019    0.35087   0.00369    0.00018
4             0.46269   0.00634    0.00039    0.46868   0.00609    0.00040
5             0.60569   0.02051    0.00028    0.62050   0.02014    0.00029
6             0.38179   0.01320    0.00008    0.39710   0.01495    0.00011

7             0.58765   0.14343    0.08323    0.57373   0.14473    0.08307
8             1.80886   1.64599    1.13088    1.74577   1.66911    1.12617
9             2.62491   3.23755    3.07434    2.66852   3.12843    3.06419

Table 3.2. Lowest 20 eigenvalues of the 2 DOF Morse/HO system (in a.u.). Analytical eigenvalues are in column 2, whereas columns 3 and 4 present absolute differences between computed (with indicated basis of N = 49 functions) and analytical values.

State index   exact     |g_s,t⟩   |g_s,t(1)⟩
1             0.99479   0.11071   0.00723
2             1.95312   0.16049   0.00792
3             1.99479   0.13723   0.01968
4             2.86979   0.18210   0.01107
5             2.95312   0.15255   0.01331
6             2.99479   0.28710   0.04711
7             3.74479   0.10967   0.00307
8             3.86979   0.26459   0.02560
9             3.95312   0.22814   0.00793
10            3.99479   0.51273   0.13002
11            4.57812   0.27680   0.02238
12            4.74479   0.23347   0.01640
13            4.86979   0.49817   0.04803
14            4.95312   0.46867   0.06308
15            4.99479   0.78611   0.36803
16            5.36979   0.61201   0.04981
17            5.57812   0.49562   0.04201
18            5.74479   0.46122   0.04282
19            5.86979   0.60898   0.09321
20            5.95312   0.78684   0.33679
Table 3.3.

Results for the 1 DOF isotropic HO system computed using six dierent basis sets. The values under the dierent basis columns indicate the number of eigenvalues computed to relative accuracy indicated in column 3. N 6 accuracy 2 102 2 103 2 104 2 105 2 102 2 104 2 106 2 108 2 1010 2 1012 2 102 2 104 2 106 2 108 2 1010 2 1012 2 1014 |st 2 0 0 0 |st 5 3 1 0
(1)

Ecut (a.u.) 4.0

|st 6 5 2 1

(2)

|st 2 0 0 0

|st 6 3 0 0

(1)

|st 6 5 3 1

(2)

110.0

108

84 53 31 17 4 0

97 75 47 20 6 1

104 88 50 29 13 2

81 48 22 7 0 0

97 54 37 18 3 0

104 83 47 20 5 0

177.0

180

144 105 74 52 28 6 0

157 118 93 60 35 24 7

172 131 98 75 53 27 7

139 97 61 35 17 7 1

156 108 83 54 28 13 5

172 121 92 58 36 21 5

67

Table 3.4.

Number of accurately computed eigenvalues for the 2 DOF isotropic HO system using six dierent basis sets (consult Table 3.3 for further details). N 176 accuracy 2 102 2 103 2 104 2 105 2 102 2 104 2 106 2 108 2 1010 2 102 2 104 2 106 2 108 2 1010 2 1012 2 1014 2 102 2 104 2 106 2 108 2 1010 2 1012 2 1014 2 1016 |s,t 73 31 3 0 |s,t 69 29 0
(1)

Ecut (a.u.) 20.0

|s,t

(2)

|s,t 71 16 0 0

|s,t 67 8 0

(1)

|s,t

(2)

139

154 113 49 8

140

154 111 42 5

70.0

2 340

1 375 510 32 1 0

1 900 882 279 23 0

2 097 1 133 355 45 1

1 341 363 37 1 0

1 899 575 146 6 0

2 099 919 223 11 0

140.0

9 956

6 730 3 597 1 906 712 120 4 0

8 167 4 968 2 563 913 235 29 0

8 826 5 740 2 874 1 218 416 56 1

6 557 2 966 1 097 280 46 0 0

8 153 3 748 1 806 672 94 0 0

8 818 4 788 2 279 773 134 1 0

205.0

20 864

14 917 9 037 5 416 2 738 998 47 0 0

17 406 11 440 7 043 3 320 1 424 637 15 68 2

18 661 13 053 7 710 3 959 1 844 757 24 2

14 569 7 808 3 679 1 500 620 81 0 0

17 356 9 221 5 260 2 618 881 142 3 0

18 647 11 103 6 170 2 990 1 129 305 5 1

Table 3.5. Number of accurately computed eigenvalues for the 3 DOF isotropic HO system using six different basis sets (consult Table 3.3 for further details).

Ecut (a.u.)   N      accuracy   |φ_s,t⟩  |φ_s,t(1)⟩  |φ_s,t(2)⟩  |g_s,t⟩  |g_s,t(1)⟩  |g_s,t(2)⟩
9.0           176    2×10^-2    50       137         141         50       140         141
                     2×10^-3    1        22          49          0        25          52
                     2×10^-4    0        3           19          0        3           19

21.5          1928   2×10^-2    661      1443        1595        622      1445        1594
                     2×10^-3    185      482         851         89       452         851
                     2×10^-4    18       118         256         4        35          171
                     2×10^-5    1        1           20          0        0           3

40.0          9552   2×10^-2    4067     7188        7985        3884     7195        7994
                     2×10^-3    1728     3095        4446        1122     2934        4355
                     2×10^-4    517      1295        1905        211      568         1317
                     2×10^-5    119      267         502         28       103         196
                     2×10^-6    18       66          105         0        12          29
                     2×10^-7    0        9           17          0        0           1
                     2×10^-8    0        0           1           0        0           0
Table 3.6.

Number of accurately computed eigenvalues for the 4 DOF isotropic HO system using six dierent basis sets (consult Table 3.3 for further details). N 144 accuracy 2 102 2 103 2 104 2 102 2 103 2 104 2 102 2 103 2 104 2 105 |s,t 15 0 0 |s,t 90 5 0
(1)

Ecut (a.u.) 6.5

|s,t 90 14 4

(2)

|s,t 15 0 0

|s,t 90 5 0

(1)

|s,t 90 20 4

(2)

13.0

1 616

372 25 0

1 174 118 1

1 212 388 16

355 5 0

1 168 99 0

1 212 411 11

22.0

12 720

3 679 632 45 0

9 283 1 938 350 1

10 098 4 098 637 1

3 473 282 1 0

9 300 1 827 46 0

10 090 4 045 311 0

70

Table 3.7.

Number of accurately computed eigenvalues for the 2 DOF isotropic HO system at Ecut = 70.0 a.u. (N = 2 340) using modied weylet basis with the PSRO applied from 3 to 11 times. Beyond p = 11, numerical diculties arise in the diagonalization of the Hamiltonian matrix. 3 2 161 1 772 1 343 792 441 246 71 17 1 0 5 7 9 11 2 242 2 121 1 956 1 719 1 385 1 056 689 417 223 92

accuracy 2 102 2 103 2 104 2 105 2 106 2 107 2 108 2 109 2 1010 2 1011

2 204 2 218 2 226 1 933 2 039 2 088 1 633 1 778 1 870 1 142 1 439 1 610 690 392 142 28 4 0 986 627 302 106 24 1 1 218 835 507 256 86 28

71

[Figure 3.1, panels (a) and (b): horizontal axis q (a.u.), vertical axis p (a.u.).]

Figure 3.1. Classically allowed (QC) region for the weylets and HO PSRO in the 1 DOF case. The basis truncation criterion $H(q_s, p_t) \le E_{cut} = 4.0$ a.u. gives the weylet basis of N = 6 functions (Table 3.3). In part (a), the QC region of each weylet is represented by two squares, each with volume $\pi$ a.u., symmetrically placed about the q axis. The two squares with small unfilled circles represent the weylet $|\phi_{1/2,3/2}\rangle$. The larger unfilled circle is the QC region $R$ corresponding to the PSRO $\hat{\rho}^{QC}_K$ (K = N = 6). The same weylets modified by the exact projection $\hat{\rho}_K$ are approximately given in part (b).
[Figure 3.2, panels (a), (b), and (c): phase space surface plots.]

Figure 3.2. Wigner-Weyl representation (1 DOF case) of projection operators consisting of three different sets of N = 6 (a) HO eigenstates; (b) weylets; (c) PSRO-modified weylets. Different than what is stated in Sec. 3.1, all plots oscillate about (1/2), instead of unity, due to a discrepant normalization. In part (b), there are quantum interference fringes that emerge from the momentum symmetrization. The modified weylets in part (c) do not have these fringes and adopt the ring-like pattern of the true HO eigenstates, rendering them a more efficient basis for representing the HO system (see Table 3.3).

[Figure 3.3: horizontal axis q (a.u.), vertical axis p (a.u.).]

Figure 3.3. Classically allowed region for the 1 DOF weylets and Morse PSRO. The egg-shaped region $R$ represents the 1 DOF Morse system at $E_{\max} = -6.7500$ a.u. There are K = 6 eigenstates at that energy (area inside is $12\pi$ a.u.). Filled circles indicate the centers of the N = 9 basis functions used for the calculation of the 6 corresponding eigenvalues (Table 3.1).
[Figure 3.4, panel (a): p_x (a.u.) versus x (a.u.); panel (b): p_y (a.u.) versus y (a.u.).]

Figure 3.4. Classically allowed region of the separable PSRO for the 2 DOF Morse/HO system. $R_x$ and $R_y$ are shown in parts (a) and (b), respectively, as the inner solid curve. $R'_x$ and $R'_y$, reflective of the basis set (N = 49) of SGs, are also shown in parts (a) and (b), outlined by the dotted lines.
[Figure 3.5: efficiency L/N versus N.]

Figure 3.5. Efficiency versus N for the 2 DOF HO at $2 \times 10^{-4}$ relative accuracy (Table 3.4). The solid lines correspond to the weylets either unmodified (circles), single PSRO-modified (squares), or double PSRO-modified (triangles). The dashed lines represent the SGs.
[Figure 3.6: efficiency L/N versus DOF.]

Figure 3.6. Efficiency versus DOFs at $N \approx 10\,000$ held constant for the HO system. The three data points for each line, starting from the left 2 DOF point, represent basis sizes of N = 9 956, 9 552, and 12 720, with L at relative accuracy of $2 \times 10^{-2}$. The labels for the plots are the same as those in Fig. 3.5. Only the SGs are plotted for the modified cases since they are very similar to the weylets.
CHAPTER IV
PARALLEL PREPROCESSED SUBSPACE ITERATION METHOD

4.1 Introduction

For iterative eigenvalue solvers, a common goal is to create a subspace invariant under the $N \times N$ matrix $A$ in question (assumed to be real, symmetric, and sparse throughout this chapter). Although this subspace is spanned by a basis of select eigenvectors of $A$, one need not have any prior knowledge of the eigenvectors or the corresponding eigenvalues in order to iteratively converge towards the target invariant subspace (ISUB). In the physics/chemistry community, the ISUB is often represented as a density or projection matrix $\hat{\rho}$, a uniformly mixed ensemble of the appropriate eigenvectors.

In this paper, an $N \times d$ rectangular matrix containing column vectors (not necessarily eigenvectors) spanning the ISUB of dimension $d$ will be referred to as $S$. If the column vectors are orthonormal, then one can easily project out a smaller $d \times d$ real and symmetric matrix $C$ (known as a Rayleigh matrix), i.e.,

$$S^T A S = C \qquad (4.1)$$

where $T$ designates the transpose. The advantage gained is that the much smaller $C$ matrix ($d \ll N$) can then be numerically diagonalized using direct standard eigenvalue techniques to obtain the eigenvalues of $A$ that correspond to the eigenvectors of $A$ contained within the ISUB. Alternatively, one might choose not to orthogonalize successive column vectors of $S$. In this case, one must solve the generalized eigenvalue problem

$$C x = \lambda M x , \qquad (4.2)$$

where $C$ comes from Eq. (4.1) (no longer referred to as a Rayleigh matrix), $M = S^T S$ is the overlap matrix, and $(\lambda, x)$ is the (eigenvalue, eigenvector) pair.

In practice, one is often unable to obtain an exact ISUB, but rather an approximate ISUB through numerical means. For example, a common technique is the Lanczos method, where the approximate ISUB (or Krylov subspace) is

$$K^w = \mathrm{span}(b, Ab, A^2 b, \ldots, A^{w-1} b) , \qquad (4.3)$$

where b is an initial column vector (usually random) and w is the dimension of the subspace. If one orthonormalizes the vectors spanning Kw in between successive matrix-vector products, and then combines the vectors after all w 1 iterations to make a new N w matrix, Kw1 (subscript denotes the number of iterations), then the Rayleigh matrix, C, is simply found via Eq. (4.1) with Kw1 replacing S. Particular to the Lanczos method, C has the favorable property of being tridiagonal. Not only is the Rayleigh matrix easier to diagonalize as a result, but a clever algorithm can be implemented requiring minimal storage of only four vectors throughout the iterations: the growing main and adjacent diagonal of C and two adjacent column vectors of Kw1 . 78 Mathematically, the Lanczos algorithm above produces orthogonal vectors spanning the Krylov subspace; due to nite numerical precision however, in practice, orthogonality is compromised after successive iterations, leading to computed eigenvalues with extra multiplicities, known as spurious eigenvalues. 105 An occasional re-orthogonalization of all vectors in the Krylov subspace can remedy this, and there are methods such as selective 106 and partial re-orthogonalization 107,108 that do this eciently. Unfortunately, all of the column vectors of Kw1 need to be stored, instead of just the four mentioned above, which may thus become a major issue. Wu and Simon have addressed the storage problem using two separate strategies. First, they have developed a thick-restart Lanczos method 109 where after the memory is completely lled by the growing Krylov subspace, the Ritz vectors (or eigenfunctions of the Rayleigh matrix at that point) are calculated, and all are used to develop a starting point for a new and more accurate Krylov subspace, which replaces the old. 
Second, they have developed a parallel algorithm, PLANSO, [110] where the Lanczos vectors are uniformly and conformally mapped among compute nodes, and all of the linear algebra operations needed in the Lanczos algorithm are parallelized to accommodate the distribution. Popular sparse and parallel packages can be used to interface with PLANSO in order to handle sparse operations such as matrix-vector multiplication.

In principle, there are numerous strategies one might develop for parallelizing the Lanczos method. For the most part, however, these only address the parallelization of the linear operations required of individual Lanczos iterations, i.e., the generation of multiple vectors cannot be parallelized, owing to the sequential nature of the Krylov subspace methods. Block Lanczos methods [111] [as opposed to the vector Lanczos method of Eq. (4.3)], where each iteration involves a group of orthonormal vectors instead of a single vector, do offer a design conducive to parallelization at the vector level, although the effectiveness of the parallelization is limited by the size of the blocks, which in practice is relatively small. In order to take full advantage of the benefits offered by parallelization at the vector level, one must completely eliminate the sequential aspect of the iterations. Thus, we chose a different iterative method that offers this, the subspace iteration (SI) method, where the approximate ISUB is

$$\mathcal{Z}^d = \mathrm{span}(A^r b_1, A^r b_2, \ldots, A^r b_d). \tag{4.4}$$

The set of column vectors $(b_1, b_2, \ldots, b_d)$ must, at least, be linearly independent, or else all of the vectors spanning $\mathcal{Z}^d$ (which comprise the $N \times d$ matrix $Z_r$) might converge to the same eigenvector of $A$ as $r$ increases. In practice, the total number of matrix-vector products of the SI method, $r \times d$, exceeds that of the vector Lanczos method, $w-1$, when achieving similar eigenvalue accuracies of $A$, but parallelizing $Z_r$ such that each column vector is calculated on a separate node effectively reduces the number of matrix-vector products to $r$, which is considerably smaller than $w-1$. Parallelization of the block Lanczos methods can achieve similar savings as the SI method only if the block size of the former is the same as $d$, although this introduces problematic memory issues for the block Lanczos method, since its approximate ISUB has a large dimension of $w \times d$.
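As a rough illustration of Eq. (4.4), the following minimal sketch applies a small matrix $r$ times to $d = 2$ start vectors without any orthogonalization, then solves the $2 \times 2$ generalized problem $Cx = \epsilon Mx$ in closed form. The test matrix, the function names, and the choice $r = 20$ are assumptions made for this example only; they are not taken from the calculations in this chapter.

```python
import math

def matvec(A, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def subspace_iteration(A, start_vectors, r):
    """Apply A to each start vector r times, with no orthogonalization."""
    vectors = [list(b) for b in start_vectors]
    for _ in range(r):
        # Each column is updated independently of the others,
        # which is what makes the iteration easy to distribute.
        vectors = [matvec(A, z) for z in vectors]
    return vectors

def ritz_values_2d(A, z1, z2):
    """Solve the 2x2 generalized problem C x = eps M x via its quadratic."""
    Az1, Az2 = matvec(A, z1), matvec(A, z2)
    c11, c12, c22 = dot(z1, Az1), dot(z1, Az2), dot(z2, Az2)
    m11, m12, m22 = dot(z1, z1), dot(z1, z2), dot(z2, z2)
    # det(C - eps*M) = qa*eps^2 + qb*eps + qc
    qa = m11 * m22 - m12 * m12
    qb = -(c11 * m22 + c22 * m11 - 2.0 * c12 * m12)
    qc = c11 * c22 - c12 * c12
    disc = math.sqrt(max(qb * qb - 4.0 * qa * qc, 0.0))
    return sorted([(-qb - disc) / (2.0 * qa), (-qb + disc) / (2.0 * qa)])

# Hypothetical 4x4 test matrix: 2 on the diagonal, -1 on the adjacent diagonals.
# Its eigenvalues are known analytically: 2 - 2*cos(k*pi/(n+1)), k = 1..n.
n = 4
A = [[2.0 if i == j else (-1.0 if abs(i - j) == 1 else 0.0)
      for j in range(n)] for i in range(n)]
exact = sorted(2.0 - 2.0 * math.cos(k * math.pi / (n + 1)) for k in range(1, n + 1))

b1, b2 = [1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]
z1, z2 = subspace_iteration(A, [b1, b2], r=20)
ritz = ritz_values_2d(A, z1, z2)  # approximates the two largest-modulus eigenvalues
```

Because no orthogonalization is performed, the two iterated vectors become increasingly parallel as $r$ grows, so the overlap matrix $M$ drifts towards singularity; this is the loss-of-rank limitation on $r$.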


The SI method also offers other key advantages over both vector and block Lanczos methods. For example, one has better control over the dimension of the approximate ISUB. In other words, one directly chooses $d$ in the SI case, and independently adjusts the accuracy via the iteration number $r$, whereas in the vector and block Lanczos cases, the dimensionality and number of iterations are both dependent upon the same parameter $w$. In practice, good iteration number and dimension parameter values require a careful balance of factors. Another advantage of the SI method over Lanczos, in particular the vector Lanczos method, is that the former does not become inefficient in cases where the eigenvalue spectrum is degenerate or nearly degenerate. This situation is common in spectroscopic rovibrational calculations, particularly when there is symmetry. [15,112,113] The block Lanczos methods do correctly account for degeneracy as long as the individual block sizes are greater than or equal to the extent of the degeneracy. Last, in the SI method, the approximate ISUB approaches the exact ISUB when numerical error is disregarded, i.e.,

$$Z_r \to S \quad \text{as} \quad r \to \infty, \tag{4.5}$$

which is not the case for either the vector or the block Lanczos methods.

Although the SI method is clearly a natural candidate for parallelization, the incorporation of vector orthogonalization (either after every matrix-vector product, or just occasionally) does require substantial internode communication. In this chapter, we present a way to preprocess $A$ such that a satisfactory ISUB can be iteratively computed without the need for costly orthogonalizations. Communication is only required after the final iteration, in order to calculate the elements of $C$ using Eq. (4.1) ($Z_r$ replaces $S$), as well as the elements of $M$ (the column vectors of $Z_r$ are nonorthogonal), and to initialize both $C$ and $M$ on a single node for direct diagonalization using Eq. (4.2).
These preprocessing ideas are inspired by single-particle density matrix purification (DMP) schemes used in ground-state electronic-structure calculations. [70,71] In the DMP context, the matrices are often sufficiently small as to allow direct matrix-matrix multiplications to be performed, e.g., PRISM (Parallel Research on Invariant Subspace Methods). [114,115] For many SI applications, however, e.g., quantum dynamics, the matrices involved are large and sparse, and therefore not amenable to direct matrix-matrix products. Instead, our approach uses matrix-vector products, which are far less costly, and also enable efficient sparse matrix techniques to be employed.

In this chapter, we apply the new preprocessed SI method to several model systems: isotropic (3 and 6 degrees of freedom, or DOFs) and anisotropic (3 DOF) uncoupled harmonic oscillators (HOs). We represent the Hamiltonian operator $\hat{H}$ of these systems using a type of Weyl-Heisenberg wavelet (or "weylet") basis chosen under the guidance of a phase space truncation scheme, [35-37] which gives $N \times N$ sparse matrix representations, $H$, of the system. In all cases, the eigenvalues of $H$ are calculated from an approximate ISUB of dimension $d = 6\,000$, and a determination is made of how many of these fall within a certain relative accuracy level. Although the $6\,000 \times 6\,000$ matrix $C$ is diagonalized directly, the expectation is that $N \gg 6\,000$, so that the overall calculation is still extremely efficient. In practice, we find that towards the lower end of the spectrum of the 6 000, the eigenvalues match those of $H$ with increasing accuracy. The greatest errors are towards the high end of the spectrum, as is typical for subspace methods. However, the SI method enables one to improve the accuracy of all $d = 6\,000$ eigenvalues, in principle to arbitrary precision, simply by increasing $r$, because of Eq. (4.5). In practice, we find limitations on the maximum value of $r$ that can be applied. In such cases, we can still improve the accuracy of the desired eigenvalues arbitrarily, simply by increasing $d$ (at the expense of using more memory on the nodes), as per other subspace methods.

The new SI method is also found to be extremely scalable.
With respect to storage requirements, the iteration vectors are found to require most of the memory, but these can be evenly distributed over all available nodes. With respect to CPU operations, the bottleneck is the iterative generation of the ISUB vectors, which exhibits near perfect parallel speedup, as it requires no internode communication. The subsequent $C$ matrix creation and initialization steps also parallelize efficiently, although communication is involved. Having applied the new method in this chapter to matrices as large as $N \approx 10^6$, we find it to be very effective even in its present incarnation, although there is still ample room for future improvement and fine tuning.

4.2 Theoretical Background

The SI method is considered to be a variation of the power method in which, instead of focusing on the attainment of one eigenvector, one desires to find a group of eigenvectors corresponding to some region of the eigenvalue spectrum. For the SI case presented in the previous section [Eq. (4.4)], the fully converged $Z_r$ ($r \to \infty$) would span the space of the $d$ eigenvectors of $A$ that have the largest eigenvalue moduli. Traditional SI methods orthogonalize the ISUB vectors after each matrix-vector product via a QR decomposition or a modified Gram-Schmidt procedure. This prevents $\mathcal{Z}^d$ from losing dimensionality, i.e., the new $Z_r$ with orthogonal column vectors maintains full rank $d$. In addition, the projected matrix $C$ found by Eq. (4.1) is a Rayleigh matrix, and the eigenvalues are found via direct standard eigenvalue techniques instead of using Eq. (4.2). This approach is preferable when $r$ and $d$ are small; otherwise, it is too computationally expensive. Our strategy, on the other hand, is to preprocess $A$ such that the orthogonalizations may be avoided altogether, thus making it possible to easily parallelize the complete algorithm. We are not currently able to eliminate the rank problems completely; however, using the present preprocessing scheme, we can accurately retrieve a significant portion of the eigenvalue spectrum of $A$. Also, since our chief interest lies in calculating rovibrational energies of small molecules ($A$ now becomes the Hamiltonian matrix $H$), our preprocessing scheme focuses on the lower, most accurate portion of the eigenvalue spectrum. There are two steps in the preprocessing.
First, the rows and columns of $H$ are reordered so as to place the diagonal elements of $H$ in ascending order, moving from the top-left corner towards the bottom-right. The second step, commonly known as scaling, is the following simple adjustment of $H$:

$$H' = \lambda(\mu I - H) + I, \quad \text{where} \quad \lambda = \min\left[\frac{1}{\mu - \epsilon_{\min}}, \frac{1}{\epsilon_{\max} - \mu}\right]. \tag{4.6}$$

The $N \times N$ matrix $I$ is the identity, and $\mu$ is known as the chemical potential in DMP papers [71] (the significance will be addressed later). The parameters $\epsilon_{\max}$ and $\epsilon_{\min}$ are the approximate largest and smallest eigenvalues, respectively, found via Gershgorin's formulas, [116]

$$\epsilon_{\max} = \max_i \Big( H_{ii} + \sum_{j \neq i} |H_{ij}| \Big) \tag{4.7}$$

and

$$\epsilon_{\min} = \min_i \Big( H_{ii} - \sum_{j \neq i} |H_{ij}| \Big). \tag{4.8}$$
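The scaling step can be sketched in a few lines, assuming a dense list-of-rows matrix for readability (the actual calculation uses sparse storage). The function names, the $3 \times 3$ test matrix, and the value of the chemical potential below are hypothetical and serve only to show that the Gershgorin bounds of Eqs. (4.7) and (4.8), combined with the scaling of Eq. (4.6), confine the spectrum of the scaled matrix to the interval $[0, 2]$.

```python
def gershgorin_bounds(H):
    """Return (eps_min, eps_max) from the Gershgorin row discs of H."""
    lows, highs = [], []
    for i, row in enumerate(H):
        radius = sum(abs(h) for j, h in enumerate(row) if j != i)
        lows.append(row[i] - radius)
        highs.append(row[i] + radius)
    return min(lows), max(highs)

def scale_matrix(H, mu):
    """Form H' = lam*(mu*I - H) + I, with lam chosen as in Eq. (4.6).

    Assumes eps_min < mu < eps_max so that both denominators are positive.
    """
    eps_min, eps_max = gershgorin_bounds(H)
    lam = min(1.0 / (mu - eps_min), 1.0 / (eps_max - mu))
    n = len(H)
    H_scaled = [[-lam * H[i][j] + ((lam * mu + 1.0) if i == j else 0.0)
                 for j in range(n)] for i in range(n)]
    return H_scaled, lam

# Hypothetical symmetric test matrix with its diagonal already in ascending order.
H = [[1.0, 0.2, 0.0],
     [0.2, 2.0, 0.3],
     [0.0, 0.3, 4.0]]
mu = 2.5  # illustrative chemical potential, chosen between the bounds

H_scaled, lam = scale_matrix(H, mu)
# The Gershgorin discs of H' bound its eigenvalues, so these must lie in [0, 2].
low, high = gershgorin_bounds(H_scaled)
```

Checking the Gershgorin discs of the scaled matrix is a cheap way to confirm the interval without computing any eigenvalues.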

No eigenvalues of $H$ stray outside of the above boundaries. The eigenvalues of $H'$ range between 0 and 2, with those between 1 and 2 corresponding to the desired eigenvalues of $H$ below $\mu$. The parameter $\mu$ should be chosen so that $d$ (the dimension of the ISUB) eigenvalues of $H'$ lie within the latter range. In practice, one first decides on $d$, and then determines a suitable $\mu$ value after a few trial runs starting from a reasonable initial guess. The matrix $H' = A$ is then used to obtain the approximate ISUB via Eq. (4.4), with the initial vectors $b_i$ chosen to be the orthonormal unit vectors $z_i$ (with the $i$th component equal to 1, and all other components 0). The approximate ISUB can therefore be written $(H')^r Z_0 = Z_r$, where $r$ is the number of iterations and $Z_0 = (z_1, z_2, \ldots, z_d)$. Mathematically, as $r \to \infty$, $Z_r \to S$, as mentioned before. Only matrix-vector products are performed, so that one has only to deal with the $d$ column vectors of the matrix $Z_r$ and the sparse matrix $H'$. In contrast, matrix-matrix products of $H'$ would lead to a dense $N \times N$ matrix that is too large to store on each node. The rationale here is that the $Z_r$ contribution from the subspace of eigenvectors corresponding to the eigenvalues of $H'$ between 0 and 1 (the original matrix $H$ has the same eigenvectors) will dissipate exponentially with respect to $r$. By the same token, the $[1, 2]$ contribution becomes increasingly prominent with increasing $r$ (see Appendix D).

Since the column vectors of $Z_r$ are not orthogonal, the eigenvalues are obtained by Eqs. (4.1) and (4.2). This requires that the overlap matrix $M$ be positive definite, which is true formally if $Z_r$ is of full rank, but can cause numerical instabilities if the smallest $M$ eigenvalues are too close to zero. The reordering of $H$ discussed earlier, together with the choice $b_i = z_i$, is designed to ameliorate this difficulty. It is certainly found to be very successful in this regard; however, one is still limited in the number of iterations, $r$, that can be used without numerical instabilities arising in the generalized eigenvalue solution routines. This limitation, in turn, restricts the accuracy that can be achieved for a given value of $d$.

4.3 Parallel and Numerical Implementation

In this section, we describe the numerical algorithm in detail, including parallelization. The implementation can be broken down into 7 sequential steps as follows:

1. Construct $H$ using the truncated weylet basis.
2. Preprocess $H$ to get the new matrix $H'$.
3. Distribute the $d$ vectors $z_i$ of $Z_0$ across nodes; duplicate $H'$ across nodes.
4. Perform $r$ iterations in parallel, in order to compute $Z_r = (H')^r Z_0$.
5. Calculate the elements of $C$ and $M$ while keeping the $d$ vectors of $Z_r$ distributed among the nodes.
6. Send all of the $C$ and $M$ matrix elements to a single node.
7. Solve the generalized eigenvalue problem of Eq. (4.2) on a single node.

In this study, we only look at separable systems, e.g., isotropic and anisotropic uncoupled HOs, where the Hamiltonian operator is

$$\hat{H} = \frac{1}{2} \sum_{j=1}^{f} \left( \hat{p}_j^2/m_j + m_j \omega_j^2 \hat{q}_j^2 \right) \tag{4.9}$$


(all of the masses are set to unity, i.e., $m_j = 1$ for $j = 1, \ldots, f$). The 1 DOF weylet lattice basis ($\hbar = 1$ assumed throughout the chapter) used to represent $\hat{H}$ is

$$\phi_{st}(q) = \sum_{|m|+|n| \le 6} (-1)^{\left(\frac{n}{2} + mt\right)}\, c_{mn}\, \chi_{uv}(q), \tag{4.10}$$

where

$$\chi_{uv}(q) = (4a^2/\pi)^{1/4} \cos\left\{ v a \sqrt{\pi} \left[ q - (u + 1/2)\sqrt{\pi}/a \right] \right\} e^{-a^2 (q - u\sqrt{\pi}/a)^2/2}, \tag{4.11}$$

$u = s + m$, $v = t + n$, $m$ and $n$ are even integers, $c_{mn}$ are coefficients with values reported in Ref. [36], $a$ is related to the aspect ratio, $s$ (half-integer) is the position index of the weylet block on phase space, and $t$ (positive half-integer) is the momentum parameter. For $f$ DOFs, the basis consists of products of the 1 DOF functions, i.e.,

$$\Phi_{\mathbf{s},\mathbf{t}}(\mathbf{q}) = \prod_{j=1}^{f} \phi_{s_j t_j}(q_j), \tag{4.12}$$

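where $\mathbf{q} = (q_1, q_2, \ldots, q_f)$, $\mathbf{s} = (s_1, s_2, \ldots, s_f)$, and $\mathbf{t} = (t_1, t_2, \ldots, t_f)$.

Step 1, the creation of the Hamiltonian matrix $H$, is very quick since the matrix elements are analytical. The 1 DOF kinetic energy matrix has elements derived from

$$\langle \phi_{st} | \hat{p}^2 | \phi_{s't'} \rangle = \sum_{|m|+|n| \le 6} \; \sum_{|m'|+|n'| \le 6} (-1)^{\left(\frac{n}{2} + mt + \frac{n'}{2} + m't'\right)}\, c_{mn} c_{m'n'} \langle \chi_{uv} | \hat{p}^2 | \chi_{u'v'} \rangle \tag{4.13}$$

with

$$\langle \chi_{uv} | \hat{p}^2 | \chi_{u'v'} \rangle = \frac{\pi a^2}{2} \left[ h(u, v, u', v') - h(u, -v, u', v') \right], \tag{4.14}$$

and

$$h(u, v, u', v') = e^{-(\pi/4)(u_-^2 + v_-^2)} \left[ \frac{1}{2}\left( v_+^2 - u_-^2 + \frac{2}{\pi} \right)\cos\theta - u_- v_+ \sin\theta \right]. \tag{4.15}$$

The $-$ and $+$ subscripts indicate the difference and addition, respectively, of the bra and ket indices, e.g., $u_- = u - u'$ and $v_+ = v + v'$. Note that under a change of sign of $v$ [i.e., for the last term of Eq. (4.14)], $v_-$ becomes $v_+$ and vice versa. The phase quantity is given by

$$\theta(u, v, u', v') = \frac{\pi}{2}(u_- v_+ + v_-), \tag{4.16}$$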


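and since $u_-$, $v_-$, and $v_+$ are always integers, the trigonometric quantities in Eq. (4.15) are always $\pm 1$ or 0. The 1 DOF potential energy matrix elements are obtained from

$$\langle \chi_{uv} | \hat{q}^2 | \chi_{u'v'} \rangle = \frac{\pi}{2a^2} \left[ \eta(u, v, u', v') - \eta(u, -v, u', v') \right], \tag{4.17}$$

and

$$\eta(u, v, u', v') = e^{-(\pi/4)(u_-^2 + v_-^2)} \left[ \frac{1}{2}\left( u_+^2 - v_-^2 + \frac{2}{\pi} \right)\cos\theta + v_- u_+ \sin\theta \right]. \tag{4.18}$$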

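With regard to the full $f$ DOF matrix representation, the separability of $\hat{H}$ results in a sparse matrix $H$, especially for large $f$. The full $f$ DOF kinetic energy matrix is given by

$$\frac{1}{2} \langle \Phi_{\mathbf{s},\mathbf{t}} | \hat{p}_1^2 + \hat{p}_2^2 + \ldots + \hat{p}_f^2 | \Phi_{\mathbf{s}',\mathbf{t}'} \rangle = \frac{1}{2} \sum_{(i,j,\ldots,k)=(1,2,\ldots,f)} \langle \phi_{s_i t_i} | \hat{p}_i^2 | \phi_{s_i' t_i'} \rangle\, \delta_{s_j s_j'} \delta_{t_j t_j'} \cdots \delta_{s_k s_k'} \delta_{t_k t_k'}, \tag{4.19}$$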

where the summation is over all cyclic permutations of $(1, 2, \ldots, f)$, $\delta$ is the Kronecker delta function, and $\langle \phi_{s_i t_i} | \hat{p}_i^2 | \phi_{s_i' t_i'} \rangle$ is defined in Eqs. (4.13)-(4.15). The potential energy matrix elements adhere to a similar sparse form as Eq. (4.19), except that $\hat{q}$'s are put in the place of $\hat{p}$'s.

To maximize storage efficiency and to speed up linear algebra operations needed later, the row-indexed sparse storage mode [16] is used for $H$. The nonzero elements of $H$ are stored directly in a 1-dimensional array (SA) accompanied by another 1-dimensional array (IJA) of integer parameters containing the matrix positions of the nonzero elements. Both of the arrays contain $G + 1$ elements, where $G$ is the number of nonzero elements (the extra storage slot is needed for the storage mode).

The good phase space localization of the weylets, along with their orthonormality, allows for a phase space truncation scheme which defeats exponential scaling. In other words, the efficiency $K/N$, where $N$ represents the number of basis functions (order of $H$) needed to calculate $K$ eigenvalues at a desired accuracy, does not decay exponentially with respect to $f$, the number of DOFs of the system in question. [35-37] More specifically, the truncation involves selecting only those weylets whose phase space center coordinates, when plugged into the classical Hamiltonian expression, produce values $H_{\rm mid}$ below some energy cutoff $E_{\rm cut}$. The justification of the scheme is based upon the quasiclassical approximation, which treats the set of weylets as a lattice of $2f$-dimensional blocks partitioning phase space, and the target $K$ eigenstates of $\hat{H}$ are represented as a uniform region. This approximation improves as both $K$ and $N$ increase. [34]

Due to the sparsity of $H$, the preprocessing in step 2 is very quick as well, and the sparsity is preserved exactly in the new matrix $H'$ (which fully replaces $H$ in the 1-dimensional arrays). The reordering of $H$ and the calculation in Eq. (4.6) using Gershgorin's formulas in Eqs. (4.7) and (4.8) take full advantage of the sparsity and the row-indexed sparse storage mode. This allows the step to be an insignificant contribution to the total CPU time required.

In step 3, the 1-dimensional array SA(1 : G + 1) containing the elements of $H'$, along with the integer array IJA(1 : G + 1), is copied onto each of the $g$ compute nodes. The strategy is that the iterations, i.e., $H' Z_{r-1} = Z_r$, needed to create the ISUB will be done in parallel by distributing the column vectors of $Z_0$ as equally as possible over the $g$ nodes. Simultaneously, each node performs matrix-vector products with $H'$ and its assigned vectors. In practice, instead of apportioning $Z_0$ at the start of the process, one can simply divide up the first $d$ columns of $H'$, which are equivalent to $Z_1$. For clarity, we assign a number $y$ to each node, ranging from 0 to $g - 1$. For communication purposes, as shown in Fig. 4.1, the nodes are arranged in a loop, with the column vectors stored sequentially around the loop. If $d > g$, then the distribution will continue looping around in the same manner. With this arrangement, node $y$ will have either $\nu_y = \lfloor d/g \rfloor + 1$ vectors ($\lfloor\ \rfloor$ denotes the floor function, i.e., the greatest integer not exceeding its argument) if $y + 1 \le \mathrm{mod}(d, g) = d - g \lfloor d/g \rfloor$, or otherwise $\nu_y = \lfloor d/g \rfloor$ vectors.

Step 4 involves the $r$ matrix-vector products between $H'$ and the $\nu_y$ vectors on each node $y$, done in parallel. Multiple trial runs are needed to find the largest usable number of iterations, $r$, i.e., the point just before the numerical generalized eigenvalue solver fails (or eigenvalues of $M$ are too close to zero) due to loss of rank. One can incorporate a singular value decomposition method if one exceeds $r$, although one finds that no accuracy is gained beyond the largest iteration point.

After the $r$ iterations are completed, step 5, the construction of matrices $C$ and $M$, is effected. The simplest method would be to transfer all of the dense vectors of $Z_r$ to a single node, although it is more than likely that a single node does not have enough memory to handle $Z_r$. Instead, an efficient communication scheme is employed with $\lfloor g/2 \rfloor$ stages. Before any communication in step 5, two important steps are implemented. First, $H'$ is converted back to $H$, which is used for the calculation of $C$. Second, all matrix elements of $C$ and $M$ corresponding to the local set of $\nu_y$ vectors are calculated. For example, in Fig. 4.1, node $y = 0$ contains column vectors $z_1^{(r)}$, $z_6^{(r)}$, and $z_{11}^{(r)}$ of $Z_r$. Matrix elements (1, 1), (6, 6), (11, 11), (6, 1), (11, 1), and (11, 6), where the first and second numbers in each set $(i, j)$ are the row and column of $C$ and $M$, are obtained by the matrix products $(z_i^{(r)})^T H z_j^{(r)}$ and $(z_i^{(r)})^T z_j^{(r)}$, respectively.
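The vector distribution and the locally computable matrix elements described above can be sketched as follows. The function names are hypothetical; the round-robin placement and the $d = 13$, $g = 5$ values follow the example of Fig. 4.1.

```python
def vectors_on_node(y, d, g):
    """1-based indices of the columns of Z_r held by node y (round-robin)."""
    return list(range(y + 1, d + 1, g))

def vector_count(y, d, g):
    """Number of vectors on node y from the floor-function rule in the text."""
    return d // g + 1 if y + 1 <= d % g else d // g

def local_elements(y, d, g):
    """Lower-triangle (row, column) pairs of C and M that node y can
    compute from its own vectors, before any communication."""
    cols = vectors_on_node(y, d, g)
    return [(i, j) for i in cols for j in cols if i >= j]

d, g = 13, 5
layout = {y: vectors_on_node(y, d, g) for y in range(g)}
node0_elements = local_elements(0, d, g)
```

For $d = 13$ and $g = 5$, node 0 holds vectors 1, 6, and 11, and its local elements are the six pairs listed in the text.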

Note that since $C$ and $M$ are symmetric, we only need to calculate those elements in the lower (or upper) triangular portion of the matrices.

In the first stage of the communication, node $y$ sends $\nu_y$ vectors, one at a time, to node $\mathrm{mod}(y+1, g)$, and node $y$ receives $\nu_{\mathrm{mod}(y-1,g)}$ vectors from node $\mathrm{mod}(y-1, g)$. Immediately after each vector is received, all of the possible matrix elements between the transferred vector and the local vectors belonging to the receiving node are calculated and stored. For example, in Fig. 4.1, node 1 will receive $z_1^{(r)}$ from node 0 and will calculate elements (2, 1), (7, 1), and (12, 1), followed by the immediate deletion of $z_1^{(r)}$ in order to save memory. Vectors $z_6^{(r)}$ and $z_{11}^{(r)}$ from node 0 undergo the same process. The second stage involves node $y$ sending $\nu_y$ vectors to node $\mathrm{mod}(y+2, g)$ and receiving $\nu_{\mathrm{mod}(y-2,g)}$ vectors from node $\mathrm{mod}(y-2, g)$. In general, one can conclude for stage $v$ that node $y$ sends $\nu_y$ vectors to node $\mathrm{mod}(y+v, g)$ and receives $\nu_{\mathrm{mod}(y-v,g)}$ vectors from node $\mathrm{mod}(y-v, g)$. When dealing with an odd number of nodes, this generalization is true for all $\lfloor g/2 \rfloor$ stages. On the other hand, for even $g$, the generalization is true except for the last stage, $g/2$. As shown in Fig. 4.2, the nodes in each of the $g/2$ pairs take turns in sending a single vector to the other. In the first substage, one vector from one of the nodes in the pair is sent to the other and is combined with all vectors present on the receiving end for the calculation of all possible matrix elements. Next, the sender and receiver switch roles, and one vector is sent in the other direction, to be combined with vectors that have not been sent in previous substages. Fig. 4.2 explicitly shows the pattern for the $d = 13$ and $g = 6$ case.

It is important that we tally all of the arrays needed on each node in this algorithm so that, based on the known memory limitations of each node, we can determine the minimum bound of $g$; any number of nodes larger than or equal to this minimum can be used for the successful handling of the target problem. Each node will contain SA(1 : G + 1) and IJA(1 : G + 1), which should not be a significant memory consumer due to the sparsity of $H$, although the size of $G$ (the number of nonzero elements), in our chosen model problems, does grow at a slightly faster than linear rate with $N$ and could overstep the memory limitations for very large $N$. In the future, we may investigate ways to distribute these arrays among the nodes. At present, however, the largest memory consumer is the set of vectors that make up $Z_r$, which we will denote as the 2-dimensional array vec(1 : N, 1 : $\nu_y$), specific to the node $y$.
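The staged exchange of step 5 can be checked with a small serial simulation. The sketch below only tracks which (row, column) elements of $C$ and $M$ each stage would produce; it performs no message passing, and its function names are hypothetical. For even $g$, the alternating substages of the last stage are modelled as described in the text: each vector sent is combined only with the partner's vectors that have not yet been sent.

```python
def vectors_on_node(y, d, g):
    """1-based indices of the columns of Z_r held by node y (round-robin)."""
    return list(range(y + 1, d + 1, g))

def pair(i, j):
    """Order an index pair as (row, column) in the lower triangle."""
    return (max(i, j), min(i, j))

def schedule_elements(d, g):
    """List every (row, column) element of C and M in the order computed."""
    held = [vectors_on_node(y, d, g) for y in range(g)]
    computed = []
    # Before any communication: elements among each node's own vectors.
    for cols in held:
        computed += [(i, j) for i in cols for j in cols if i >= j]
    # Full one-way stages: node y sends all its vectors to node (y + v) mod g.
    full_stages = g // 2 if g % 2 == 1 else g // 2 - 1
    for v in range(1, full_stages + 1):
        for y in range(g):
            receiver = (y + v) % g
            for sent in held[y]:
                computed += [pair(sent, k) for k in held[receiver]]
    # Even g: in the last stage, paired nodes alternate single-vector sends.
    if g % 2 == 0:
        for y in range(g // 2):
            a, b = held[y], held[y + g // 2]
            ia = ib = 0  # number of vectors already sent by each partner
            while ia < len(a) or ib < len(b):
                if ia < len(a):
                    computed += [pair(a[ia], k) for k in b[ib:]]
                    ia += 1
                if ib < len(b):
                    computed += [pair(b[ib], k) for k in a[ia:]]
                    ib += 1
    return computed

elements_g5 = schedule_elements(13, 5)
elements_g6 = schedule_elements(13, 6)
```

In this model, every element of the lower triangle is produced exactly once for both odd and even $g$, so no matrix element is computed redundantly.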
Obviously, as one increases $g$, $\nu_y$ is reduced; thus, we have direct control of the size of vec by varying $g$. Each node also needs an additional column vector recvec(1 : N), which is the vector that each node receives during the aforementioned communication steps for the purpose of calculating matrix elements of $C$ and $M$.

Last, elements from the two matrices need to be stored on each node. Fortunately, these are also distributed almost evenly among the nodes, though, in step 6, all such elements will be transferred to one node.

For step 5, a "safe shift" algorithm [117] is used on each node to implement the communication in a timely fashion. There are two key elements in the method that add flow control to the message passing, which is important, especially when communicating large messages. First, a nonblocking receive command in the code is posted before the send command. This is known as preposting the message, i.e., the nodes are ready to receive any message before anything is sent. Second, a small message is sent in the reverse direction, which is known as a "permission to send" (PTS) message. The receiver sends a small PTS message to the sender after the nonblocking receive command (or the preposting), which opens up the pathway for the communication of the actual data.

Finally, step 7 is performed, i.e., the solution of the generalized eigenvalue problem [Eq. (4.2)]. This is done on one node using the LAPACK subroutine DSYGV. In the future, if we want to recover a large number of eigenvalues, i.e., $d$ is large, then we will need to incorporate a parallel dense linear algebra solver, such as specified in the PRISM project. [114,115]

4.4 Results and Discussion

First, we considered the $f = 3$ DOF isotropic case, where $\omega_1 = \omega_2 = \omega_3 = 1$ [Eq. (4.9)]. For the Hamiltonian matrix $H$ of order $N = 36\,083$, we chose $d = 6\,000$. For comparison, the exact eigenvalues of the HO Hamiltonian operator, $\hat{H}$, are $\sum_{j=1}^{f} n_j + f/2$, where $n_j$ is the nonnegative integer signifying the energy level of the $j$th DOF of the HO. The degeneracy for each level is

$$\deg(n_S, f) = \binom{n_S + f - 1}{f - 1} \tag{4.20}$$

for $f > 1$, where $n_S = \sum_{j=1}^{f} n_j$; the eigenvalues are nondegenerate for $f = 1$.
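The degeneracy formula of Eq. (4.20) is easy to check by brute-force enumeration of the quantum numbers, as in the following sketch. The function names are hypothetical, and the energies assume the isotropic case with unit frequencies and $\hbar = 1$, as in the text.

```python
from itertools import product
from math import comb

def degeneracy(n_s, f):
    """Eq. (4.20): number of ways to distribute n_s quanta over f DOFs."""
    return comb(n_s + f - 1, f - 1)

def brute_force_degeneracy(n_s, f):
    """Count all (n_1, ..., n_f) of nonnegative integers summing to n_s."""
    return sum(1 for n in product(range(n_s + 1), repeat=f) if sum(n) == n_s)

def lowest_levels(f, count):
    """Exact isotropic HO energies, n_s + f/2, of the lowest `count` states."""
    energies, n_s = [], 0
    while len(energies) < count:
        energies += [n_s + f / 2.0] * degeneracy(n_s, f)
        n_s += 1
    return energies[:count]

reference = lowest_levels(3, 6000)  # exact values to compare computed eigenvalues against
```

A list such as `reference` is the kind of exact spectrum against which relative accuracies of computed eigenvalues can be judged.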

Table 4.1 reports the number of eigenvalues, $K$, out of the lowest $d = 6\,000$, computed to various relative accuracies ($2 \times 10^{-2}$, $2 \times 10^{-3}$, $2 \times 10^{-4}$, etc., correspond to error tolerances of $(0.2)f$, $(0.02)f$, $(0.002)f$, ..., in a.u. as per Ref. [37]). Eigenvalues for both Eq. (4.2) (computed using the proposed parallel algorithm of Sec. 4.3) and for the $36\,083 \times 36\,083$ matrix $H$ (computed using the LAPACK DSYEV subroutine) are considered, with errors taken relative to the exact analytical eigenvalues of $\hat{H}$ (described above). At relative accuracies of $2 \times 10^{-5}$ or better, there is a perfect match between the two calculations, indicating that the proposed method introduces no substantial error beyond those of $H$ itself. For relative accuracies of $2 \times 10^{-4}$ and above, the small discrepancies indicate that the $r$ value chosen ($r = 37$) is insufficiently large to achieve exact agreement for the highest eigenvalues. The value $r = 37$ is the largest that can be used without encountering numerical instabilities in the eigensolver routines due to the loss of full rank. Though quite small, this value is nevertheless sufficiently converged to achieve nearly the full accuracy of $H$ itself throughout the spectrum. Lanczos, for instance, would not achieve anywhere near the performance of Table 4.1.

The 3 DOF system is studied further as shown in Fig. 4.3. The numbers of eigenvalues, $K$, that have relative accuracies of $2 \times 10^{-2}$, $2 \times 10^{-4}$, $2 \times 10^{-6}$, and $2 \times 10^{-8}$ with regard to the eigenvalues of $\hat{H}$ are plotted against $N$ for fixed $d = 6\,000$. As $N$ increases, the largest iteration number $r$ increases as well, e.g., at $N$ = 21 976, 36 083, 49 840, 104 912, 207 320, and 416 840, the values of $r$ are 33, 37, 43, 59, 75, and 95, respectively. The $2 \times 10^{-2}$ relative accuracy curve is basically flat at all values of $N$, and nearly equal to the full $d = 6\,000$. Clearly, a large basis size $N$ is not required to compute the desired eigenvalues to this low level of accuracy.
However, the method becomes increasingly useful for higher accuracy calculations, for which much larger $N$ values are required, but CPU effort (since $d$ is still 6 000) increases only modestly. In general, $K$ increases with $N$ as expected, but only up to a point, beyond which $K$ is essentially flat, due to the limitations on $r$. For the higher accuracy curves, the point at which the curve flattens is at a larger $N$. For example, at $2 \times 10^{-6}$ and $2 \times 10^{-8}$ the flattening does not occur until $N = 104\,912$ and $207\,320$, respectively.

We also looked at the 3 DOF anisotropic case where $\omega_1 = \sqrt{2}$, $\omega_2 = \sqrt{3}$, and $\omega_3 = \sqrt{5}$ a.u. [Eq. (4.9)]. For the isotropic case, the aspect ratios for each DOF ($a_1$, $a_2$, and $a_3$) are all unity, but in the anisotropic case, for optimal efficiency of the weylet basis, $a_1 = \sqrt{\omega_1}$, $a_2 = \sqrt{\omega_2}$, and $a_3 = \sqrt{\omega_3}$. The eigenvalues of the Hamiltonian operator, $\hat{H}$, are nondegenerate and equal to $\sum_{j=1}^{f} \omega_j (n_j + 1/2)$.

Comparing Figs. 4.3 and 4.4, we note that, although the isotropic and anisotropic cases demonstrate similar patterns with respect to the different relative accuracies, the isotropic calculation is slightly more accurate. For example, at $2 \times 10^{-4}$ and $2 \times 10^{-6}$ the curves in the isotropic case reach a maximum number of eigenvalues at around $K = 4\,000$ and $2\,700$, respectively, whereas the same curves in the anisotropic case flatten out at $K = 3\,500$ and $2\,400$. However, we have found that incorporating coupling into the problem does not affect the efficiency of the weylet basis. More specifically, we did a small study (data not reported) on a coupled anisotropic system [obtained by adding $\gamma(\hat{q}_1 \hat{q}_2 + \hat{q}_1 \hat{q}_3 + \hat{q}_2 \hat{q}_3)$ to Eq. (4.9) with $f = 3$ and the coupling parameter $\gamma = 0.1$] and found scarcely any difference in the number of accurately computed eigenvalues. The coupling does reduce sparsity somewhat, however, resulting in increased computation time.

We have also applied the parallel algorithm to the 6 DOF isotropic uncoupled HO system at fixed $d = 6\,000$. Fig. 4.5 indicates that two of the curves ($2 \times 10^{-3}$ and $2 \times 10^{-4}$) have monotonic behavior over the $N$ range considered, i.e., it would be fruitful to consider basis sets even larger than $N = 10^6$ here, which is not surprising, though we have not done so. With a $C$ of order $d = 6\,000$ representing $H$ at $N = 977\,789$, we were able to extract an impressive 4 419 eigenvalues at $2 \times 10^{-3}$ relative accuracy and 1 184 at $2 \times 10^{-4}$.

Another strategy for increasing accuracy that we have not considered explicitly is to increase $d$. With larger $d$, more nodes will be needed in order to accommodate increased memory requirements, although it also turns out that the largest number of iterations, $r$, is smaller. While this reduces CPU effort somewhat, it also nullifies to some extent the gain in accuracy.

The loss of rank problem might also be remedied by the addition of some useful steps in the algorithm. We feel that incorporating an orthogonalization scheme at strategic moments during the iteration step 4 could help $Z_r$ to achieve full rank. This would lead to $C$ more accurately reflecting $H$. This scheme would require some costly communication similar to that of step 5, although this would not be required after every iteration but possibly after every 10 or 20 iterations or so. For Hamiltonian matrices of size $N$ larger than what was considered in this chapter, we believe that this modification of the proposed method would be worth investigating.


Table 4.1. Number of accurately computed eigenvalues, $K$, out of the lowest $d = 6\,000$ that fall within some relative accuracy (column 1) for the 3 DOF isotropic HO ($N = 36\,083$) using either Eq. (4.2) at $r = 37$ (column 2) or the direct diagonalization of $H$ (column 3). By comparing columns 2 and 3, one can assess the additional error introduced by the parallel algorithm.

accuracy              parallel    direct
$2 \times 10^{-2}$    5 654       6 000
$2 \times 10^{-3}$    4 790       6 000
$2 \times 10^{-4}$    3 577       4 008
$2 \times 10^{-5}$    1 653       1 653
$2 \times 10^{-6}$    585         585
$2 \times 10^{-7}$    59          59
$2 \times 10^{-8}$    6           6

[Diagram: nodes arranged in a loop. For $g = 5$, nodes 0 to 4 hold vectors (1, 6, 11), (2, 7, 12), (3, 8, 13), (4, 9), and (5, 10); for $g = 6$, nodes 0 to 5 hold vectors (1, 7, 13), (2, 8), (3, 9), (4, 10), (5, 11), and (6, 12).]

Figure 4.1. Node communication setup, with the first and second columns representing the $g = 5$ and 6 cases, respectively, both with $d = 13$, where $g$ represents the number of nodes and $d$ is the dimension of the approximate ISUB, $Z_r$. The numbers inside the boxes (representing nodes) indicate which column vectors of the $N \times d$ matrix $Z_r$ are stored on each node, and the numbers outside are the labels $y$ of the nodes. The arrows designate the communication direction in each stage (the first stage is in the top row). For the $g = 6$ case, the last stage has communication in both directions between the pairs of nodes, which is further explained in Fig. 4.2.

[Diagram: two-way exchange between the node pairs {0, 3}, {1, 4}, and {2, 5}, holding vectors (1, 7, 13) and (4, 10), (2, 8) and (5, 11), and (3, 9) and (6, 12), respectively.]

Figure 4.2. Node communication setup for the last stage ($g = 6$) with $d = 13$ (from the example in Fig. 4.1). The pairs of nodes {0, 3}, {1, 4}, and {2, 5} go through a two-way communication process. The circles highlight those specific vectors that are sent in the direction of the arrow and involved in the element calculations of $C$ and $M$. For example, in the first step of the {0, 3} node pair, $z_1^{(r)}$ is sent, followed by the calculation of elements (4, 1) and (10, 1).

[Plot: Number of Eigenvalues (vertical axis, up to 6000) versus N (horizontal axis, in units of 100,000).]

Figure 4.3. Number of eigenvalues, $K$, at a relative accuracy versus $N$ for the 3 DOF isotropic HO ($d = 6\,000$). In general, the number of accurate eigenvalues increases with the growth of $N$. The solid line represents the number of eigenvalues with a relative accuracy of $2 \times 10^{-2}$, the dotted line $2 \times 10^{-4}$, the dashed line $2 \times 10^{-6}$, and the long dashed line represents the most accurate eigenvalues at $2 \times 10^{-8}$.

[Plot: Number of Eigenvalues versus N x 100,000.]

Figure 4.4. Number of eigenvalues, K, at a given relative accuracy versus N for the 3 DOF anisotropic HO (d = 6000). The setup is the same as that reported in Fig. 4.3.

[Plot: Number of Eigenvalues versus N x 100,000.]

Figure 4.5. Number of eigenvalues, K, at a given relative accuracy versus N for the 6 DOF isotropic HO (d = 6000). The solid line represents the number of eigenvalues at $2 \times 10^{-2}$ relative accuracy. The dotted and dashed lines reflect higher accuracies at $2 \times 10^{-3}$ and $2 \times 10^{-4}$, respectively.

REFERENCES

[1] G. Scoles, D. Bassi, U. Buck, and D. C. Lainé, eds., Atomic and Molecular Beam Methods (Oxford University Press, Oxford, 1988).
[2] G. W. M. Vissers, G. C. Groenenboom, and A. van der Avoird, J. Chem. Phys. 119, 277 (2003).
[3] R. E. Miller, Acc. Chem. Res. 23, 10 (1990).
[4] G. W. M. Vissers, G. C. Groenenboom, and A. van der Avoird, J. Chem. Phys. 119, 286 (2003).
[5] D. C. Dayton, K. W. Jucks, and R. E. Miller, J. Chem. Phys. 90, 2631 (1989).
[6] E. J. Bohac, M. D. Marshall, and R. E. Miller, J. Chem. Phys. 96, 6681 (1992).
[7] B. M. Smirnov, Clusters and Small Particles (Springer, New York, 2000).
[8] M. L. Mandich, AMO Physics Handbook (American Institute of Physics, 1996), chap. Clusters, p. 452.
[9] R. S. Berry, J. Jellinek, and G. Natanson, Phys. Rev. A 30, 919 (1984).
[10] P. A. Frantsuzov, D. Meluzzi, and V. A. Mandelshtam, Phys. Rev. Lett. 96, 113401 (2006).
[11] A. Stace, Science 294, 1292 (2001).
[12] P. Kebarle, Annu. Rev. Phys. Chem. 28, 445 (1977).
[13] J. K. L. MacDonald, Phys. Rev. 43, 830 (1933).
[14] Z. Bačić and J. C. Light, Annu. Rev. Phys. Chem. 40, 469 (1989).
[15] J. Montgomery and B. Poirier, J. Chem. Phys. 119, 6609 (2003).
[16] W. H. Press et al., Numerical Recipes in Fortran 77: The Art of Scientific Computing (Cambridge University Press, Cambridge, England, 2001), 2nd ed.
[17] I. P. Hamilton and J. C. Light, J. Chem. Phys. 84, 306 (1986).
[18] Z. Bačić and J. C. Light, J. Chem. Phys. 85, 4594 (1986).
[19] Z. Bačić and J. C. Light, J. Chem. Phys. 86, 3065 (1987).
[20] Z. Bačić, D. Watt, and J. C. Light, J. Chem. Phys. 89, 947 (1988).
[21] A. C. Peet, J. Chem. Phys. 90, 4363 (1989).
[22] S. Garashchuk and J. C. Light, J. Chem. Phys. 114 (2001).
[23] T. González-Lezana, J. Rubayo-Soneira, S. Miret-Artés, F. A. Gianturco, G. Delgado-Barrio, and P. Villarreal, J. Chem. Phys. 110, 9000 (1999).
[24] B. Poirier and J. C. Light, J. Chem. Phys. 113, 211 (2000).
[25] F. Gygi, Phys. Rev. B 48, 11692 (1993).
[26] F. Gygi, Phys. Rev. B 51, 11190 (1995).
[27] E. Fattal, R. Baer, and R. Kosloff, Phys. Rev. E 53, 1217 (1996).
[28] V. Kokoouline, O. Dulieu, R. Kosloff, and F. Masnou-Seeuws, J. Chem. Phys. 110, 9865 (1999).
[29] B. Poirier and J. C. Light, J. Chem. Phys. 111, 4869 (1999).
[30] M. Cargo and R. G. Littlejohn, Phys. Rev. E 65, 026703 (2002).
[31] J. M. Bowman, Comp. Phys. Commun., Special Issue on Molecular Vibrations 51, 225 (1988).
[32] H. Weyl, Z. Phys. 46, 1 (1928).
[33] E. Wigner, Phys. Rev. 40, 749 (1932).
[34] B. Poirier, Found. Phys. 30, 1191 (2000).
[35] B. Poirier, J. Theo. Comput. Chem. 2, 65 (2003).
[36] B. Poirier and A. Salam, J. Chem. Phys. 121, 1690 (2004).
[37] B. Poirier and A. Salam, J. Chem. Phys. 121, 1704 (2004).
[38] J. R. Klauder and B.-S. Skagerstam, Coherent States: Applications in Physics and Mathematical Physics (World Scientific Publishing Co., Singapore, 1985).
[39] S. Szabó, P. Adam, J. Janszky, and P. Domokos, Phys. Rev. A 53 (1996).
[40] A. Kenfack, J. M. Rost, and A. M. Ozorio de Almeida, J. Phys. B: At. Mol. Opt. Phys. 37, 1645 (2004).
[41] M. J. Davis and E. J. Heller, J. Chem. Phys. 71, 3383 (1979).
[42] J. von Neumann, Mathematical Foundations of Quantum Mechanics (Princeton University Press, New Jersey, 1932).
[43] M. Boon and J. Zak, Phys. Rev. B 18, 6744 (1978).
[44] J. Zak, J. Phys. A: Math. Gen. 34, 1063 (2001).
[45] L. K. Stergioulas and A. Vourdas, J. Mod. Opt. 45 (1998).
[46] L. K. Stergioulas, V. S. Vassiliadis, and A. Vourdas, J. Phys. A: Math. Gen. 32 (1999).
[47] L. M. Arévalo Aguilar and H. Moya-Cessa, Phys. Scr. 70, 14 (2004).
[48] D. Gabor, J. Inst. Electr. Engin. 93, 429 (1946).
[49] V. Bargmann, P. Butera, L. Girardello, and J. R. Klauder, Rep. Math. Phys. 2, 221 (1971).
[50] A. M. Perelomov, Theor. Math. Phys. 6, 156 (1971).
[51] H. Bacry, A. Grossmann, and J. Zak, Phys. Rev. B 12, 1118 (1975).
[52] P.-O. Löwdin, Adv. Phys. 5, 1 (1956).
[53] R. Balian, C. R. Acad. Sc. Paris 292, 1357 (1981).
[54] F. Low, Complete Sets of Wave-Packets (World Scientific, Singapore, 1985), pp. 17-22.
[55] K. G. Wilson, Generalized Wannier functions, Cornell University preprint, 1987.
[56] I. Daubechies, S. Jaffard, and J. Journé, SIAM J. Math. Anal. 22, 554 (1991).
[57] D. O. Harris, G. G. Engerholm, and W. D. Gwinn, J. Chem. Phys. 43, 1515 (1965).
[58] A. S. Dickinson and P. R. Certain, J. Chem. Phys. 49, 4209 (1968).
[59] J. V. Lill, G. A. Parker, and J. C. Light, Chem. Phys. Lett. 89, 483 (1982).
[60] R. W. Heather and J. C. Light, J. Chem. Phys. 79, 147 (1983).
[61] J. V. Lill, G. A. Parker, and J. C. Light, J. Chem. Phys. 85, 900 (1986).
[62] J. C. Light and T. Carrington, Jr., Adv. Chem. Phys. 114, 263 (2000).
[63] R. Dawes and T. Carrington, Jr., J. Chem. Phys. 122, 134101 (2005).
[64] M. J. Bramley and T. Carrington, Jr., J. Chem. Phys. 99, 8519 (1993).
[65] S. Carter and N. C. Handy, Comput. Phys. Rep. 5, 115 (1986).
[66] L. Halonen, D. W. Noid, and M. S. Child, J. Chem. Phys. 78, 2803 (1983).
[67] L. Halonen and M. S. Child, J. Chem. Phys. 79, 4355 (1983).
[68] A. J. Bracken, H.-D. Doebner, and J. G. Wood, Phys. Rev. Lett. 83, 3758 (1999).
[69] J. G. Wood, Ph.D. thesis in physics, University of Queensland, St. Lucia 4072, Australia (2003).
[70] R. McWeeny, Rev. Mod. Phys. 32, 335 (1960).
[71] A. H. R. Palser and D. E. Manolopoulos, Phys. Rev. B 58, 12 (1998).
[72] J. Echave and D. C. Clary, Chem. Phys. Lett. 190, 225 (1992).
[73] H. Wei and T. Carrington, Jr., J. Chem. Phys. 97, 3029 (1992).
[74] W. Bian and B. Poirier, J. Theo. Comput. Chem. 2, 583 (2003).
[75] F. Le Quéré and C. Leforestier, J. Chem. Phys. 92, 247 (1990).
[76] B. Poirier and T. Carrington, Jr., J. Chem. Phys. 114, 9254 (2001).
[77] G. H. Golub and C. F. Van Loan, Matrix Computations (Johns Hopkins University Press, Baltimore, 1989).
[78] Y. Saad, Iterative Methods for Sparse Linear Systems (PWS, Boston, 2000).
[79] R. L. Johnston, Atomic and Molecular Clusters (Taylor and Francis, London, 2002).
[80] R. J. Gdanitz, Chem. Phys. Lett. 348, 67 (2001).
[81] S. M. Cybulski and R. R. Toczylowski, J. Chem. Phys. 111, 10520 (1999).
[82] A. Wüest and F. Merkt, J. Chem. Phys. 118, 8807 (2003).
[83] G. C. Maitland, Mol. Phys. 26, 513 (1973).
[84] R. J. Le Roy, M. L. Klein, and I. J. McGee, Mol. Phys. 28, 587 (1974).
[85] R. A. Aziz, W. J. Meath, and A. R. Allnatt, Chem. Phys. 78, 295 (1983).
[86] R. A. Aziz, W. J. Meath, and A. R. Allnatt, Chem. Phys. 85, 491 (1984).
[87] R. A. Aziz and M. J. Slaman, Chem. Phys. 130, 187 (1989).
[88] Y. Tanaka and K. Yoshina, J. Chem. Phys. 57, 2964 (1972).
[89] Y. Tanaka and W. C. Walker, J. Chem. Phys. 74, 2760 (1981).
[90] D. N. Timms, A. C. Evans, M. Boninsegni, D. M. Ceperley, J. Mayers, and R. O. Simmons, J. Phys.: Condens. Matter 8, 6665 (1996).
[91] Q. Wang and J. K. Johnson, Fluid Phase Equil. 132, 93 (1997).
[92] J. P. Hansen and J. J. Weis, Phys. Rev. 188, 314 (1969).
[93] D. Thirumalai, R. W. Hall, and B. J. Berne, J. Chem. Phys. 81, 2523 (1984).
[94] B. R. Johnson, J. L. Mackey, and J. L. Kinsey, J. Comp. Phys. 168, 356 (2001).
[95] J. P. Modisette, P. Nordlander, J. L. Kinsey, and B. R. Johnson, Chem. Phys. Lett. 250, 485 (1996).
[96] A. Maloney, J. L. Kinsey, and B. R. Johnson, J. Chem. Phys. 117, 3548 (2002).
[97] J. E. Moyal, Proc. Cambridge Phil. Soc. 45, 99 (1949).
[98] M. Hillery, R. F. O'Connell, M. O. Scully, and E. P. Wigner, Phys. Rep. 106, 121 (1984).
[99] G. C. Carney, L. L. Sprandel, and C. W. Kern, Adv. Chem. Phys. 37, 305 (1978).
[100] J. M. Bowman, J. Chem. Phys. 68, 608 (1978).
[101] J. M. Bowman, Acc. Chem. Res. 19, 202 (1986).
[102] R. Lombardini and B. Poirier, J. Chem. Phys. 124, 144107 (2006).
[103] S. Flügge, Practical Quantum Mechanics (Springer-Verlag, New York, 1971), vol. 1, p. 94.
[104] H. Chen, S. Liu, and J. C. Light, J. Chem. Phys. 110, 168 (1999).
[105] C. C. Paige, J. Inst. Math. Appl. 10, 373 (1972).
[106] B. N. Parlett and D. S. Scott, Math. Comp. 33, 217 (1979).
[107] H. D. Simon, Ph.D. thesis, University of California, Berkeley (1982).
[108] H. D. Simon, Math. Comp. 42, 115 (1984).
[109] K. Wu and H. D. Simon, Tech. Rep. 41412, Lawrence Berkeley National Laboratory (1998).
[110] K. Wu and H. D. Simon, Tech. Rep. 41284, Lawrence Berkeley National Laboratory (1997).
[111] A. Ruhe, Math. Comput. 33, 680 (1979).
[112] X.-G. Wang and T. Carrington, Jr., J. Chem. Phys. 114, 1473 (2001).
[113] R. Chen and H. Guo, J. Chem. Phys. 114, 1467 (2001).
[114] C. Bischof, S. Huss-Lederman, X. Sun, and A. Tsao, in Scalable Parallel Libraries Conference (IEEE Computer Society, Washington, DC, 1994), pp. 123-131.
[115] L. Auslander and A. Tsao, Adv. Appl. Math. 13, 253 (1992).
[116] J. H. Wilkinson, The Algebraic Eigenvalue Problem (Oxford University, London, 1965).
[117] M. P. Sears, (private communication).
[118] R. G. Littlejohn, Phys. Rep. 138, 193 (1986).
[119] R. Simon, E. C. G. Sudarshan, and N. Mukunda, Phys. Rev. A 36, 3868 (1987).
[120] I. Gradshteyn and I. Ryzhik, eds., Table of Integrals, Series, and Products (Academic Press, San Diego, 2000).
[121] J. W. Demmel, Applied Numerical Linear Algebra (SIAM, Philadelphia, 1997).

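APPENDIX A
JUSTIFICATION OF SPHERICAL TRUNCATION CONDITION

To justify the rotational symmetry implicit in Eqs. (2.33) and (2.34), only the unsymmetrized Gaussian representation will be addressed (for simplicity); however, from Eq. (2.10), it is clear that the conclusions drawn here also apply to the symmetrized case. The doubly dense unsymmetrized 3 DOF Gaussians are given by

$$ g_{\mathbf{u}\mathbf{v}}(\mathbf{x}) = \left(\frac{a^2}{\pi}\right)^{3/4} \exp\left[ -\frac{i\pi}{2}\,\mathbf{u}\cdot\mathbf{v} + i\sqrt{\pi}\,a\,\mathbf{v}\cdot\mathbf{x} - \frac{a^2}{2}\left|\mathbf{x} - \frac{\sqrt{\pi}}{a}\,\mathbf{u}\right|^2 \right], \tag{A.1} $$

from which the potential matrix elements are found to be

$$ V^{g}_{\mathbf{u},\mathbf{v},\mathbf{u}',\mathbf{v}'} = \left(\frac{a^2}{\pi}\right)^{3/2} e^{\frac{i\pi}{2}(\mathbf{u}\cdot\mathbf{v} - \mathbf{u}'\cdot\mathbf{v}')}\, e^{-\frac{\pi}{4}(\mathbf{u}_-)^2} \int V(\mathbf{x})\, e^{-i\sqrt{\pi}\,a\,\mathbf{x}\cdot\mathbf{v}_-}\, e^{-a^2\left|\mathbf{x} - \frac{\sqrt{\pi}}{2a}\mathbf{u}_+\right|^2} d\mathbf{x}, \tag{A.2} $$

where $\mathbf{u}_\pm = \mathbf{u} \pm \mathbf{u}'$ and $\mathbf{v}_\pm = \mathbf{v} \pm \mathbf{v}'$. By taking the absolute value of the Eq. (A.2) integrand, one obtains an upper limit on the absolute value of the integral:

$$ \left|V^{g}_{\mathbf{u},\mathbf{v},\mathbf{u}',\mathbf{v}'}\right| \le \left(\frac{a^2}{\pi}\right)^{3/2} e^{-\frac{\pi}{4}(\mathbf{u}_-)^2} \int \left|V(\mathbf{x})\right| e^{-a^2\left|\mathbf{x} - \frac{\sqrt{\pi}}{2a}\mathbf{u}_+\right|^2} d\mathbf{x}. \tag{A.3} $$

Since the potential $V(\mathbf{x}) = V(\mathbf{x}\cdot\mathbf{x})$ is rotationally symmetric, Eq. (A.3) is invariant with respect to any rotation of the vectors $\mathbf{u}_-$ or $\mathbf{u}_+$; thus, a spherical truncation is applicable for these parameters. For $\mathbf{v}_-$, the rotational invariance can be deduced from the momentum space representation, in which the 3 DOF Gaussians are found to be

$$ g_{\mathbf{u}\mathbf{v}}(\mathbf{p}) = \frac{1}{(a^2\pi)^{3/4}}\, e^{\frac{i\pi}{2}\mathbf{u}\cdot\mathbf{v}}\, e^{-i\frac{\sqrt{\pi}}{a}\mathbf{u}\cdot\mathbf{p}}\, e^{-\frac{1}{2a^2}\left(\mathbf{p} - \sqrt{\pi}\,\mathbf{v}\,a\right)^2}. \tag{A.4} $$

The potential energy operator is represented as a function of the momentum Laplacian, i.e., $\hat{V} = V(-\nabla_p^2)$. For simplicity, we consider only the first-order $\nabla_p^2$ term, although similar conclusions can be drawn for all other orders as well. The upper limit of the matrix element integral is found to be

$$ \left|\langle g_{\mathbf{u}\mathbf{v}}|\nabla_p^2|g_{\mathbf{u}'\mathbf{v}'}\rangle\right| \le \frac{1}{(a^2\pi)^{3/2}}\, e^{-\frac{\pi}{4}(\mathbf{v}_-)^2} \int \Lambda\!\left[\mathbf{p} - \sqrt{\pi}\,\mathbf{v}'a,\ \mathbf{u}'\right] e^{-\frac{1}{a^2}\left|\mathbf{p} - \frac{\sqrt{\pi}\,a}{2}\mathbf{v}_+\right|^2} d\mathbf{p}, \tag{A.5} $$

where

$$ \Lambda[\mathbf{p}', \mathbf{u}'] = \sqrt{\left[\frac{1}{a^4}(\mathbf{p}')^2 - \frac{3}{a^2} - \frac{\pi}{a^2}(\mathbf{u}')^2\right]^2 + \frac{4\pi}{a^6}(\mathbf{u}'\cdot\mathbf{p}')^2}. \tag{A.6} $$

Since $\Lambda$ is invariant with respect to simultaneous rotation of all vectors, the same must be true of the integrand in Eq. (A.5), and also of the entire right-hand side. Thus, spherical truncation in $\mathbf{v}_-$ may also be applied.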
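The step from (A.1) to (A.2) and (A.3) rests on completing the square in the product of two Gaussian envelopes. A minimal one-dimensional numerical check is sketched below (the 3 DOF case factorizes into three such products). It assumes Gaussian centers at $\sqrt{\pi}\,u/a$, as in the convention of (A.1); the function names are illustrative only.

```python
import math


def lattice_center(u, a):
    """Gaussian center sqrt(pi) * u / a for lattice index u (assumed convention)."""
    return math.sqrt(math.pi) * u / a


def product_of_gaussians(x, x0, x0p, a):
    """Envelope of g* g': exp(-a^2/2 [(x - x0)^2 + (x - x0p)^2])."""
    return math.exp(-0.5 * a * a * ((x - x0) ** 2 + (x - x0p) ** 2))


def merged_gaussian(x, x0, x0p, a):
    """Same quantity after completing the square: one Gaussian centered at
    the midpoint, times an x-independent overlap factor."""
    center = 0.5 * (x0 + x0p)
    overlap = math.exp(-0.25 * a * a * (x0 - x0p) ** 2)
    return overlap * math.exp(-a * a * (x - center) ** 2)


if __name__ == "__main__":
    a, u, up = 1.3, 2, -1
    x0, x0p = lattice_center(u, a), lattice_center(up, a)
    for x in (-1.0, 0.2, 3.0):
        print(x, product_of_gaussians(x, x0, x0p, a), merged_gaussian(x, x0, x0p, a))
```

With centers on the lattice, the overlap factor reduces to $e^{-\frac{\pi}{4}(u - u')^2}$, which is the prefactor that makes a truncation in $\mathbf{u}_-$ effective.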

APPENDIX B
EIGENFUNCTIONS OF HARMONIC OSCILLATOR PSRO

This section proves that the PSROs $\hat{\rho}_K^{QC}$ associated with 2f-dimensional hyperspheres centered at the origin have the same eigenfunctions as the quantum HO Hamiltonian. The first part of the proof describes a specific symplectic matrix and its corresponding operator, known as a metaplectic operator. Next, it is shown that the PSRO and its Wigner-Weyl (WW) phase space representation $\rho_K^{QC}$ are invariant under this metaplectic/symplectic transformation, which ultimately determines the eigenfunctions of the PSRO.

Consider a subgroup of SO(2f) that consists of the set of 2f-dimensional rotations $R(\boldsymbol{\theta})$, where $\boldsymbol{\theta} = (\theta_1, \theta_2, \ldots, \theta_f)$, that all have a block diagonal matrix form, i.e.,

$$ R(\boldsymbol{\theta}) = R(\theta_1) \oplus R(\theta_2) \oplus \cdots \oplus R(\theta_f) = \begin{pmatrix} R(\theta_1) & & & 0 \\ & R(\theta_2) & & \\ & & \ddots & \\ 0 & & & R(\theta_f) \end{pmatrix}, \tag{B.1} $$

where

$$ R(\theta_i) = \begin{pmatrix} \cos(\theta_i) & \sin(\theta_i) \\ -\sin(\theta_i) & \cos(\theta_i) \end{pmatrix} \quad \text{and} \quad \theta_i \in [0, 2\pi). $$

If we use the subgroup to act on 2f-dimensional phase space, i.e.,

$$ R(\boldsymbol{\theta})\, z = z', \tag{B.2} $$

where $z = (q_1, p_1, \ldots, q_f, p_f)^T$ and $z'$ is the transformed phase space vector, then $R(\boldsymbol{\theta})$ can be thought of as a composition of f two-dimensional counterclockwise rotations about the origin, each rotating a pair of phase space axes $(q_j, p_j)$ by $\theta_j$, where $j = 1, \ldots, f$. One important property of these rotation matrices is that they are symplectic, i.e., $R(\boldsymbol{\theta}) \in Sp(2f, \mathbb{R})$, since they satisfy the equation

$$ R(\boldsymbol{\theta})^{T}\, \Omega\, R(\boldsymbol{\theta}) = \Omega, \qquad \Omega = \begin{pmatrix} \omega & & 0 \\ & \ddots & \\ 0 & & \omega \end{pmatrix}, \tag{B.3} $$

where

$$ \omega = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} $$

and $R(\boldsymbol{\theta})^{T}$ is the transpose of $R(\boldsymbol{\theta})$. Symplectic matrices are known in classical mechanics as transformations of canonical coordinates that leave the Poisson bracket invariant. This property carries over to quantum mechanics, where the commutator of the operators corresponding to the canonical coordinates, i.e.,

$$ [(\hat{z})_i, (\hat{z})_j]_Q = i\,\Omega_{ij}, \tag{B.4} $$

where $\hat{z} = (\hat{q}_1, \hat{p}_1, \ldots, \hat{q}_f, \hat{p}_f)^T$, is invariant under these transformations. Thus, one can write

$$ [(\hat{z}')_i, (\hat{z}')_j]_Q = [(\hat{z})_i, (\hat{z})_j]_Q, \tag{B.5} $$

where $(\hat{z}')_i = R(\boldsymbol{\theta})_{ij}\,(\hat{z})_j$ and repeated indices are implied to be summed. For every symplectic matrix there corresponds a unitary operator, which in our case will be denoted as $U[R(\boldsymbol{\theta})]$, parameterized by $R(\boldsymbol{\theta})$, where

$$ U[R(\boldsymbol{\theta})]\, (\hat{z})_i\, U[R(\boldsymbol{\theta})]^{-1} = R(\boldsymbol{\theta})_{ij}\, (\hat{z})_j. \tag{B.6} $$
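The symplectic condition (B.3) can be confirmed numerically for a single 2 x 2 block; because $R(\boldsymbol{\theta})$ and $\Omega$ are block diagonal, the f DOF statement follows block by block. The sketch below is a plain-Python check under the sign convention $R(\theta) = [[\cos\theta, \sin\theta], [-\sin\theta, \cos\theta]]$, which is the one consistent with (B.8) and (B.9); the helper names are illustrative.

```python
import math

# Single-block symplectic form, omega = [[0, 1], [-1, 0]].
OMEGA = [[0.0, 1.0], [-1.0, 0.0]]


def rotation(theta):
    """2 x 2 block R(theta) = [[cos, sin], [-sin, cos]]."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]]


def matmul(a, b):
    """Product of two small dense matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def symplectic_defect(theta):
    """Largest entry of |R^T Omega R - Omega|; zero for a symplectic R."""
    r = rotation(theta)
    lhs = matmul(matmul(transpose(r), OMEGA), r)
    return max(abs(lhs[i][j] - OMEGA[i][j]) for i in range(2) for j in range(2))


if __name__ == "__main__":
    for theta in (0.0, 0.7, 2.5, 6.0):
        print(theta, symplectic_defect(theta))
```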

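These unitary operators are known as metaplectic operators and are thoroughly reviewed in Ref. [118]. Based on the definition of $R(\boldsymbol{\theta})$ in (B.1), the corresponding metaplectic operator is

$$ U[R(\boldsymbol{\theta})] = e^{\frac{i}{2}\boldsymbol{\theta}\cdot(\hat{\mathbf{q}}^2 + \hat{\mathbf{p}}^2)} = e^{\frac{i}{2}\theta_1(\hat{q}_1^2 + \hat{p}_1^2)}\, e^{\frac{i}{2}\theta_2(\hat{q}_2^2 + \hat{p}_2^2)} \cdots e^{\frac{i}{2}\theta_f(\hat{q}_f^2 + \hat{p}_f^2)}. \tag{B.7} $$

The last expression can be verified by plugging it into (B.6) and using the Baker-Hausdorff lemma. For example,

$$ \begin{aligned} U[R(\boldsymbol{\theta})]\, \hat{q}_j\, U[R(\boldsymbol{\theta})]^{-1} &= e^{\frac{i}{2}\theta_j(\hat{q}_j^2 + \hat{p}_j^2)}\, \hat{q}_j\, e^{-\frac{i}{2}\theta_j(\hat{q}_j^2 + \hat{p}_j^2)} \\ &= \hat{q}_j + \frac{(i\theta_j/2)}{1!}\,[\hat{q}_j^2 + \hat{p}_j^2, \hat{q}_j]_Q + \frac{(i\theta_j/2)^2}{2!}\,[\hat{q}_j^2 + \hat{p}_j^2, [\hat{q}_j^2 + \hat{p}_j^2, \hat{q}_j]_Q]_Q + \ldots \\ &= \hat{q}_j \cos(\theta_j) + \hat{p}_j \sin(\theta_j), \end{aligned} \tag{B.8} $$

which is in agreement with the right-hand side of (B.6). One can also substitute $\hat{q}_j$ with $\hat{p}_j$ and get

$$ U[R(\boldsymbol{\theta})]\, \hat{p}_j\, U[R(\boldsymbol{\theta})]^{-1} = -\hat{q}_j \sin(\theta_j) + \hat{p}_j \cos(\theta_j), \tag{B.9} $$

also satisfying the symplectic/metaplectic relationship.

Let us consider the case where f = 1 and the rotation parameter for the single DOF is very small, i.e., $|\epsilon| \ll 1$. The rotation matrix can be written approximately as

$$ R(\epsilon) = \begin{pmatrix} \cos(\epsilon) & \sin(\epsilon) \\ -\sin(\epsilon) & \cos(\epsilon) \end{pmatrix} = e^{i\epsilon J} \approx 1 + i\epsilon J, \tag{B.10} $$

where

$$ J = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} $$

is the generator of the SO(2) group. The corresponding infinitesimal metaplectic operator is

$$ U[R(\epsilon)] = e^{\frac{i\epsilon}{2}(\hat{q}^2 + \hat{p}^2)} \approx 1 + \frac{i\epsilon}{2}(\hat{q}^2 + \hat{p}^2), \tag{B.11} $$

which approximately transforms a 1 DOF PSRO by the equation

$$ U[R(\epsilon)]\, \hat{\rho}_K^{QC}(\hat{q}, \hat{p})\, U[R(\epsilon)]^{-1} \approx \hat{\rho}_K^{QC}(\hat{q}, \hat{p}) + \frac{i\epsilon}{2}\,[\hat{q}^2 + \hat{p}^2, \hat{\rho}_K^{QC}(\hat{q}, \hat{p})]_Q. \tag{B.12} $$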

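Our goal is to show how the WW representation of the corresponding PSRO, $\rho_K^{QC}(q, p)$, transforms via (B.12). Let us first look at the WW transform in two forms relating the PSRO and its WW representation:

$$ \rho_K^{QC}(q, p) = \frac{1}{2\pi} \int \left\langle q - \tfrac{1}{2}q' \right| \hat{\rho}_K^{QC}(\hat{q}, \hat{p}) \left| q + \tfrac{1}{2}q' \right\rangle e^{iq'p}\, dq', \tag{B.13} $$

$$ \rho_K^{QC}(q, p) = \frac{1}{2\pi} \int \left\langle p - \tfrac{1}{2}p' \right| \hat{\rho}_K^{QC}(\hat{q}, \hat{p}) \left| p + \tfrac{1}{2}p' \right\rangle e^{-ip'q}\, dp', \tag{B.14} $$

where the matrix elements inside the integrals are configuration kernels defined in Eq. (3.10). Using (B.13) and (B.14), one can easily find the WW representations corresponding to $[\hat{q}^2, \hat{\rho}_K^{QC}(\hat{q}, \hat{p})]_Q$ and $[\hat{p}^2, \hat{\rho}_K^{QC}(\hat{q}, \hat{p})]_Q$, respectively:

$$ [\hat{q}^2, \hat{\rho}_K^{QC}(\hat{q}, \hat{p})]_Q \longleftrightarrow 2iq\, \frac{\partial}{\partial p}\, \rho_K^{QC}(q, p), \tag{B.15} $$

$$ [\hat{p}^2, \hat{\rho}_K^{QC}(\hat{q}, \hat{p})]_Q \longleftrightarrow -2ip\, \frac{\partial}{\partial q}\, \rho_K^{QC}(q, p). \tag{B.16} $$

Thus, the right-hand side of Eq. (B.12) becomes

$$ \begin{aligned} \hat{\rho}_K^{QC}(\hat{q}, \hat{p}) + \frac{i\epsilon}{2}\,[\hat{q}^2 + \hat{p}^2, \hat{\rho}_K^{QC}(\hat{q}, \hat{p})]_Q &\longleftrightarrow \rho_K^{QC}(q, p) + \epsilon p\, \frac{\partial}{\partial q}\rho_K^{QC}(q, p) - \epsilon q\, \frac{\partial}{\partial p}\rho_K^{QC}(q, p) \\ &\approx \rho_K^{QC}(q + \epsilon p,\ p - \epsilon q) \\ &= \rho_K^{QC}\!\left( (1 + i\epsilon J)_{1j}\, z_j,\ (1 + i\epsilon J)_{2j}\, z_j \right) \\ &\approx \rho_K^{QC}\!\left( R(\epsilon)_{1j}\, z_j,\ R(\epsilon)_{2j}\, z_j \right), \end{aligned} \tag{B.17} $$

where $z = (q, p)^T$.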

In the 1 DOF case, we find that the infinitesimal metaplectic transformation of the PSRO is equivalent to the infinitesimal symplectic transformation of the coordinates of the corresponding WW representation. If we now consider the rotation parameter $\theta$ to be finite, then applying N infinitesimal rotations repeatedly, each with parameter $\theta/N$ (where N is large), is the same as applying the actual metaplectic operator $U[R(\theta)]$ and symplectic matrix $R(\theta)$, since

$$ U[R(\theta)] = \lim_{N \to \infty} \left[ 1 + \frac{i\theta}{2N}(\hat{q}^2 + \hat{p}^2) \right]^N = e^{\frac{i\theta}{2}(\hat{q}^2 + \hat{p}^2)}, \tag{B.18} $$

$$ R(\theta) = \lim_{N \to \infty} \left( 1 + i\theta\, \frac{J}{N} \right)^N = e^{i\theta J}. \tag{B.19} $$
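The limit in (B.19) is easy to observe numerically. The sketch below multiplies N copies of the infinitesimal step $1 + i(\theta/N)J$ and compares the product with $R(\theta)$. It assumes $J = [[0, -i], [i, 0]]$, so that $iJ$ is the real matrix $[[0, 1], [-1, 0]]$; the function names are illustrative.

```python
import math


def matmul(a, b):
    """Product of two 2 x 2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def rotation(theta):
    """Finite rotation R(theta) = exp(i theta J) = [[cos, sin], [-sin, cos]]."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]]


def product_of_small_rotations(theta, n):
    """(1 + i (theta / n) J)^n, using i J = [[0, 1], [-1, 0]]."""
    step = [[1.0, theta / n], [-theta / n, 1.0]]
    result = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        result = matmul(result, step)
    return result


def max_error(theta, n):
    """Largest entrywise deviation between the n-step product and R(theta)."""
    approx, exact = product_of_small_rotations(theta, n), rotation(theta)
    return max(abs(approx[i][j] - exact[i][j]) for i in range(2) for j in range(2))


if __name__ == "__main__":
    for n in (10, 100, 1000, 20000):
        print(n, max_error(1.0, n))
```

The deviation shrinks roughly like $\theta^2/(2N)$, which is the expected rate for a first-order product formula.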

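Thus, our argument can be extended to transformations that are not infinitesimal. Also, the relationship remains valid when carried over to the f DOF case dealing with the rotations of (B.1) and (B.7); therefore, we can conclude that

$$ U[R(\boldsymbol{\theta})]\, \hat{\rho}_K^{QC}(\hat{\mathbf{q}}, \hat{\mathbf{p}})\, U[R(\boldsymbol{\theta})]^{-1} \longleftrightarrow \rho_K^{QC}\!\left( R(\boldsymbol{\theta})_{1j}\,(z)_j,\ R(\boldsymbol{\theta})_{2j}\,(z)_j,\ \ldots,\ R(\boldsymbol{\theta})_{(2f)j}\,(z)_j \right). \tag{B.20} $$

In general, this mapping is true for any symplectic/metaplectic pair, and a formal proof of this can be found in Ref. [119]. For the HO case, $\rho_K^{QC}(\mathbf{q}, \mathbf{p})$ describes a 2f-dimensional hypersphere centered at the origin, which is invariant under rotations about the center, i.e.,

$$ \rho_K^{QC}\!\left( R(\boldsymbol{\theta})_{1j}\,(z)_j,\ R(\boldsymbol{\theta})_{2j}\,(z)_j,\ \ldots,\ R(\boldsymbol{\theta})_{(2f)j}\,(z)_j \right) = \rho_K^{QC}(\mathbf{q}, \mathbf{p}). \tag{B.21} $$

Since the WW transform is an isomorphism, one can deduce from (B.20) and (B.21) that

$$ U[R(\boldsymbol{\theta})]\, \hat{\rho}_K^{QC}(\hat{\mathbf{q}}, \hat{\mathbf{p}})\, U[R(\boldsymbol{\theta})]^{-1} = \hat{\rho}_K^{QC}(\hat{\mathbf{q}}, \hat{\mathbf{p}}); \tag{B.22} $$

thus, $U[R(\boldsymbol{\theta})]$ and $\hat{\rho}_K^{QC}(\hat{\mathbf{q}}, \hat{\mathbf{p}})$ share the same eigenfunctions. From (B.7), we see that the eigenfunctions of $U[R(\boldsymbol{\theta})]$ are the HO eigenfunctions $|\mathbf{n}\rangle$; therefore, the eigenfunctions of $\hat{\rho}_K^{QC}(\hat{\mathbf{q}}, \hat{\mathbf{p}})$ are $|\mathbf{n}\rangle$ as well.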

APPENDIX C
EIGENVALUES OF HARMONIC OSCILLATOR PSRO

This section proves the relationship $w_{\mathbf{n}}(E_{\max}) = w_{n_S}(E_{\max})$, where $n_S = \sum_{j=1}^{f} n_j$, and provides an analytical expression for the eigenvalue. The following derivation is very similar to the proof presented in Ref. [69], although there are slight deviations. First, we want to find an analytical expression for the single DOF eigenvalue $w_n^{(1)}(E_{\max})$, which will be useful later in the proof. We start by plugging Eqs. (3.21) and (3.22) into (3.20) for the 1 DOF case, i.e.,

$$ w_n^{(1)}(E_{\max}) = \int_R W_n(q, p)\, dq\, dp = \frac{(-1)^n}{\pi} \int_R L_n[2(q^2 + p^2)]\, e^{-(q^2 + p^2)}\, dq\, dp, \tag{C.1} $$

where R is a disk of radius $\sqrt{2E_{\max}}$. The next logical step is to introduce polar coordinates $r = \sqrt{q^2 + p^2}$ and $\theta$ such that the region of integration is the disk $R = \{(r, \theta) \mid 0 \le r \le \sqrt{2E_{\max}},\ 0 \le \theta \le 2\pi\}$; thus,

$$ w_n^{(1)}(E_{\max}) = \frac{(-1)^n}{\pi} \int_0^{2\pi}\! \int_0^{\sqrt{2E_{\max}}} L_n(2r^2)\, e^{-r^2}\, r\, dr\, d\theta. \tag{C.2} $$

After integrating out the angle and substituting $2r^2$ with the variable t, Eq. (C.2) simplifies to

$$ w_n^{(1)}(E_{\max}) = \frac{(-1)^n}{2} \int_0^{4E_{\max}} L_n(t)\, e^{-t/2}\, dt. \tag{C.3} $$

From Ref. [120], there is a useful Laguerre polynomial identity:

$$ L_n(t) = \frac{d}{dt}\left[ L_n(t) - L_{n+1}(t) \right]. \tag{C.4} $$

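After plugging this equation into (C.3) and integrating by parts, we arrive at the expression

$$ \begin{aligned} w_n^{(1)}(E_{\max}) &= \frac{(-1)^n}{2}\, e^{-2E_{\max}} \left[ L_n(4E_{\max}) - L_{n+1}(4E_{\max}) \right] \\ &\quad + \frac{1}{2} \left\{ \frac{(-1)^n}{2} \int_0^{4E_{\max}} L_n(t)\, e^{-t/2}\, dt + \frac{(-1)^{n+1}}{2} \int_0^{4E_{\max}} L_{n+1}(t)\, e^{-t/2}\, dt \right\} \\ &= \frac{(-1)^n}{2}\, e^{-2E_{\max}} \left[ L_n(4E_{\max}) - L_{n+1}(4E_{\max}) \right] + \frac{1}{2} \left[ w_n^{(1)}(E_{\max}) + w_{n+1}^{(1)}(E_{\max}) \right], \end{aligned} \tag{C.5} $$

which simplifies into the recurrence relationship

$$ w_{n+1}^{(1)}(E_{\max}) = w_n^{(1)}(E_{\max}) + (-1)^{n+1}\, e^{-2E_{\max}} \left[ L_n(4E_{\max}) - L_{n+1}(4E_{\max}) \right]. \tag{C.6} $$

Given that $w_0^{(1)} = 1 - e^{-2E_{\max}}$ [see (C.3) with $L_0(t) = 1$], the closed form of the last equation is

$$ w_n^{(1)}(E_{\max}) = 1 - 2e^{-2E_{\max}} \left[ \sum_{k=0}^{n-1} (-1)^k L_k(4E_{\max}) + \frac{(-1)^n}{2}\, L_n(4E_{\max}) \right] \tag{C.7} $$

for n > 0. Let us now go back to the f DOF case, where (C.1) is now

$$ w_{\mathbf{n}}(E_{\max}) = \int_R W_{\mathbf{n}}(\mathbf{q}, \mathbf{p})\, d^f q\, d^f p = \int_R \prod_{j=1}^{f} \frac{(-1)^{n_j}}{\pi}\, L_{n_j}[2(q_j^2 + p_j^2)]\, e^{-(q_j^2 + p_j^2)}\, dq_j\, dp_j. \tag{C.8} $$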
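The single DOF closed form (C.7) can be checked against a direct numerical evaluation of the integral (C.3). The sketch below does this in plain Python; the Laguerre polynomials are generated with the standard three-term recurrence and the integral is evaluated with composite Simpson quadrature. The function names and the quadrature choice are illustrative assumptions, not part of the original derivation.

```python
import math


def laguerre(n, t):
    """L_n(t) from (k + 1) L_{k+1} = (2k + 1 - t) L_k - k L_{k-1}."""
    if n == 0:
        return 1.0
    prev, curr = 1.0, 1.0 - t
    for k in range(1, n):
        prev, curr = curr, ((2 * k + 1 - t) * curr - k * prev) / (k + 1)
    return curr


def w1_closed_form(n, e_max):
    """Single DOF eigenvalue from the closed form (C.7)."""
    x = 4.0 * e_max
    partial = sum((-1) ** k * laguerre(k, x) for k in range(n))
    bracket = partial + 0.5 * (-1) ** n * laguerre(n, x)
    return 1.0 - 2.0 * math.exp(-2.0 * e_max) * bracket


def w1_by_quadrature(n, e_max, panels=4000):
    """Single DOF eigenvalue from (C.3) via composite Simpson (even panels)."""
    upper = 4.0 * e_max
    h = upper / panels

    def integrand(t):
        return laguerre(n, t) * math.exp(-0.5 * t)

    total = integrand(0.0) + integrand(upper)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return 0.5 * (-1) ** n * total * h / 3.0


if __name__ == "__main__":
    for n in range(6):
        print(n, w1_closed_form(n, 1.5), w1_by_quadrature(n, 1.5))
```

Although (C.7) is stated for n > 0, the empty sum at n = 0 reproduces $w_0^{(1)} = 1 - e^{-2E_{\max}}$, so the same function covers that case.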

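We can use polar coordinates $\mathbf{r} = (r_1, \ldots, r_f)$ and $\boldsymbol{\theta} = (\theta_1, \ldots, \theta_f)$, where $q_j = r_j \cos\theta_j$ and $p_j = r_j \sin\theta_j$. The region of integration is the hypersphere $R = \{(\mathbf{r}, \boldsymbol{\theta}) \mid 0 \le r_1 \le \sqrt{2E_{\max}},\ 0 \le r_2 \le \sqrt{2E_{\max} - r_1^2},\ \ldots,\ 0 \le r_f \le \sqrt{2E_{\max} - r_1^2 - r_2^2 - \cdots - r_{f-1}^2},\ 0 \le \theta_1 \le 2\pi,\ 0 \le \theta_2 \le 2\pi,\ \ldots,\ 0 \le \theta_f \le 2\pi\}$; thus, (C.8) can be written as

$$ \begin{aligned} w_{\mathbf{n}}(E_{\max}) = 2^f (-1)^{n_1 + \cdots + n_f} &\int_0^{\sqrt{2E_{\max}}} L_{n_1}(2r_1^2)\, e^{-r_1^2}\, r_1 \int_0^{\sqrt{2E_{\max} - r_1^2}} L_{n_2}(2r_2^2)\, e^{-r_2^2}\, r_2 \cdots \\ &\int_0^{\sqrt{2E_{\max} - r_1^2 - \cdots - r_{f-1}^2}} L_{n_f}(2r_f^2)\, e^{-r_f^2}\, r_f\, dr_f \cdots dr_2\, dr_1, \end{aligned} \tag{C.9} $$

where all of the angles have been integrated to give $(2\pi)^f$. In order to solve for an analytical representation of this integral, we will first look at the $r_{f-1}$ and $r_f$ contributions, the two innermost integrals, i.e.,

$$ 2^2 (-1)^{n_{f-1} + n_f} \int_0^{\sqrt{2E_{\max} - r_1^2 - \cdots - r_{f-2}^2}} L_{n_{f-1}}(2r_{f-1}^2)\, e^{-r_{f-1}^2}\, r_{f-1} \int_0^{\sqrt{2E_{\max} - r_1^2 - \cdots - r_{f-1}^2}} L_{n_f}(2r_f^2)\, e^{-r_f^2}\, r_f\, dr_f\, dr_{f-1}. \tag{C.10} $$

From Eq. (C.2), we notice that we have a 1 DOF eigenvalue for the $r_f$ contribution:

$$ 2 (-1)^{n_{f-1}} \int_0^{\sqrt{2E_{\max} - r_1^2 - \cdots - r_{f-2}^2}} L_{n_{f-1}}(2r_{f-1}^2)\, e^{-r_{f-1}^2}\, r_{f-1}\, w_{n_f}^{(1)}\!\left[ E_{\max} - \tfrac{1}{2}(r_1^2 + \cdots + r_{f-1}^2) \right] dr_{f-1}. \tag{C.11} $$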

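Replacing the eigenvalue with (C.7) and substituting $2r_{f-1}^2$ with the variable u, the previous expression becomes

$$ \begin{aligned} \frac{(-1)^{n_{f-1}}}{2} \int_0^{2(2E_{\max} - r_1^2 - \cdots - r_{f-2}^2)} L_{n_{f-1}}(u)\, e^{-u/2} \Bigg\{ 1 - 2e^{-(2E_{\max} - r_1^2 - \cdots - r_{f-2}^2 - u/2)} \Bigg[ &\sum_{k=0}^{n_f - 1} (-1)^k L_k[2(2E_{\max} - r_1^2 - \cdots - r_{f-2}^2) - u] \\ &+ \frac{(-1)^{n_f}}{2}\, L_{n_f}[2(2E_{\max} - r_1^2 - \cdots - r_{f-2}^2) - u] \Bigg] \Bigg\}\, du. \end{aligned} \tag{C.12} $$

Noticing that the first term is also a 1 DOF eigenvalue and using the Laguerre identity [120]

$$ \int_0^t L_m(u)\, L_n(t - u)\, du = L_{m+n}(t) - L_{m+n+1}(t), \tag{C.13} $$

we get the expression

$$ \begin{aligned} & w_{n_{f-1}}^{(1)}(E'_{\max}) - (-1)^{n_{f-1}}\, e^{-2E'_{\max}} \sum_{k=0}^{n_f - 1} (-1)^k \left[ L_{n_{f-1}+k}(4E'_{\max}) - L_{n_{f-1}+k+1}(4E'_{\max}) \right] \\ &\quad - \frac{(-1)^{n_{f-1}+n_f}}{2}\, e^{-2E'_{\max}} \left[ L_{n_{f-1}+n_f}(4E'_{\max}) - L_{n_{f-1}+n_f+1}(4E'_{\max}) \right], \end{aligned} \tag{C.14} $$

where $E'_{\max} = E_{\max} - \tfrac{1}{2}(r_1^2 + \cdots + r_{f-2}^2)$. Replacing the first term with (C.7) and then combining summations, one can recognize, using (C.7) in reverse, that (C.14) simplifies to

$$ \frac{1}{2} \left\{ w_{n_{f-1}+n_f}^{(1)}\!\left[ E_{\max} - \tfrac{1}{2}(r_1^2 + \cdots + r_{f-2}^2) \right] + w_{n_{f-1}+n_f+1}^{(1)}\!\left[ E_{\max} - \tfrac{1}{2}(r_1^2 + \cdots + r_{f-2}^2) \right] \right\}. \tag{C.15} $$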
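The convolution identity (C.13) is the key step between (C.12) and (C.14), and it can be spot-checked numerically. The sketch below evaluates the left-hand side with composite Simpson quadrature and compares it with the right-hand side; the helper names and the quadrature are illustrative assumptions.

```python
def laguerre(n, t):
    """L_n(t) from (k + 1) L_{k+1} = (2k + 1 - t) L_k - k L_{k-1}."""
    if n == 0:
        return 1.0
    prev, curr = 1.0, 1.0 - t
    for k in range(1, n):
        prev, curr = curr, ((2 * k + 1 - t) * curr - k * prev) / (k + 1)
    return curr


def laguerre_convolution(m, n, t, panels=2000):
    """Left-hand side of (C.13) by composite Simpson quadrature on [0, t]."""
    h = t / panels

    def integrand(u):
        return laguerre(m, u) * laguerre(n, t - u)

    total = integrand(0.0) + integrand(t)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0


def laguerre_difference(m, n, t):
    """Right-hand side of (C.13): L_{m+n}(t) - L_{m+n+1}(t)."""
    return laguerre(m + n, t) - laguerre(m + n + 1, t)


if __name__ == "__main__":
    for m, n in ((0, 0), (1, 2), (2, 3), (4, 1)):
        print(m, n, laguerre_convolution(m, n, 2.5), laguerre_difference(m, n, 2.5))
```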

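Finally, plugging (C.15) into our original expression (C.9), we get

$$ \begin{aligned} w_{\mathbf{n}}(E_{\max}) = 2^{f-2} (-1)^{n_1 + \cdots + n_{f-2}} &\int_0^{\sqrt{2E_{\max}}} L_{n_1}(2r_1^2)\, e^{-r_1^2}\, r_1 \int_0^{\sqrt{2E_{\max} - r_1^2}} L_{n_2}(2r_2^2)\, e^{-r_2^2}\, r_2 \cdots \int_0^{\sqrt{2E_{\max} - r_1^2 - \cdots - r_{f-3}^2}} L_{n_{f-2}}(2r_{f-2}^2)\, e^{-r_{f-2}^2}\, r_{f-2} \\ &\times \frac{1}{2} \left\{ w_{n_{f-1}+n_f}^{(1)}\!\left[ E_{\max} - \tfrac{1}{2}(r_1^2 + \cdots + r_{f-2}^2) \right] + w_{n_{f-1}+n_f+1}^{(1)}\!\left[ E_{\max} - \tfrac{1}{2}(r_1^2 + \cdots + r_{f-2}^2) \right] \right\} dr_{f-2} \cdots dr_2\, dr_1. \end{aligned} \tag{C.16} $$

Using steps (C.11)-(C.15) for each eigenvalue term, we can eliminate the innermost integral (the $r_{f-2}$ contribution) to give the simplified expression

$$ \frac{1}{2^2} \left[ w_{n_{f-2}+n_{f-1}+n_f}^{(1)}(E''_{\max}) + 2\, w_{n_{f-2}+n_{f-1}+n_f+1}^{(1)}(E''_{\max}) + w_{n_{f-2}+n_{f-1}+n_f+2}^{(1)}(E''_{\max}) \right], \tag{C.17} $$

where $E''_{\max} = E_{\max} - \tfrac{1}{2}(r_1^2 + \cdots + r_{f-3}^2)$. If one repeats this process in order to eliminate all of the integrals in Eq. (C.9), the final equation becomes

$$ w_{\mathbf{n}}(E_{\max}) = \frac{1}{2^{f-1}} \sum_{k=0}^{f-1} \binom{f-1}{k}\, w_{n_S + k}^{(1)}(E_{\max}). \tag{C.18} $$

Note that $w_{\mathbf{n}}(E_{\max})$ depends on $\mathbf{n}$ only through $n_S$; thus, we can write $w_{\mathbf{n}}(E_{\max}) = w_{n_S}(E_{\max})$.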
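The final result (C.18) combines single DOF eigenvalues from (C.7) with binomial weights. A minimal sketch of that evaluation is given below; the function names are illustrative. As an independent sanity check (an assumption added here, not part of the original proof), the ground state case $\mathbf{n} = \mathbf{0}$ can be compared with the weight of the Gaussian Wigner function $\pi^{-f} e^{-|z|^2}$ inside the hypersphere $|z|^2 \le 2E_{\max}$, which for f = 2 equals $1 - e^{-2E_{\max}}(1 + 2E_{\max})$.

```python
import math


def laguerre(n, t):
    """L_n(t) from (k + 1) L_{k+1} = (2k + 1 - t) L_k - k L_{k-1}."""
    if n == 0:
        return 1.0
    prev, curr = 1.0, 1.0 - t
    for k in range(1, n):
        prev, curr = curr, ((2 * k + 1 - t) * curr - k * prev) / (k + 1)
    return curr


def w1(n, e_max):
    """Single DOF eigenvalue, closed form (C.7)."""
    x = 4.0 * e_max
    partial = sum((-1) ** k * laguerre(k, x) for k in range(n))
    bracket = partial + 0.5 * (-1) ** n * laguerre(n, x)
    return 1.0 - 2.0 * math.exp(-2.0 * e_max) * bracket


def w_multi(n_vec, e_max):
    """f DOF eigenvalue from (C.18); depends on n_vec only through its sum."""
    f = len(n_vec)
    n_s = sum(n_vec)
    total = sum(math.comb(f - 1, k) * w1(n_s + k, e_max) for k in range(f))
    return total / 2 ** (f - 1)


if __name__ == "__main__":
    e_max = 1.2
    print(w_multi((0, 0), e_max), 1.0 - math.exp(-2 * e_max) * (1 + 2 * e_max))
    print(w_multi((1, 2), e_max), w_multi((3, 0), e_max))
```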

APPENDIX D
JUSTIFICATION OF PREPROCESSING AND SUBSPACE ITERATION METHOD

The symmetric and real matrix, $H'$, can be diagonalized via some orthogonal similarity transformation, i.e.,

$$ H' = F \Lambda F^T, \tag{D.1} $$

where the columns of F are the orthonormal eigenvectors of $H'$, and $\Lambda$ is a diagonal matrix containing the corresponding eigenvalues. To simplify the proof, we arrange the columns of F such that the eigenvalues in $\Lambda$ are listed in descending order, starting from the top-left corner ($H'$ is similarly reordered). Thus, $\Lambda$ can be rewritten as

$$ \Lambda = \begin{pmatrix} \Lambda_u & 0 \\ 0 & \Lambda_l \end{pmatrix}, \tag{D.2} $$

where $\Lambda_u$ is the $d \times d$ diagonal matrix that contains the eigenvalues ranging from 1 to 2 and $\Lambda_l$ is the $(N - d) \times (N - d)$ diagonal matrix with the remaining eigenvalues ranging from 0 to 1. Since $F^T F = I$, then

$$ (H')^r Z_0 = F \Lambda^r F^T Z_0 = F \begin{pmatrix} (\Lambda_u)^r & 0 \\ 0 & (\Lambda_l)^r \end{pmatrix} F^T Z_0. \tag{D.3} $$

We can rewrite Eq. (D.3) as

$$ (H')^r Z_0 = F \begin{pmatrix} V^{d \times d} \\ W^{(N-d) \times d} \end{pmatrix}, \tag{D.4} $$

where each row of $V^{d \times d}$ and $W^{(N-d) \times d}$ is a multiple of a diagonal element of $(\Lambda_u)^r$ and $(\Lambda_l)^r$, respectively. We can also write F in a split form, i.e.,

$$ F = \begin{pmatrix} F_u^{N \times d} & F_l^{N \times (N-d)} \end{pmatrix}, \tag{D.5} $$

where the column eigenvectors of $F_u^{N \times d}$ and $F_l^{N \times (N-d)}$ correspond to the eigenvalues in $\Lambda_u$ and $\Lambda_l$, respectively. With Eqs. (D.4) and (D.5), a useful expression can be realized:

$$ (H')^r Z_0 = F_u^{N \times d}\, V^{d \times d} + F_l^{N \times (N-d)}\, W^{(N-d) \times d}. \tag{D.6} $$

Since the elements in $(\Lambda_l)^r$ approach 0 as $r \to \infty$, then

$$ (H')^r Z_0 \approx F_u^{N \times d}\, V^{d \times d} \tag{D.7} $$

as the iteration number r increases. Since $V^{d \times d}$ has full rank, the space spanned by the column vectors of $F_u^{N \times d} V^{d \times d}$ is the same as that spanned by $F_u^{N \times d}$. Thus, after many iterations, one approaches an ISUB spanned by the eigenvectors that correspond to eigenvalues of $H'$ that range between 1 and 2. These eigenvectors are exactly the same as those of the original matrix H with eigenvalues below the cutoff set in the preprocessing step. This derivation is based upon ideas obtained from Ref. [121].
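The convergence argument of (D.3)-(D.7) can be illustrated on a tiny example. The sketch below builds a 3 x 3 matrix with prescribed eigenvalues 1.9 and 1.4 (the "upper" block, d = 2) and 0.3 (the "lower" block), applies it repeatedly to two starting vectors, and measures how much of the lower eigenvector survives. All numbers and names are made up for illustration. The Gram-Schmidt step after each multiplication is an added assumption for numerical stability; it rescales and recombines the columns but leaves their span, which is what (D.7) describes, unchanged.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(m, v):
    return [dot(row, v) for row in m]


def orthonormalize(vectors):
    """Classical Gram-Schmidt on a list of vectors (span is preserved)."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        basis.append([wi / norm for wi in w])
    return basis


def build_matrix(eigvals, eigvecs):
    """H' = sum_i lambda_i f_i f_i^T for orthonormal eigenvectors f_i."""
    n = len(eigvecs[0])
    return [[sum(lam * f[i] * f[j] for lam, f in zip(eigvals, eigvecs))
             for j in range(n)] for i in range(n)]


def subspace_iteration(h, start, iterations):
    """Repeatedly apply h to the block of vectors, re-orthonormalizing each time."""
    block = orthonormalize(start)
    for _ in range(iterations):
        block = orthonormalize([matvec(h, v) for v in block])
    return block


# Illustrative data: orthonormal eigenvectors and eigenvalues in [0, 2].
EIGVECS = [
    [1 / math.sqrt(3), 1 / math.sqrt(3), 1 / math.sqrt(3)],
    [1 / math.sqrt(2), -1 / math.sqrt(2), 0.0],
    [1 / math.sqrt(6), 1 / math.sqrt(6), -2 / math.sqrt(6)],
]
EIGVALS = [1.9, 1.4, 0.3]
H_PRIME = build_matrix(EIGVALS, EIGVECS)
Z0 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # two starting column vectors


def leakage(block):
    """Largest overlap of the block with the lower eigenvector (eigenvalue 0.3)."""
    return max(abs(dot(v, EIGVECS[2])) for v in block)


if __name__ == "__main__":
    for r in (0, 5, 20, 60):
        print(r, leakage(subspace_iteration(H_PRIME, Z0, r)))
```

The overlap with the lower eigenvector decays roughly like $(0.3/1.4)^r$, in line with the statement that the elements of $(\Lambda_l)^r$ become negligible relative to those of $(\Lambda_u)^r$.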
