Molecular wavefunctions in critic2

Introduction

Molecular wavefunctions and structures can be loaded in critic2 using the MOLECULE and LOAD keywords. There is no standard file format for the description of a molecular wavefunction and, even worse, wavefunction files whose format is supposedly the same are written by different programs in subtly different ways, leading to the erroneous calculation of molecular properties, and a great deal of confusion. The hope is that these notes serve to clarify some of these issues and also report how critic2 understands the wavefunction files written by various programs.

Relevant Equations for Gaussian Functions

Molecular spinorbitals (MO) are written as linear combination of atomic orbitals (AO): \begin{equation} \psi^\sigma_i({\bf r}) = \sum_j c^\sigma_j \chi_j({\bf r}) \end{equation} Each MO has an associated occupation \(f_i^\sigma\), such that the electron density (for instance) is: \begin{equation} \rho({\bf r}) = \sum_\sigma \sum_i^{\rm occ} f_i^\sigma \lvert\psi_i^\sigma({\bf r})\rvert^2 \end{equation} The values of \(f_i^\sigma\) go from zero to one but, in the case of a restricted or restricted-open wavefunction, we can have \(f_i^\sigma = 2\) and the sum goes over spatial orbitals instead. Wavefunctions written in terms of natural orbitals (for instance, generated by a correlated wavefunction method) have fractional occupations. The molecular orbitals are normalized in such a way that the electron density integrates to the number of electrons.

The most natural way of reporting the molecular orbitals \(c_j\) in wavefunction files is to have them combine with AOs that are normalized, such that the overlap matrix has a diagonal of ones. The AOs themselves are written as a linear combination of primitives, which are Gaussian-type functions (GTFs): \begin{equation} \chi_j({\bf r}) = N_{\rm AO}({\bf d},{\bf \alpha},l_x,l_y,l_z) \sum_k d_k^j g(\alpha_k,l_x,l_y,l_z;{\bf r}) \end{equation} where: \begin{equation} g(\alpha_k,l_x,l_y,l_z;{\bf r}) = N_{\rm GTF}(l_x,l_y,l_z,\alpha_k) x^{l_x}y^{l_y}z^{l_z} e^{-\alpha_k r^2} \end{equation} The \(l_x\), \(l_y\), and \(l_z\) quantities are integers greater than or equal to zero, representing the angular momentum quantum number of the GTF. All the primitives that make up one AO share the same angular momentum numbers, but have different exponents. The \(N_{\rm GTF}\) factor in front is the normalization factor of the GTF, and it depends on the angular momentum numbers and on the exponent.

We now calculate the normalization constant for a single GTF. The GTF can be written as the product of three one-dimensional Gaussian functions: \begin{equation} g(\alpha_k,l_x,l_y,l_z;{\bf r}) = g(\alpha_k,l_x;x)\times g(\alpha_k,l_y;y)\times g(\alpha_k,l_z;z) \end{equation} where: \begin{equation} g(\alpha,l_x;x) = N_1(\alpha,l_x) x^{l_x} e^{-\alpha x^2} \end{equation} The normalization factor can be derived easily using the integral: \begin{equation} \int_{-\infty}^{\infty} x^{2k} e^{-\alpha x^2} dx = \frac{(2k-1)!!}{(2\alpha)^k}\sqrt{\frac{\pi}{\alpha}} \end{equation} from where we have: \begin{equation} \langle g(\alpha_i,l_x;x) | g(\alpha_i,l_x;x) \rangle = N_1(\alpha,l_x)^2 \frac{(2l_x-1)!!}{(4\alpha)^{l_x}} \sqrt{\frac{\pi}{2\alpha}} \end{equation} \begin{equation} \langle g(\alpha_i,l_x;x) | g(\alpha_j,l_x;x) \rangle = N_1(\alpha_i,l_x)N_1(\alpha_j,l_x) \frac{(2l_x-1)!!}{(2(\alpha_i+\alpha_j))^{l_x}} \sqrt{\frac{\pi}{\alpha_i+\alpha_j}} \end{equation} Making the first expression equal to one, we have the normalization factor: \begin{equation} N_1(\alpha,l_x) = \frac{2^{l_x+1/4} \alpha^{l_x/2+1/4}}{\pi^{1/4} \sqrt{(2l_x-1)!!}} \end{equation} and the overlap between two Gaussian functions with different exponent but same angular momentum is: \begin{equation} \langle g(\alpha_i,l_x;x) | g(\alpha_j,l_x;x) \rangle = 2^{l_x+1/2} \frac{(\alpha_i\alpha_j)^{l_x/2+1/4}}{(\alpha_i+\alpha_j)^{l_x+1/2}} \end{equation}

For the three-dimensional GTFs, we have that the normalization factor is the product of the three one-dimensional \(N_{\rm GTF}\): \begin{equation} N_{\rm GTF}(l_x,l_y,l_z,\alpha_k) = \frac{2^{l+3/4} \alpha^{l/2+3/4}}{\pi^{3/4} \sqrt{(2l_x-1)!!(2l_y-1)!!(2l_z-1)!!}} \end{equation} where \(l = l_x + l_y + l_z\). The overlap between two GTFs with different exponents but same \(l\) and center is: \begin{equation} \langle \alpha_i | \alpha_j \rangle = 2^{l+3/2} \frac{(\alpha_i\alpha_j)^{l/2+3/4}}{(\alpha_i+\alpha_j)^{l+3/2}} \end{equation} With this result, we can now find the normalization factor for a single atomic orbital: \begin{equation} \langle\chi|\chi\rangle = N_{\rm AO}({\bf d},{\bf \alpha},l_x,l_y,l_z)^2 2^{l+3/2} \sum_{ij} d_i d_j \frac{(\alpha_i\alpha_j)^{l/2+3/4}}{(\alpha_i+\alpha_j)^{l+3/2}} = 1 \end{equation} Solving for the normalization constant: \begin{equation} N_{\rm AO}({\bf d},{\bf \alpha},l_x,l_y,l_z) = \left[2^{l+3/2} \sum_{ij} d_i d_j \frac{(\alpha_i\alpha_j)^{l/2+3/4}}{(\alpha_i+\alpha_j)^{l+3/2}} \right]^{-1/2} \end{equation} Alternatively, we can insert the normalization constants of both GTFs in the sum to give: \begin{equation} N_{\rm AO}({\bf d},{\bf \alpha},l_x,l_y,l_z) = \left[ \frac{(2l_x-1)!!(2l_y-1)!!(2l_z-1)!! \pi^{3/2}}{2^l} \sum_{ij} \frac{d_i d_j N_{\rm GTF}^i N_{\rm GTF}^j}{(\alpha_i+\alpha_j)^{l+3/2}} \right]^{-1/2} \end{equation}

Wavefunction Data Structure in Critic2

For the purpose of calculating the electron density and other fields at given points in space, critic2 does not need the atomic orbital information. Instead, the MO coefficients are multiplied by the contraction coefficients and a list of coefficient/exponent pairs is generated. Specifically, critic2 writes a given molecular orbital as a sum over primitive GTFs: \begin{equation} \psi^\sigma_i({\bf r}) = \sum_k C_{ki}^\sigma x^{l^k_x}y^{l^k_y}z^{l^k_z} e^{-\alpha_k r^2} \end{equation} where the \(C_{ki}^\sigma\) are a combination of the MO coefficients \(c\), the contraction coefficients \(d\) and the normalization constants: \begin{equation} C^\sigma_{ki} = c^\sigma_{ki} \times d_k \times N_{\rm AO}({\bf d},{\bf \alpha},l^k_x,l^k_y,l^k_z) \times N_{\rm GTF}(l^k_x,l^k_y,l^k_z,\alpha_k) \end{equation} Note that the same \(c^\sigma_{ki}\) is repeated in different \(k\)s, since the same MO coefficient multiplies all primitive functions in the same AO. Also, note that the normalization constant for the AO depends on the contraction coefficients and exponents of the whole shell, not just primitive \(k\).

In practice, the following is the basic information critic2 uses to calculate the molecular scalar fields:

  • Atomic identifier for the primitives’ centers (icenter).

  • The “type” of primitive, a combined integer index that represent the three values \(l_x\), \(l_y\), and \(l_z\) (itype).

  • Primitive exponents (e).

  • Molecular orbital occupations (occ).

  • The \(C^\sigma_{ki}\) coefficients (cmo).

A lot of additional information like the spins, wavefunction type, etc. is also written down by critic2, but these are the basic variables of the wavefunction type.

Order of Primitive Types Within the Same Shell

Wavefunction file formats that give the AO exponents, AO coefficients, and the MO coefficients do not strictly contain all the information necessary to reproduce the wavefunction. The reason is that the AO exponents and coefficients are given per shell, but there is no information regarding how the AOs within that shell are ordered.

For example, the hydrogen atom in the cc-pVTZ basis set has a d shell with a single primitive. This primitive’s exponent is 1.057 and the coefficient is 1.0. In a given calculation using Cartesian primitives, there are six MO coefficients associated with this shell, corresponding to the AOs with angular parts xx, xy, xz, yy, yz, and zz. To which of this six angular parts do those six coefficients correspond? The wavefunction file gives no indication how this question is answered, and different programs have different orders for their primitives. If the source of the wavefunction file is a closed-source program, this order has to be deduced using model molecules made specifically for this purpose.

The Cartesian primitive order in critic2 is:

  • p shell: x, y, z.

  • d shell: xx, yy, zz, xy, xz, yz.

  • f shell: xxx, yyy, zzz, xyy, xxy, xxz, xzz, yzz, yyz, xyz.

  • g shell: xxxx, yyyy, zzzz, xxxy, xxxz, xyyy, yyyz, xzzz, yzzz, xxyy, xxzz, yyzz, xxyz, xyyz, xyzz.

Most programs do not follow this order. Translating the MO coefficients from the source order to the critic2 order is done using a permutation vector, typtrans. This vector transforms the integer type of the original primitive into the value it has in critic2.

Spherical and Cartesian primitives

The primitive GTFs used in the description of the MOs can also be given as spherical GTFs: \begin{equation} g(\alpha,l,m;{\bf r}) = S_{lm}({\bf r}) e^{-\alpha_k r^2} \end{equation} where \(S_{lm}\) is a real solid harmonic function: \begin{equation} S_{lm} = \sqrt{2} {\rm Im}\left[ r^l Y_l^{|m|}\right] \quad {\rm if\ } m < 0
\end{equation} \begin{equation} S_{lm} = Y_l^0 \quad {\rm if\ } m = 0
\end{equation} \begin{equation} S_{lm} = \sqrt{2} {\rm Re}\left[ r^l Y_l^m\right] \quad {\rm if\ } m > 0
\end{equation} where the \(Y_l^m\) are spherical harmonics.

The transformation between spherical and Cartesian primitives can be carried out by replacing the \(2l+1\) MO coefficients corresponding to the spherical primitives with the \({\binom {3+l-1} l}\) MO coefficients for the Cartesian primitives. Since the \(S_{lm}\) functions can be written as a linear combination of Cartesian monomials of degree \(l\), the transformation between spherical and Cartesian MO coefficients involves a simple multiplication of the coefficients for a given shell, arranged in vector form, with the transformation matrix between spherical and Cartesian GTFs. This transformation matrix is given in Schlegel and Frisch, Int. J. Quantum Chem. 54 (1995) 83. In particular, the coefficients can be extracted from Table 1 (note the table gives \(r^l Y_l^m\) instead of \(S_{lm}\)). They can also be calculated numerically or analytically from equation 15. The numerical calculation of these coefficients has been implemented in this script.

SP shells

SP-shells are s and p shells that share the same number of primitives and contraction exponents but have different coefficients. They are used in Pople’s basis sets (e.g. 6-31G*). Some wavefunction file formats can give a shell type as SP. When that happens, two sets of contraction coefficients are given that correspond to the s and p shells. If an sp shell is found, critic2 simply unfolds it into the corresponding s and p shells, which are then independent of each other. This is slightly inefficient, but it is by far the simplest solution.

Wavefunction File Formats in Critic2

Current Implementation Status

Program Format Works with at least these versions…
Gaussian wfn g03(E.01), g09(A.02,B.01,C.01,D.01,E.01), g16(A.03,B.01,C.01)
  wfx g09(B.01,C.01,D.01,E.01), g16(A.03,B.01,C.01)
  fchk g03(E.01), g09(A.02,B.01,C.01,D.01,E.01), g16(A.03,B.01,C.01)
psi4 molden 1.1, 1.4
orca wfn 4.0
  wfx 4.2
  molden 4.0, 4.2

The limitations for each of these formats are listed below, along with notes on their particularities. Primitives with angular momentum of h and above are not supported at present, but can be implemented if necessary. Other programs may work with critic2 if they write these formats, but you should test that the results are correct (e.g. by integrating the molecular charge using MOLCALC "$1").

Gaussian Wfn and Wfx File Formats (wfn, wfx)

The Gaussian wfn and wfx file formats contain exactly the information described above, so reading the wavefunction information from them is straightforward (versioning and formatting problems notwithstanding).

Orca Wfn File

The structure of these files seems to be the same as in Gaussian, but early versions of orca are a little finnicky. In particular, version 4.0.2 has the following issues:

  • There can be lines such as CENTRE ASSIGNMENT that have no information associated with them, and this can throw off the parser in critic2.

  • g orbitals are not written to orca wfn files.

  • Unrestricted wavefunctions are not written.

Some or all of these issues seem to have been fixed in version 4.2.

Orca Wfx File

These are written by orca 4.2 but not by orca 4.0.2. They seem to follow the same format as Gaussian and no problems reading these files have been found. Orca seems to refuse writing a wfx file if the calculation contains ECPs.

Gaussian Formatted Checkpoint File (fchk)

  • The primitive order is:

    • p shell: x, y, z.

    • d shell: xx, yy, zz, xy, xz, yz.

    • f shell: xxx, yyy, zzz, xyy, xxy, xxz, xzz, yzz, yyz, xyz.

    • g shell: zzzz, yzzz, yyzz, yyyz, yyyy, xzzz, xyzz, xyyz, xyyy, xxzz, xxyz, xxyy, xxxz, xxxy, xxxx.

  • The fchk file may contain SP shells. These are indicated as shell type -1.

  • Spherical and Cartesian shells are possible in a fchk file. Cartesian shells are indicated by a positive type (e.g. +2 is a Cartesian d shell) and spherical shells by a negative type (-2 is a spherical d shell).

  • The contraction coefficients are not multiplied by any normalization constant.

Molden files (molden)

The molden website, where the specifcation of this format was given, seems to be down at present. The following lists what is implemented in critic2 at the moment, and I have no way of checking whether these follow the “official” molden format or not. But each program seems to have its own individual opinion on how to write these files anyway. In the following, the particular version of the molden file written by a program is called its “dialect”.

  • If there is no indication from the user regarding which dialect the molden file uses, critic2 tries to guess the dialect. If the guess fails, critic2 uses the psi4 dialect by default.

  • The primitive order is:

    • p shell: x, y, z.

    • d shell: xx, yy, zz, xy, xz, yz.

    • f shell: xxx, yyy, zzz, xyy, xxy, xxz, xzz, yzz, yyz, xyz.

    • g shell: xxxx, yyyy, zzzz, xxxy, xxxz, xyyy, yyyz, xzzz, yzzz, xxyy, xxzz, yyzz, xxyz, xyyz, xyzz.

  • Spherical primitives are possible in a molden file. They are indicated by the “5d”, “7f”, “5d7f”, “5d10f”, and “9g” keywords, which is somewhat confusing. If none of these keywords are present, all primitives are assumed Cartesian.

    • “5d” and “5d7f” are equivalent, and they indicate that d and f shells are spherical.

    • “5d10f”: the d shells are spherical but the f shells are Cartesian.

    • “7f”: the f shells are spherical but the d shells are Cartesian.

    • “9g”: the g shells are spherical.

  • ROHF and fractional-occupation wavefunctions are currently not supported by critic2 with molden files.

  • SP shells are not implemented.

Psi4 Molden File

  • Psi4 does not write SP shells. The SP shells are already unpacked into the corresponding s and p shells in the molden file.

  • The contraction coefficients are not multiplied by any normalization constant.

  • The molecular coordinates in the molden file are different from the input file (the molecule is, at least, translated).

  • Either due to a bug in critic2, in psi4, or a misunderstanding regarding the molden format, psi4 molden files with Cartesian primitives do not to work properly in critic2.

  • Psi4 does not write h primitives to the molden file (but the file is still generated!).

Orca Molden File

  • Critic2 will use the orca molden dialect if no indication is given by the user and the words “created by orca_2mkl” appear under the [TITLE] tag.

  • Orca does not write SP shells. The SP shells are already unpacked into the corresponding s and p shells in the molden file.

  • Orca writes spherical primitives only. The “5d”, etc. keywords are ignored in an orca molden file.

  • The MO coefficients corresponding to shells with angular momentum f and above have flipped sign.

  • The contraction coefficients are multiplied by a constant that is different from the normalization constant of a GTF. In particular, if the shell has angular momentum \(l\), all contraction coefficients in the molden file have been multiplied by: \begin{equation} N(l,\alpha_k) = \frac{2^{l+3/4} \alpha_k^{l/2+3/4}}{\pi^{3/4} \sqrt{(2l-1)!!}} \end{equation} where \(\alpha_k\) are their respective exponents. This is the normalization factor for the AOs with angular terms \(x^l\), \(y^l\), and \(z^l\) within that shell.

ADF Molden File (with STOs)

  • Critic2 will use the ADF molden dialect if the [STO] tag appears in it. Critic2 can handle the STO wavefunction and calculate all the usual properties. Naturally, the equations above do not apply.

  • Since the atomic orbitals are simple exponentials, there are no contractions and the angular momentum of each AO is indicated explicitly. Therefore, reading this input file is a lot more straightforward than the GTF counterparts.