^{1}, Jason D. Goodpaster^{1}, Frederick R. Manby^{2} and Thomas F. Miller III^{1,a)}

### Abstract

Density functional theory (DFT) provides a formally exact framework for performing embedded subsystem electronic structure calculations, including DFT-in-DFT and wavefunction theory-in-DFT descriptions. In the interest of efficiency, it is desirable to truncate the atomic orbital basis set in which the subsystem calculation is performed, thus avoiding high-order scaling with respect to the size of the MO virtual space. In this study, we extend a recently introduced projection-based embedding method [F. R. Manby, M. Stella, J. D. Goodpaster, and T. F. Miller III, J. Chem. Theory Comput. 8, 2564 (2012); doi:10.1021/ct300544e] to allow for the systematic and accurate truncation of the embedded subsystem basis set. The approach is applied to both covalently and non-covalently bound test cases, including water clusters and polypeptide chains, and it is demonstrated that errors associated with basis set truncation are controllable to well within chemical accuracy. Furthermore, we show that this approach allows for switching between accurate projection-based embedding and DFT embedding with approximate kinetic energy (KE) functionals; in this sense, the approach provides a means of systematically improving upon the use of approximate KE functionals in DFT embedding.
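As a rough illustration of the projection idea behind the embedding method cited above, the following Python sketch builds a level-shift projector from the occupied orbitals of the environment (subsystem B) and adds it to a Fock matrix. Those orbitals are then pushed up in energy by a parameter μ, which keeps the embedded subsystem orthogonal to them. This is a minimal sketch, not the authors' implementation: the function names, the random test matrices, the default value of μ, and the normalisation of the projector (one unit of shift per occupied orbital) are assumptions made for illustration.

```python
import numpy as np


def generalized_eigh(F, S):
    """Solve F C = S C eps by Loewdin (symmetric) orthogonalisation."""
    s_vals, s_vecs = np.linalg.eigh(S)
    X = s_vecs @ np.diag(s_vals ** -0.5) @ s_vecs.T  # X = S^(-1/2)
    eps, U = np.linalg.eigh(X.T @ F @ X)
    return eps, X @ U  # columns of C satisfy C.T @ S @ C = identity


def level_shift_projector(S, C_B_occ):
    """Hypothetical helper: P_B = S gamma_B S with gamma_B = C_B C_B^T."""
    gamma_B = C_B_occ @ C_B_occ.T  # assumed normalisation: no occupation factor
    return S @ gamma_B @ S


def embedded_fock(F, P_B, mu=1.0e6):
    """Add the level shift mu * P_B to the Fock matrix F (mu is illustrative)."""
    return F + mu * P_B


if __name__ == "__main__":
    # Toy data only: a random symmetric "Fock" matrix and a positive-definite overlap.
    rng = np.random.default_rng(0)
    n, n_B = 6, 2
    A = rng.normal(size=(n, n))
    S = np.eye(n) + 0.1 * (A @ A.T) / n
    B = rng.normal(size=(n, n))
    F = 0.5 * (B + B.T)

    eps, C = generalized_eigh(F, S)
    P_B = level_shift_projector(S, C[:, :n_B])  # treat the lowest n_B orbitals as subsystem B
    eps_emb, _ = generalized_eigh(embedded_fock(F, P_B, mu=1.0e3), S)
    print("original eigenvalues:", eps)
    print("shifted eigenvalues: ", eps_emb)
```

In this toy setting the orbitals assigned to B reappear with their energies raised by μ, while the remaining eigenvalues are unchanged. That is the property a large level shift relies on to exclude the environment orbitals from the embedded calculation.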

This work is supported by the U. S. Army Research Laboratory and the U. S. Army Research Office (USARO) under Grant No. W911NF-10-1-0202, by the Air Force Office of Scientific Research (USAFOSR) under Grant No. FA9550-11-1-0288, and by the Office of Naval Research (ONR) under Grant No. N00014-10-1-0884. T.A.B. acknowledges support from a National Defense Science and Engineering Graduate Fellowship, and T.F.M. acknowledges support from a Camille and Henry Dreyfus Foundation New Faculty Award and an Alfred P. Sloan Foundation Research Fellowship.

I. INTRODUCTION

II. PROJECTION-BASED EMBEDDING

III. AO BASIS SET TRUNCATION

A. The challenges of AO basis set truncation

B. An improved AO basis set truncation algorithm

C. Switching between orbital projection and approximation of the non-additive kinetic potential (NAKP)

IV. APPLICATIONS

A. WFT-in-HF truncated embedding for polypeptides

B. Embedded MBE

1. Water hexamers

2. Polypeptides

V. CONCLUSIONS

### Key Topics

- Many-body expansion
- Density functional theory
- Basis sets
- Chemical bonds
- Polymers

## Figures

(a) The BK-1 water hexamer, with molecule numbering indicated. (b) Illustration of the atom sets defined in Sec. III B, with one possible choice of the active, border, and distant atoms indicated.

(a) HF-in-HF embedding error for molecule 1 of the BK-1 water hexamer. The solid curve provides the supermolecular embedding results, while the results of naive truncation of the AO basis set are shown in the dashed curve. Also shown is the effect of partitioning the projection operator for HF-in-HF embedding in the truncated basis set, with either {μ′, μ″} = {μ, 0} (dashed-dotted) or {μ′, μ″} = {10^{6}, μ} (crosses). (b) The corresponding truncation error for the CCSD(T)-in-HF truncated embedding calculations.

(a) The number of MOs assigned to as a function of τ for the BK-1 conformation of the water hexamer. The sets of active and border atoms correspond to the case shown in Fig. 1(b). (b) The absolute error in the HF-in-HF embedding calculation as a function of τ. The data point on the far right is equivalent to the dashed-dotted curve in Fig. 2(a) at μ′ = 10^{6}, while the data point on the far left is equivalent to the cross at μ″ = 10^{6}. Thus changing τ corresponds to switching between the dashed-dotted curve and the set of crosses in Fig. 2(a). (c) The absolute error in the HF-in-HF embedding calculation as a function of the border atom cutoff, R_{O-O}.

(a) The effect of varying R_{O-O} on the HF-in-HF/cc-pVDZ embedding energy of the BK-1 conformation of the water hexamer. Each curve corresponds to assigning the constituent atoms of the indicated molecule as the set of active atoms. For a cutoff of 2.0 Å, the calculation is equivalent to a monomolecular calculation using TF embedding and no projection operator. At 6.0 Å, all of the calculations are performed in the supermolecular basis, and the projection operator is used exclusively with no approximate functionals. (b) The corresponding CCSD(T)-in-HF/cc-pVDZ results. (c) The corresponding HF-in-HF/aug-cc-pVDZ results. (d) The corresponding CCSD(T)-in-HF/aug-cc-pVDZ results.

The Gly-Gly-Gly-Gly tetrapeptide, with the set of active atoms consisting of the Gly2 residue (solid red box). Each of the dashed boxes indicates the union of the sets of active and border atoms for the corresponding value of n_{t}; any atoms outside of the boxes are included in the set of distant atoms.

(a) τ-dependence of the number of projected orbitals within MP2-in-HF/aug-cc-pVDZ truncated embedding calculations on the Gly-Gly-Gly-Gly tetrapeptide with n_{t} = 3. The choice of active and border atoms is indicated in Fig. 5. (b) μ′-dependence of the truncation error of this calculation for several values of τ.

(a) Convergence of the truncation error of embedding calculations on the Gly-Gly-Gly-Gly tetrapeptide using the cc-pVDZ basis set and several values of n_{t}. In each curve, the set of active atoms corresponds to the indicated residue. For n_{t} = 9, there are no distant atoms in any of the calculations. (b) The corresponding calculation using the aug-cc-pVDZ basis set. The inset shows the same results on a larger scale.

(a) Energies of water hexamer conformations obtained using both CCSD(T) over the full system and CCSD(T)-in-HF supermolecular EMBE2 calculations. Three different basis sets are employed, with the cc-pVDZ and aug-cc-pVDZ basis sets abbreviated as VDZ and AVDZ, respectively. Conformation energies are reported with respect to the corresponding calculation for Conf. 11. (b) Error in the energy of the EMBE2 calculations.

Energies of water hexamer conformations obtained using both CCSD(T)-in-HF supermolecular EMBE2 calculations and CCSD(T)-in-HF truncated EMBE2 calculations. The embedding calculations employ truncated embedding with a border atom cutoff of either R_{O-O} = 0 Å or R_{O-O} = 3 Å. Conformation energies are reported with respect to the corresponding calculation for Conf. 11.

Three of the Gly-Gly-Gly (GGG) tripeptide conformations are presented on the left for several different dihedral angles. The geometries of the Val-Pro-Leu (VPL) and Tyr-Pro-Tyr (YPY) tripeptides are presented on the right.

(a) Gly-Gly-Gly tripeptide conformation energies obtained using MP2-in-HF EMBE2 calculations and employing the cc-pVDZ basis. Conformation energies are reported with respect to the corresponding calculation for the Ω = 180° conformation. The results using n_{t} = 4 are not shown for this basis set, as they are nearly indistinguishable from the supermolecular results. (b) The corresponding results employing the aug-cc-pVDZ basis. The results using n_{t} = 1 are not shown for this basis set, as they are highly inaccurate.
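
The captions above refer to active, border, and distant atoms and to a border atom cutoff. Since Sec. III B is not reproduced here, the following Python sketch only illustrates one plausible way such a partition could be made: atoms within a distance cutoff of any active atom are labelled border, and all remaining atoms are labelled distant. The function name, the minimum-distance criterion, and the toy coordinates are assumptions; the paper's actual definition may differ.

```python
import numpy as np


def classify_atoms(coords, active, cutoff):
    """Hypothetical partition of atom indices into (active, border, distant).

    An atom is 'border' if its distance to the nearest active atom is at most
    `cutoff` (same length unit as `coords`); otherwise it is 'distant'.
    """
    coords = np.asarray(coords, dtype=float)
    active = sorted(set(active))
    active_set = set(active)
    border, distant = [], []
    for i in range(len(coords)):
        if i in active_set:
            continue
        # Distance from atom i to the closest active atom.
        d_min = np.min(np.linalg.norm(coords[active] - coords[i], axis=1))
        if d_min <= cutoff:
            border.append(i)
        else:
            distant.append(i)
    return active, border, distant


if __name__ == "__main__":
    # Toy geometry (in Angstrom): three points on a line, atom 0 is active.
    toy = [[0.0, 0.0, 0.0], [2.8, 0.0, 0.0], [5.5, 0.0, 0.0]]
    for cutoff in (2.0, 3.0, 6.0):
        print(cutoff, classify_atoms(toy, active=[0], cutoff=cutoff))
```

In this toy geometry a small cutoff leaves no border atoms and a large cutoff leaves no distant atoms, which mirrors the two limits described in the captions: a monomolecular basis at one end and the supermolecular basis at the other.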

## Tables

Summary of the EMBE2 results for the water hexamer test set. Results are listed using truncated embedding with a cutoff of R_{O-O} = 0 Å, truncated embedding with a cutoff of R_{O-O} = 3 Å, and supermolecular embedding. All values are in kcal/mol.

Summary of the EMBE2 results for the Gly-Gly-Gly tripeptide. All calculations use either the cc-pVDZ (VDZ) basis set or the aug-cc-pVDZ (AVDZ) basis set. Results are provided for several values of n_{t}, as well as for the supermolecular basis set (Super.). Both the mean unsigned MBE error over all values of Ω and the standard deviation of the MBE error are provided. All values are reported in kcal/mol.

The MBE error (Eq. (24)) for the Val-Pro-Leu tripeptide, the Tyr-Pro-Tyr tripeptide, and the Gly-Gly-Gly-Gly tetrapeptide EMBE2 calculations. All calculations use either the cc-pVDZ (VDZ) basis set or the aug-cc-pVDZ (AVDZ) basis set. Results are provided for several values of n_{t}, as well as for the supermolecular basis set (Super.). All values are reported in kcal/mol.
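
The EMBE2 results summarized in these tables rest on a many-body expansion truncated at second order. As a hedged illustration, the Python sketch below evaluates the generic two-body expansion, in which the total energy is approximated by the sum of monomer energies plus pairwise corrections. In an embedded variant the monomer and dimer energies would come from embedded calculations; the paper's exact expressions, including Eq. (24), are not reproduced in this excerpt, so the function name and the toy energies are assumptions.

```python
from itertools import combinations


def mbe2_energy(monomer_energies, dimer_energies):
    """Generic two-body many-body expansion (illustrative, not the paper's code).

    monomer_energies: dict mapping fragment label -> energy E_i
    dimer_energies:   dict mapping (i, j) with i < j -> energy E_ij
    Returns sum_i E_i + sum_{i<j} (E_ij - E_i - E_j).
    """
    total = sum(monomer_energies.values())
    for i, j in combinations(sorted(monomer_energies), 2):
        # Pairwise correction: dimer energy minus its two monomer energies.
        total += dimer_energies[(i, j)] - monomer_energies[i] - monomer_energies[j]
    return total


if __name__ == "__main__":
    # Toy numbers in arbitrary energy units, chosen only to exercise the formula.
    monomers = {1: -1.0, 2: -2.0, 3: -3.0}
    dimers = {(1, 2): -3.1, (1, 3): -4.2, (2, 3): -5.3}
    print(mbe2_energy(monomers, dimers))
```

The number of dimer calculations grows quadratically with the number of fragments, which is one reason that reducing the cost of each embedded subsystem calculation matters for this kind of expansion.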
