^{1}, Matthias Beer^{1}, Daniel S. Lambrecht^{1,a)} and Christian Ochsenfeld^{1,b)}

### Abstract

We present a linear-scaling symmetry-adapted perturbation theory (SAPT) method that is based on an atomic orbital (AO) formulation of zeroth-order SAPT (SAPT0). The non-dispersive terms are evaluated with linear-scaling cost using the continuous fast multipole method (CFMM) and the linear exchange (LinK) approach for integral contractions, together with our efficient Laplace-based coupled-perturbed self-consistent field method (DL-CPSCF) for evaluating response densities. The reformulation of the dispersion term is based on our linear-scaling AO Møller-Plesset second-order perturbation theory (AO-MP2) method, which uses our recently introduced QQR-type screening [S. A. Maurer, D. S. Lambrecht, J. Kussmann, and C. Ochsenfeld, J. Chem. Phys. 138, 014101 (2013)] for preselecting numerically significant energy contributions. Similar to scaled opposite-spin MP2, we neglect the exchange-dispersion term in SAPT and introduce a scaling factor for the dispersion term, which compensates for the resulting error and at the same time accounts for basis set incompleteness effects and intramonomer correlation. We show in extensive benchmark calculations that the new scaled-dispersion (sd-)SAPT0 approach provides reliable results for small and large interacting systems, where the results with the small 6-31G** basis are roughly comparable to supermolecular MP2 calculations in a triple-zeta basis. The performance of our method is demonstrated with timings on cellulose fragments, DNA systems, and cutouts of a protein-ligand complex with up to 1100 atoms on a single computer core.

C.O. acknowledges financial support by the Volkswagen Stiftung within the funding initiative “New Conceptual Approaches to Modeling and Simulation of Complex Systems,” by the SFB 749 “Dynamik und Intermediate molekularer Transformationen” (DFG, Project C7), and the DFG cluster of excellence EXC 114 “Center for Integrative Protein Science Munich” (CIPSM).

I. INTRODUCTION

II. LINEAR-SCALING SCALED-DISPERSION SAPT0

III. COMPUTATIONAL DETAILS

IV. RESULTS

A. S66/S66x8 benchmarks

B. Protein-ligand interactions

V. TIMINGS

VI. CONCLUSION

### Key Topics

- Polymers
- Basis sets
- Perturbation theory
- DNA
- Protein ligand interactions

## Figures

CPU times for sd-SAPT0 calculations on DNA systems with the 6-31G** basis. The calculation times for the SCF calculations on the monomers, the non-dispersive terms (Eq. (2)), and the dispersion term (Eq. (5)) are given.

## Tables

Optimized values of the scaling factor for the dispersion term and the corresponding root-mean-square deviation (RMSD) from the reference values in kcal/mol for the S22 training set.
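
The fitting of a single dispersion scaling factor against a training set can be illustrated with a minimal sketch. It assumes the interaction energy is modeled as the non-dispersive part plus a scaled dispersion part and that the factor is chosen by least squares, which minimizes the RMSD; the actual optimization procedure of the paper is not given here. The function names and all energies below are hypothetical placeholders, not S22 data.

```python
import math


def fit_dispersion_scaling(e_nondisp, e_disp, e_ref):
    """Closed-form least-squares factor c for E_int = e_nondisp + c * e_disp.

    Setting d/dc sum_i (n_i + c*d_i - r_i)^2 = 0 gives
    c = sum_i d_i * (r_i - n_i) / sum_i d_i^2.
    """
    num = sum(d * (r - n) for n, d, r in zip(e_nondisp, e_disp, e_ref))
    den = sum(d * d for d in e_disp)
    return num / den


def rmsd(e_nondisp, e_disp, e_ref, c):
    """Root-mean-square deviation of the scaled model from the references."""
    sq = [(n + c * d - r) ** 2 for n, d, r in zip(e_nondisp, e_disp, e_ref)]
    return math.sqrt(sum(sq) / len(sq))


# Hypothetical energies in kcal/mol (illustration only, not benchmark data).
e_nondisp = [-3.0, 1.5, -0.5]   # non-dispersive SAPT0 terms
e_disp = [-2.0, -4.0, -1.0]     # unscaled dispersion terms
e_ref = [-4.6, -1.7, -1.3]      # reference interaction energies

c_opt = fit_dispersion_scaling(e_nondisp, e_disp, e_ref)
fit_rmsd = rmsd(e_nondisp, e_disp, e_ref, c_opt)
```

Because the model is linear in the scaling factor, the minimum is available in closed form and no iterative optimizer is needed in this sketch.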

Root-mean-square deviation (RMSD), mean absolute deviation (MAD), and maximum deviation (MAX) from the reference results for the S66 test set in kcal/mol.
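
The three error measures named in this caption can be computed as in the following minimal sketch. It assumes that MAD denotes the mean of the absolute errors relative to the reference and that MAX is the largest absolute error; the function name and the numbers are hypothetical and are not taken from the S66 results.

```python
import math


def error_statistics(computed, reference):
    """Return (RMSD, MAD, MAX) of computed values against reference values."""
    errors = [c - r for c, r in zip(computed, reference)]
    n = len(errors)
    rmsd = math.sqrt(sum(e * e for e in errors) / n)   # root-mean-square deviation
    mad = sum(abs(e) for e in errors) / n              # mean absolute deviation
    max_dev = max(abs(e) for e in errors)              # largest absolute deviation
    return rmsd, mad, max_dev


# Hypothetical interaction energies in kcal/mol (illustration only).
computed = [-2.0, -2.5, -6.0]
reference = [-5.0, -2.5, -2.0]

stats = error_statistics(computed, reference)
```

RMSD weights large outliers more strongly than MAD, so reporting both together with MAX gives a fuller picture of the error distribution.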

Root-mean-square deviation (RMSD), mean absolute deviation (MAD), and maximum deviation (MAX) from the reference results for the S66x8 test set in kcal/mol. The RMSD values of the MP2 and SCS(MI)-MP2 CBS results are taken from Ref. 47.

Interaction energies ΔE_{PL} and errors compared to the protein-ligand interaction test set of Antony et al.^{50} Only those systems are listed for which reference values are available (see the supplementary material^{51} for the full table). The basis set used for sd-SAPT0 is 6-31G**, while the RI-MP2 and SCS(MI)-MP2 results both use the much larger cc-pVTZ basis within the RI approximation and include counterpoise correction.

CPU times in hours for sd-SAPT0 calculations on two cellulose systems with the 6-31G** basis. The interaction between a small 2-unit fragment of one strand and either a 12-unit or a 20-unit fragment of an adjacent strand is calculated and compared. The number of electrons and atoms is given for the large fragment. Timings for the SCF calculations on the monomers, the non-dispersive SAPT terms (Eq. (2)), and the dispersion contribution (Eq. (5)), as well as the scaling with respect to the number of electrons, are given. The 2-unit cellulose block is located abreast of one end of the adjacent fragment, so that the computational cost of the sd-SAPT0 calculation scales sublinearly with the size of the larger fragment. The SCF calculation of the larger fragment scales linearly with its size, while the SCF time for the 2-unit fragment increases due to the increasing size of the dimer-centered basis.
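
A scaling exponent with respect to the number of electrons can be estimated from two timings, as in the minimal sketch below. It assumes the conventional two-point estimate for a power law t ∝ N^x, which may differ from how the values in the table were obtained; the function name and all numbers are hypothetical and are not timings from the paper.

```python
import math


def scaling_exponent(n_small, t_small, n_large, t_large):
    """Two-point estimate of x in t ~ N^x: x = ln(t2/t1) / ln(N2/N1)."""
    return math.log(t_large / t_small) / math.log(n_large / n_small)


# Hypothetical (electrons, CPU hours) pairs for illustration only.
x_linear = scaling_exponent(1000, 2.0, 2000, 4.0)      # time doubles with size
x_cubic = scaling_exponent(1000, 2.0, 2000, 16.0)      # time grows eightfold
x_sublinear = scaling_exponent(1000, 2.0, 2000, 3.0)   # time grows by 1.5x
```

An exponent near 1 indicates linear scaling, and a value below 1 corresponds to the sublinear behavior described for the fixed small fragment interacting with a growing neighbor.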

CPU times in hours for sd-SAPT0 calculations on the repair enzyme MutM in complex with an 8-oxoG lesion. The basis set is 6-31G**. The number of electrons and atoms is given for the enzyme cutout. Timings for the SCF calculations on the monomers, the non-dispersive SAPT terms (Eq. (2)), and the dispersion contribution (Eq. (5)), as well as the scaling with respect to the number of electrons, are given. The increasing cost of the SCF calculation on the 8-oxoG lesion is due to the increasing size of the dimer-centered basis.
