^{1}, Eric H. Knoll^{1,a)} and Yixiang Cao^{2}

### Abstract

This paper describes an empirical localized orbital correction model that improves the accuracy of density functional theory (DFT) methods for predicting thermochemical properties of molecules of first- and second-row elements. The B3LYP localized orbital correction version of the model (B3LYP-LOC) substantially reduces the mean absolute deviation (MAD) from experiment of B3LYP DFT atomization energies on the G3 data set. The almost complete elimination of large outliers and the substantial reduction in MAD yield overall results comparable to the G3 wave-function-based method; furthermore, the new model has zero additional computational cost beyond standard DFT calculations. The following four classes of correction parameters are applied to a molecule based on standard valence bond assignments: corrections to atoms, corrections to individual bonds, corrections for neighboring bonds of a given bond, and radical environmental corrections. Although the model is heuristic and is based on a 22-parameter multiple linear regression to experimental errors, each of the parameters is justified on physical grounds, and each provides insight into the fundamental limitations of DFT, most importantly the failure of current DFT methods to accurately account for nondynamical electron correlation.
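The structure described in the abstract, a set of per-molecule parameter counts fitted by multiple linear regression to the uncorrected errors, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the five parameter names are placeholders (the actual model has 22), the sign convention (error = calculated minus experimental) is assumed, and the data are synthetic.

```python
import numpy as np

# Hypothetical parameter labels loosely following the four correction classes
# named in the abstract (atoms, bonds, neighboring-bond environment, radicals).
# The real model has 22 parameters; these names are placeholders.
PARAM_NAMES = ["atom_hybridization", "single_bond", "double_bond",
               "neighbor_environment", "radical_environment"]


def fit_corrections(counts, errors):
    """Fit one correction value per parameter by linear least squares.

    counts: (n_molecules, n_params); entry [i, k] is how many times
            parameter k applies to molecule i under its valence bond assignment.
    errors: (n_molecules,) uncorrected errors, taken here as calculated minus
            experimental (an assumed sign convention).
    """
    counts = np.asarray(counts, dtype=float)
    errors = np.asarray(errors, dtype=float)
    coeffs, *_ = np.linalg.lstsq(counts, errors, rcond=None)
    return coeffs


def apply_corrections(counts, errors, coeffs):
    """Return residual errors after subtracting the summed corrections."""
    counts = np.asarray(counts, dtype=float)
    return np.asarray(errors, dtype=float) - counts @ coeffs


def mean_absolute_deviation(errors):
    """Mean absolute deviation (MAD) of a set of errors."""
    return float(np.mean(np.abs(errors)))


# Synthetic demonstration data (not taken from the G2 or G3 sets).
rng = np.random.default_rng(0)
demo_counts = rng.integers(0, 4, size=(40, len(PARAM_NAMES)))
demo_true = np.array([0.5, -1.0, 2.0, 0.3, -0.7])
demo_errors = demo_counts @ demo_true + rng.normal(0.0, 0.1, size=40)

demo_coeffs = fit_corrections(demo_counts, demo_errors)
demo_residuals = apply_corrections(demo_counts, demo_errors, demo_coeffs)
print("MAD before:", mean_absolute_deviation(demo_errors))
print("MAD after: ", mean_absolute_deviation(demo_residuals))
```

Once the parameters are fitted, correcting a new molecule is a single dot product between its parameter counts and the fitted values, which is consistent with the abstract's statement that the model adds no computational cost beyond the DFT calculation itself.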

This work was supported in part by grants to one of the authors (R.A.F.) from the NIH (GM-40526) and the DOE (DE-FG02 ER 14162). The authors thank Sason Shaik for his suggestions concerning the unusual behavior of fluorine containing bonds.

I. OVERVIEW

II. ANALYSIS OF HYBRID DFT METHODS

A. Historical overview

B. Localized orbital corrections for hybrid DFT

III. DETAILED SPECIFICATION OF THE LOCALIZED ORBITAL CORRECTION (LOC) MODEL

A. Overview

B. Valence bond assignments

C. Corrections for atomic hybridization states

D. Bond corrections

1. Bonds between H and other atoms

2. Bonds between F and all atoms other than H

3. Bonds between atoms other than F and H

4. Correlation of the base single bond corrections with a metric comparing orbital size and bond length

E. Environmental correction terms

F. Summary and analysis of model parameters

IV. RESULTS AND DISCUSSION

V. CONCLUSION

### Key Topics

- Chemical bonds
- Density functional theory
- Electron correlation calculations
- Exchange correlation functionals
- Chemical thermodynamics

## Figures

Valence bond structures and hybridization assignments for “difficult” molecules found in the G3 data set. Also included are examples of charge transfer and octet expansion.

Plot of the descriptor (the second moment minus the square of one-half of the bond length) versus the single-bond parameter from the B3LYP-LOC model. The linear regression excludes the four data points for fluorine-containing bonds. Each data point represents an average over single bonds of the given type from a few molecules in the G2 data set. The legend indicates the single-bond type and its abbreviation in the LOC model.
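The descriptor in this caption and the regression that excludes fluorine-containing bonds can be sketched as below. The bond labels, numerical values, and the choice of the single-bond parameter as the independent variable are illustrative assumptions; only the form of the descriptor comes from the caption.

```python
import numpy as np


def size_descriptor(second_moment, bond_length):
    """Second moment of the localized orbital minus (bond_length / 2) squared.

    Both terms must be in consistent units (length squared).
    """
    return second_moment - (0.5 * bond_length) ** 2


def fit_excluding_fluorine(bond_types, parameters, descriptors):
    """Fit descriptor = slope * parameter + intercept, skipping F-containing bonds.

    bond_types: labels such as "C-C" or "C-F" (hypothetical label format).
    """
    keep = np.array(["F" not in label.split("-") for label in bond_types])
    x = np.asarray(parameters, dtype=float)[keep]
    y = np.asarray(descriptors, dtype=float)[keep]
    slope, intercept = np.polyfit(x, y, 1)
    return slope, intercept
```

Splitting each label on the hyphen, rather than searching for the letter "F" anywhere in the string, avoids accidentally excluding bonds to other elements whose symbols contain that letter.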

Deviation of calculated B3LYP enthalpy of formation from experiment (kcal/mol). Significant improvement results from adding the environmental correction parameter (ESBC) to the B3LYP-LOC model for chlorinated methanes.

All values in kcal/mol. The dramatic improvement over the B3LYP deviation for the full G2 and G3 data sets (top histogram) is shown both for the linear fit trained only on the extended G2 set (second from top) and for the fit to the full data set (middle). The bottom two histograms show the negligible improvement of the G3-LOC model relative to G3 without the model. Comparing the B3LYP-LOC histograms with those for G3 and G3-LOC shows that the B3LYP-LOC method achieves better than G3 accuracy.

All numbers in kcal/mol. The left two scatter plots show that the G3-LOC model does not significantly improve the G3 error. While G3-LOC has a low adjusted $R^2$ value, the corresponding value for B3LYP-LOC and the scatter plots on the right show the statistical significance of the B3LYP-LOC model. All plots are for a multiple linear regression on the full G3 data set.
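Assuming the "adjusted value" in this caption is the adjusted coefficient of determination, the textbook form for a regression with an intercept is given below, where $n$ is the number of molecules and $p$ the number of fitted parameters. Which convention the authors used (for example, for a fit without an intercept) is not stated here.

$$
\bar{R}^2 = 1 - \left(1 - R^2\right)\frac{n - 1}{n - p - 1},
\qquad
R^2 = 1 - \frac{\sum_i \left(y_i - \hat{y}_i\right)^2}{\sum_i \left(y_i - \bar{y}\right)^2}
$$

Because the penalty factor grows with $p$, a model whose parameters explain little of the variance can have a low adjusted value even when its raw $R^2$ is positive.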

## Tables

A comparison of the second moment of the Boys localized orbital and the square of one-half the bond length for several molecules with C–H or C–C bonds.

Two of the four types of B3LYP atomization energy correction factors for the localized orbital correction (LOC) model are listed in this table. Only the types of corrections for atoms, bonds, and bonding environments appearing in the G2 and G3 data sets are covered. Two correction values, both in kcal/mol, are listed. Values of “---” were specifically set to 0 and not used in the linear regression. A positive correction factor implies that B3LYP predicts the relevant state to be overbound, relative to the experimental data, by the value in the table. Part (a) of the table shows correction factors for atoms as a function of hybridization state. Part (b) shows base correction parameters for various types of bonds. The description briefly specifies the particular bonds belonging to each correction factor. The generic atom symbols in the descriptions refer to any atom other than H or F. The other two types of correction factors (environmental corrections for bonds and radicals) and their rules of application are given in the text and a subsequent table.

Two of the four types of B3LYP atomization energy correction factors for the localized orbital correction (LOC) model are listed in this table. Two correction values are identical to those described in Table II. Atomic hybridization corrections [part (a)] and bond corrections [part (b)] are shown in Table II. The generic atom symbols refer to any first- or second-row atom other than H or F. The rules of application for each of the factors are described in the third column.

Results for the extended G2 test set using the correction parameters defined in Tables II and III. A total of is included in the test set. Molecule 64 was discarded due to a probable inaccuracy in the experimental value. All values are in kcal/mol.

Column 4 contains the uncorrected B3LYP error, while column 5 contains the results for the G3 data set used as a test set for the B3LYP-LOC model trained only on the extended G2 data in Table IV. Column 6 shows the results of B3LYP-LOC refit to the entire G2 and G3 data sets. Columns 7 and 8 show the G3 error and the G3-LOC error when trained on the entire data set.

Summary of LOC corrections for other functionals in addition to B3LYP. The data for each of these models are limited to the extended G2 set. All numbers are in kcal/mol.

Performance and number of empirical parameters of the B3LYP-LOC method compared with the G3 method and a few DFT functionals. MAD (G2) refers to the mean absolute deviation (kcal/mol) for the atomization energies in the extended G2 data set (except as noted for B97), while MAD (G3) is for the G3 data set. Listed in order of decreasing number of parameters.

Effects of removing various parameters from the B3LYP-LOC fit over the entire G3 data set. A separate linear regression was done for each parameter removal listed below. MAD and RMSD values are in kcal/mol. The Outliers column gives the number of molecules whose absolute deviations exceed the outlier cutoff.
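The procedure this caption describes, refitting the regression with one parameter removed at a time and recording MAD, RMSD, and the outlier count, can be sketched as follows. This is an illustrative sketch only: the `threshold` argument is a hypothetical stand-in for the outlier cutoff, and the design matrix of parameter counts is assumed to be available.

```python
import numpy as np


def fit_stats(counts, errors, threshold):
    """Least-squares fit followed by summary statistics of the residuals."""
    counts = np.asarray(counts, dtype=float)
    errors = np.asarray(errors, dtype=float)
    coeffs, *_ = np.linalg.lstsq(counts, errors, rcond=None)
    residuals = errors - counts @ coeffs
    return {
        "mad": float(np.mean(np.abs(residuals))),
        "rmsd": float(np.sqrt(np.mean(residuals ** 2))),
        "outliers": int(np.sum(np.abs(residuals) > threshold)),
    }


def parameter_removal_study(counts, errors, names, threshold):
    """Refit with each parameter (column) removed in turn.

    Returns a dict keyed by "full" and by each removed parameter's name.
    """
    counts = np.asarray(counts, dtype=float)
    results = {"full": fit_stats(counts, errors, threshold)}
    for k, name in enumerate(names):
        reduced = np.delete(counts, k, axis=1)  # drop column k, keep the rest
        results[name] = fit_stats(reduced, errors, threshold)
    return results
```

Since each reduced model is nested inside the full one, its least-squares RMSD can never be lower than the full fit's; the size of the increase indicates how much that parameter contributes.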
