^{1,a)}and Toby W. Allen

^{2,b)}

### Abstract

Free energy perturbation, a method for computing the free energy difference between two states, is often combined with non-Boltzmann biased sampling techniques in order to accelerate the convergence of free energy calculations. Here we present a new extension of the Bennett acceptance ratio (BAR) method by combining it with umbrella sampling (US) along a reaction coordinate in configurational space. In this approach, which we call Bennett acceptance ratio with umbrella sampling (BAR-US), the conditional histogram of energy difference (a mapping of the 3N-dimensional configurational space via a reaction coordinate onto 1D energy difference space) is weighted for marginalization with the associated population density along a reaction coordinate computed by US. This procedure produces marginal histograms of energy difference, from forward and backward simulations, with higher overlap in energy difference space, rendering free energy difference estimations using BAR statistically more reliable. In addition to BAR-US, two histogram analysis methods, termed Bennett overlapping histograms with US (BOH-US) and Bennett-Hummer (linear) least square with US (BHLS-US), are employed as consistency and convergence checks for free energy difference estimation by BAR-US. The proposed methods (BAR-US, BOH-US, and BHLS-US) are applied to a 1-dimensional asymmetric model potential, as has been used previously to test free energy calculations from non-equilibrium processes. We then consider the more stringent test of a 1-dimensional strongly (but linearly) shifted harmonic oscillator, which exhibits no overlap between two states when sampled using unbiased Brownian dynamics. We find that the efficiency of the proposed methods is enhanced over the original Bennett's methods (BAR, BOH, and BHLS) through fast uniform sampling of energy difference space via US in configurational space. We apply the proposed methods to the calculation of the electrostatic contribution to the absolute solvation free energy (excess chemical potential) of water. We then address the controversial issue of ion selectivity in the K^{+}ion channel, KcsA. We have calculated the relative binding affinity of K^{+} over Na^{+} within a binding site of the KcsA channel for which different, though adjacent, K^{+} and Na^{+} configurations exist, ideally suited to these US-enhanced methods. Our studies demonstrate that the significant improvements in free energy calculations obtained using the proposed methods can have serious consequences for elucidating biological mechanisms and for the interpretation of experimental data.

I. Kim appreciates Professor Alexei Stuchebrukhov for his introduction to Jarzynski work-fluctuation theorem and to the shifted harmonic oscillator model used in this paper. We acknowledge National Science Foundation (NSF) Award Nos. MCB-0546768 and MCB-1052477 for funding.

I. INTRODUCTION

II. METHODS

A. Umbrella sampling

B. Bennett acceptance ratio and histogram analysis enhanced by US

III. NUMERICAL TESTS

A. Hummer 1D asymmetric potential

B. A strongly (but linearly) shifted harmonic oscillator

IV. APPLICATIONS

A. The electrostatic contribution to the excess chemical potential of water

B. Ionic selectivity of the S2 site in the KcsA potassium channel

V. CONCLUSION

### Key Topics

- Free energy
- 120.0
- Sodium
- 33.0
- Binding sites
- 12.0
- Chemical potential
- 10.0
- Potassium
- 7.0

## Figures

(a) The histograms (or unnormalized densities) of energy difference collected by sampling from initial (λ = 0) and final (λ = 1) states with no overlap between them. This illustration was taken from the model of a strongly shifted harmonic oscillator, sampled from unbiased Brown dynamics (100 ns for each state), in Sec. III B. (b) The corresponding marginal histograms of energy difference with increased overlap enhanced by US along a reaction coordinate (Q) from the proposed methods resulting in improved statistical reliability for the estimation of free energy difference. The total simulation time was 25 ns for each state (1 ns per umbrella window).

(a) The histograms (or unnormalized densities) of energy difference collected by sampling from initial (λ = 0) and final (λ = 1) states with no overlap between them. This illustration was taken from the model of a strongly shifted harmonic oscillator, sampled from unbiased Brown dynamics (100 ns for each state), in Sec. III B. (b) The corresponding marginal histograms of energy difference with increased overlap enhanced by US along a reaction coordinate (Q) from the proposed methods resulting in improved statistical reliability for the estimation of free energy difference. The total simulation time was 25 ns for each state (1 ns per umbrella window).

(a) Hummer asymmetric 1D model potential with several λ states. (b) Free energy difference (see Eq. (39)) is used to adjust the relative off-set of two PMFs from the initial (λ = 0) and the final (λ = 1) states.

(a) Hummer asymmetric 1D model potential with several λ states. (b) Free energy difference (see Eq. (39)) is used to adjust the relative off-set of two PMFs from the initial (λ = 0) and the final (λ = 1) states.

The distributions of free energies over time for the Hummer asymmetric potential by (a) BAR, (b) BOH, (c) BHLS, (d) BAR-US, (e) BOH-US, and (f) BHLS-US. Each distribution is obtained from 1000 independent runs, whose average free energy is shown in parentheses after the total simulation time. The arrow in figures indicates the numerical solution (∼−6.6 *k* _{B} *T*). Notice the difference of the y-axis scale between unbiased estimators (top) and US-enhanced estimators (bottom). In the absence of overlap between two histograms of energy difference, the free energies for BOH and BHLS are duplicated from those of BAR.

The distributions of free energies over time for the Hummer asymmetric potential by (a) BAR, (b) BOH, (c) BHLS, (d) BAR-US, (e) BOH-US, and (f) BHLS-US. Each distribution is obtained from 1000 independent runs, whose average free energy is shown in parentheses after the total simulation time. The arrow in figures indicates the numerical solution (∼−6.6 *k* _{B} *T*). Notice the difference of the y-axis scale between unbiased estimators (top) and US-enhanced estimators (bottom). In the absence of overlap between two histograms of energy difference, the free energies for BOH and BHLS are duplicated from those of BAR.

The convergence of the free energy for the Hummer asymmetric potential by (a) BAR and (b) BAR-US. In the BAR-US method (b), the filled squares represent total simulation times, each of which corresponds 0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5 ns per umbrella window, respectively (17 umbrella windows for each simulation in total). The insets show the number of data (using a 0.01 *k* _{B} *T* bin) in the overlap region for (a) BAR and (b) BAR-US.

The convergence of the free energy for the Hummer asymmetric potential by (a) BAR and (b) BAR-US. In the BAR-US method (b), the filled squares represent total simulation times, each of which corresponds 0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5 ns per umbrella window, respectively (17 umbrella windows for each simulation in total). The insets show the number of data (using a 0.01 *k* _{B} *T* bin) in the overlap region for (a) BAR and (b) BAR-US.

Bennett overlapping histograms (BOH) for the Hummer asymmetric potential (top), and US-enhanced BOH (BOH-US) (bottom), where , and βΔ*F* = *P* _{1} − *P* _{0}. (a) 2 ns, (b) 3 ns, (c) 8 ns, (d) 0.085 ns (0.005 ns per umbrella window), (e) 0.85 ns (0.05 ns per umbrella window), and (f) 4.25 ns (0.25 per umbrella window).

Bennett overlapping histograms (BOH) for the Hummer asymmetric potential (top), and US-enhanced BOH (BOH-US) (bottom), where , and βΔ*F* = *P* _{1} − *P* _{0}. (a) 2 ns, (b) 3 ns, (c) 8 ns, (d) 0.085 ns (0.005 ns per umbrella window), (e) 0.85 ns (0.05 ns per umbrella window), and (f) 4.25 ns (0.25 per umbrella window).

Bennett-Hummer (linear) least square (BHLS) for the Hummer asymmetric potential with a total simulation time of (a) 2 ns, (b) 3 ns, and (c) 8 ns. The corresponding BHLS-US with a total simulation time of (d) 0.085 ns (0.005 ns per umbrella window), (e) 0.85 ns (0.05 ns per umbrella window), and (f) 4.25 ns (0.25 ns per umbrella window). The linear least square fit of Eq. (24) produces a linear line whose slope is β and y-intersection is −βΔ*F*.

Bennett-Hummer (linear) least square (BHLS) for the Hummer asymmetric potential with a total simulation time of (a) 2 ns, (b) 3 ns, and (c) 8 ns. The corresponding BHLS-US with a total simulation time of (d) 0.085 ns (0.005 ns per umbrella window), (e) 0.85 ns (0.05 ns per umbrella window), and (f) 4.25 ns (0.25 ns per umbrella window). The linear least square fit of Eq. (24) produces a linear line whose slope is β and y-intersection is −βΔ*F*.

(a) A strongly (but linearly) shifted harmonic oscillator of the initial (λ = 0) and final (λ = 1) state. (b) BOH-US with a total simulation time of 25 ns (1 ns per umbrella window). The average free energy difference from the values of overlap region, Δ*F* (dashed line), is in excellent agreement with the analytic free energy difference 72 *k* _{B} *T*.

(a) A strongly (but linearly) shifted harmonic oscillator of the initial (λ = 0) and final (λ = 1) state. (b) BOH-US with a total simulation time of 25 ns (1 ns per umbrella window). The average free energy difference from the values of overlap region, Δ*F* (dashed line), is in excellent agreement with the analytic free energy difference 72 *k* _{B} *T*.

(a) BOH-US for the model of a strongly shifted harmonic oscillator, where , , βΔ*F* = *P* _{1} − *P* _{0} are compared with the corresponding analytical expressions; *P* _{0} (pink), *P* _{1} (light blue), and Δ*F* (dashed line) are the analytical expressions, respectively. (b) BHLS-US for the same model.

(a) BOH-US for the model of a strongly shifted harmonic oscillator, where , , βΔ*F* = *P* _{1} − *P* _{0} are compared with the corresponding analytical expressions; *P* _{0} (pink), *P* _{1} (light blue), and Δ*F* (dashed line) are the analytical expressions, respectively. (b) BHLS-US for the same model.

The convergence profile of excess chemical potential for uncharging a water molecule in bulk water from (a) BAR and (b) BAR-US.

The convergence profile of excess chemical potential for uncharging a water molecule in bulk water from (a) BAR and (b) BAR-US.

Free energy difference from BOH-US (top) and BHLS-US (bottom) for the electrostatic contribution to the chemical potential of water. The total simulation time is 0.1 ns for (a) and (d), 0.5 ns for (b) and (e), and 3 ns for (c) and (f).

Free energy difference from BOH-US (top) and BHLS-US (bottom) for the electrostatic contribution to the chemical potential of water. The total simulation time is 0.1 ns for (a) and (d), 0.5 ns for (b) and (e), and 3 ns for (c) and (f).

(a) The KcsA potassium ion channel embedded in a hydrated DPPC membrane (gray) for all-atom MD simulations, showing just 2 subunits of the KcsA tetramer (yellow ribbons) for clarity, with the selectivity filter highlighted (sticks) and waters in cavity (red/white sticks), and K^{+} ions located in the pore shown as red spheres. (b) PMF of the initial (K^{+}, red curve) and the final (Na^{+}, blue curve) states across the S2 site from 1D US simulations, along a reaction coordinate (*z*), revealing distinct K^{+} and Na^{+} selective binding sites (sticks). The Na^{+} PMF has been shifted (see Eq. (39)) using a relative free energy difference of ∼0.3 kcal/mol between the S2 site (−18.05 kcal/mol) and bulk water (−18.39 kcal/mol).

(a) The KcsA potassium ion channel embedded in a hydrated DPPC membrane (gray) for all-atom MD simulations, showing just 2 subunits of the KcsA tetramer (yellow ribbons) for clarity, with the selectivity filter highlighted (sticks) and waters in cavity (red/white sticks), and K^{+} ions located in the pore shown as red spheres. (b) PMF of the initial (K^{+}, red curve) and the final (Na^{+}, blue curve) states across the S2 site from 1D US simulations, along a reaction coordinate (*z*), revealing distinct K^{+} and Na^{+} selective binding sites (sticks). The Na^{+} PMF has been shifted (see Eq. (39)) using a relative free energy difference of ∼0.3 kcal/mol between the S2 site (−18.05 kcal/mol) and bulk water (−18.39 kcal/mol).

The convergence profile of the free energy difference in the s2 site.

The convergence profile of the free energy difference in the s2 site.

The free energy difference, Δ*G* _{ Bulk }(K^{+} → Na^{+}), in bulk (top) and the free energy difference, Δ*G* _{ KcsA(S2)}(K^{+} → Na^{+}), across the S2 site in KcsA (bottom), evaluated by BOH-US for (a) and (d), BHLS-US for (b) and (e). (c) and (f) represents the distributions of energy difference histograms where the insets clearly show the overlap of the two histograms, from which the intersection can be used (via Eq. (23)) to compute free energy differences. For clarity, a 0.1 kcal/mol bin is used for all figures. The selectivity (binding free energy difference between Na^{+} and K^{+}) has been calculated relative to bulk water; ΔΔ*G*(K^{+} → Na^{+}) = Δ*G* _{ KcsA(S2)}(K^{+} → Na^{+}) − Δ*G* _{ Bulk }(K^{+} → Na^{+}).

The free energy difference, Δ*G* _{ Bulk }(K^{+} → Na^{+}), in bulk (top) and the free energy difference, Δ*G* _{ KcsA(S2)}(K^{+} → Na^{+}), across the S2 site in KcsA (bottom), evaluated by BOH-US for (a) and (d), BHLS-US for (b) and (e). (c) and (f) represents the distributions of energy difference histograms where the insets clearly show the overlap of the two histograms, from which the intersection can be used (via Eq. (23)) to compute free energy differences. For clarity, a 0.1 kcal/mol bin is used for all figures. The selectivity (binding free energy difference between Na^{+} and K^{+}) has been calculated relative to bulk water; ΔΔ*G*(K^{+} → Na^{+}) = Δ*G* _{ KcsA(S2)}(K^{+} → Na^{+}) − Δ*G* _{ Bulk }(K^{+} → Na^{+}).

## Tables

Free energy differences for the Hummer model potential and the linearly shifted harmonic oscillator numerical tests (in *k* _{B} *T*), and excess chemical potential of water and the thermodynamic selectivity in the S2 site of KcsA applications (in kcal/mol). The errors for the Hummer potential are standard error of means from 1000 independent simulations. Otherwise, errors are evaluated by BAR standard deviations (Eq. (15)) , standard error of means, and correlation coefficients (*R* ^{2}) for BAR(-US), BOH(-US), and BHLS(-US), respectively. The numerical solution for this model is ∼−6.6*k* _{B} *T*.

Free energy differences for the Hummer model potential and the linearly shifted harmonic oscillator numerical tests (in *k* _{B} *T*), and excess chemical potential of water and the thermodynamic selectivity in the S2 site of KcsA applications (in kcal/mol). The errors for the Hummer potential are standard error of means from 1000 independent simulations. Otherwise, errors are evaluated by BAR standard deviations (Eq. (15)) , standard error of means, and correlation coefficients (*R* ^{2}) for BAR(-US), BOH(-US), and BHLS(-US), respectively. The numerical solution for this model is ∼−6.6*k* _{B} *T*.

Comparison of statistical efficiency between unbiased Bennett estimators (BAR, BOH, and BHLS) and US-enhanced Bennett estimators (BAR-US, BOH-US, and BHLS-US) for the Hummer potential with total simulation times for unbiased Bennett estimators and for US-enhanced Bennett estimators. The mean (in *k* _{B} *T*) and standard deviation of free energy distributions over simulation time, shown in Fig. 3, is calculated from 1000 independent simulations. The number in parentheses is standard deviation (error) of each statistics whose value is estimated from the distribution of sample mean, which is obtained by resampling 1000 independent simulations by 1000 times (bootstrapping). The numerical solution for this model is ∼−6.6 *k* _{B} *T*.

Comparison of statistical efficiency between unbiased Bennett estimators (BAR, BOH, and BHLS) and US-enhanced Bennett estimators (BAR-US, BOH-US, and BHLS-US) for the Hummer potential with total simulation times for unbiased Bennett estimators and for US-enhanced Bennett estimators. The mean (in *k* _{B} *T*) and standard deviation of free energy distributions over simulation time, shown in Fig. 3, is calculated from 1000 independent simulations. The number in parentheses is standard deviation (error) of each statistics whose value is estimated from the distribution of sample mean, which is obtained by resampling 1000 independent simulations by 1000 times (bootstrapping). The numerical solution for this model is ∼−6.6 *k* _{B} *T*.

Article metrics loading...

Full text loading...

Commenting has been disabled for this content