^{1}, Illia Horenko

^{2}, Christof Schütte

^{2}and Jeremy C. Smith

^{3}

### Abstract

Molecular dynamics simulation generates large quantities of data that must be interpreted using physically meaningful analysis. A common approach is to describe the system dynamics in terms of transitions between coarse partitions of conformational space. In contrast to previous work that partitions the space according to geometric proximity, the authors examine here clustering based on kinetics, merging configurational microstates together so as to identify long-lived, i.e., dynamically metastable, states. As test systems microsecond molecular dynamics simulations of the polyalanines and are analyzed. Both systems clearly exhibit metastability, with some kinetically distinct metastable states being geometrically very similar. Using the backbone torsion rotamer pattern to define the microstates, a definition is obtained of metastable states whose lifetimes considerably exceed the memory associated with interstate dynamics, thus allowing the kinetics to be described by a Markov model. This model is shown to be valid by comparison of its predictions with the kinetics obtained directly from the molecular dynamics simulations. In contrast, clustering based on the hydrogen-bonding pattern fails to identify long-lived metastable states or a reliable Markov model. Finally, an approach is proposed to generate a hierarchical model of networks, each having a different number of metastable states. The model hierarchy yields a qualitative understanding of the multiple time and length scales in the dynamics of biomolecules.

The authors wish to thank John D. Chodera (UC San Francisco) for carefully and critically reading the manuscript and suggesting many helpful improvements. Furthermore, the authors gratefully acknowledge the funding provided by Center for Modeling and Simulation in the Biosciences, Heidelberg [to one of the authors (F.N.)] and the DFG research program “SFB450: Analysis and Control of ultrafast photoinduced reactions” [to another author (I.H.)].

I. INTRODUCTION

II. METHODS

A. Energy function and system setup

B. Definition of discrete microstates

1. Torsion-rotamer-pattern microstates and independence of coordinates

2. Hydrogen-bonding-pattern microstates and transition state hysteresis

C. Kinetic clustering to obtain metastable states

D. Choosing the number of metastable states

E. Quantitative characterization of metastable states

1. Probability, free energy, and entropy

2. Mean lifetime

F. Testing for the Markov property

G. Hierarchical transition network analysis

III. RESULTS

A. Octaalanine

B. Dodecaalanine

IV. CONCLUSIONS

### Key Topics

- Molecular dynamics
- 30.0
- Markov processes
- 28.0
- Eigenvalues
- 26.0
- Hydrogen bonding
- 26.0
- Cluster analysis
- 25.0

## Figures

(Color) Illustration of the PCCA clustering procedure on a sample potential. (a) Potential in units of , defined over a discrete coordinate with 100 boxes. (b) Transition matrix, , for a Markov chain sampling the potential, using a lag time of steps. Each matrix entry represents the transition probability from cell to cell within time (blue: , red: ). The Markov chain was generated by using a Metropolis Monte Carlo where in each step only jumps to the current and the adjacent boxes were considered. The nearly block-diagonal structure of is apparent. (c) Left eigenvectors of used to identify metastable states. The first eigenvector gives the stationary distribution. The sign structure of the second eigenvector decomposes the state space into two metstable states (thick magenta line). The sign structure of the third eigenvector further splits the right metastable state (thin magenta line), obtaining three metastable states. (d) The eigenvalue spectrum of , indicating how many states are metastable. There are clear gaps after two and three eigenvalues.

(Color) Illustration of the PCCA clustering procedure on a sample potential. (a) Potential in units of , defined over a discrete coordinate with 100 boxes. (b) Transition matrix, , for a Markov chain sampling the potential, using a lag time of steps. Each matrix entry represents the transition probability from cell to cell within time (blue: , red: ). The Markov chain was generated by using a Metropolis Monte Carlo where in each step only jumps to the current and the adjacent boxes were considered. The nearly block-diagonal structure of is apparent. (c) Left eigenvectors of used to identify metastable states. The first eigenvector gives the stationary distribution. The sign structure of the second eigenvector decomposes the state space into two metstable states (thick magenta line). The sign structure of the third eigenvector further splits the right metastable state (thin magenta line), obtaining three metastable states. (d) The eigenvalue spectrum of , indicating how many states are metastable. There are clear gaps after two and three eigenvalues.

Mean transition time for state decompositions using the and dynamics.

Mean transition time for state decompositions using the and dynamics.

Examples of decay curves of metastable state populations. The brown circles and crosses show the fraction of the initial population remaining in a given state for a given time. The selected states are the global energy minima of following metastable state partitions: : ten states for torsion microstates and five states for hydrogen-bond microstates; : eight states for torsion microstates and five states for hydrogen-bond microstates.

Examples of decay curves of metastable state populations. The brown circles and crosses show the fraction of the initial population remaining in a given state for a given time. The selected states are the global energy minima of following metastable state partitions: : ten states for torsion microstates and five states for hydrogen-bond microstates; : eight states for torsion microstates and five states for hydrogen-bond microstates.

(Color) Hierarchical transition network analysis for the peptide based on backbone torsion rotamer microstates. Each bullet and the structure next to it corresponds to one metastable set of backbone torsion rotamer patterns. The bullets contain the state name (a letter), the free energy in kcal/mol (upper number), and the mean lifetime in picoseconds (lower number). Each structure is shown by a few representative tubes and an overlay of 100 examples randomly drawn from the ensemble of structures of each state, shown as line representations of the backbone. A pair of states is connected if at least one transition between these states was observed in the trajectory. The hierarchical relationship between the three networks for , 6, and 10 metastable states is indicated by the dotted arrows. Each arrow starts at the metastable state in the higher-order network which contains the majority of microstates in the state the arrow points to. For example, the microstates of state in the network are split into three substates, , , and , in the network.

(Color) Hierarchical transition network analysis for the peptide based on backbone torsion rotamer microstates. Each bullet and the structure next to it corresponds to one metastable set of backbone torsion rotamer patterns. The bullets contain the state name (a letter), the free energy in kcal/mol (upper number), and the mean lifetime in picoseconds (lower number). Each structure is shown by a few representative tubes and an overlay of 100 examples randomly drawn from the ensemble of structures of each state, shown as line representations of the backbone. A pair of states is connected if at least one transition between these states was observed in the trajectory. The hierarchical relationship between the three networks for , 6, and 10 metastable states is indicated by the dotted arrows. Each arrow starts at the metastable state in the higher-order network which contains the majority of microstates in the state the arrow points to. For example, the microstates of state in the network are split into three substates, , , and , in the network.

Scheme illustrating the concept of a kinetic trap in the absence of a predefined native structure. The free energy minimum is understood as the “native state” while high-energy minima that are separated from the main basins with high barriers are defined as “kinetic traps.” Such traps are rarely visited, but when they are visited, they are stable for a long time.

Scheme illustrating the concept of a kinetic trap in the absence of a predefined native structure. The free energy minimum is understood as the “native state” while high-energy minima that are separated from the main basins with high barriers are defined as “kinetic traps.” Such traps are rarely visited, but when they are visited, they are stable for a long time.

(Color) Hierarchical transition network analysis for the peptide for two, three, and five metastable sets containing hydrogen-bonding pattern microstates. See caption of Fig. 4 for a complete description.

(Color) Hierarchical transition network analysis for the peptide for two, three, and five metastable sets containing hydrogen-bonding pattern microstates. See caption of Fig. 4 for a complete description.

Scheme illustrating why some definitions of microstates are better than others. Set (a) is optimal because, when partitioned into metastable sets, it splits exactly on the transition state. Set (b) is inappropriate because one of its states reaches across the transition state, including states from both basins. Such a definition yields shorter lifetimes because transitions into and out of cell are frequent from both basins, even though no actual transition may occur. This also produces apparent connections between metastable states that are actually not directly connected.

Scheme illustrating why some definitions of microstates are better than others. Set (a) is optimal because, when partitioned into metastable sets, it splits exactly on the transition state. Set (b) is inappropriate because one of its states reaches across the transition state, including states from both basins. Such a definition yields shorter lifetimes because transitions into and out of cell are frequent from both basins, even though no actual transition may occur. This also produces apparent connections between metastable states that are actually not directly connected.

(Color) Hierarchical transition network analysis for the peptide for two, six, and eight metastable sets containing backbone torsion rotamer microstates. See caption of Fig. 4 for a complete description.

(Color) Hierarchical transition network analysis for the peptide for two, six, and eight metastable sets containing backbone torsion rotamer microstates. See caption of Fig. 4 for a complete description.

(Color) Hierarchical transition network analysis for the peptide for two, three, and five metastable sets containing hydrogen-bonding pattern microstates. See caption of Fig. 4 for a complete description.

(Color) Transition network analysis for the peptide for 20 metastable sets containing backbone torsion rotamer microstates. See caption of Fig. 4 for a complete description.

(Color) Transition network analysis for the peptide for 20 metastable sets containing backbone torsion rotamer microstates. See caption of Fig. 4 for a complete description.

(Color) The implied time scales of the processes associated with individual eigenvectors, depending on the lag time , computed as , where is the eigenvalue. The implied time scales for the torsion rotamer states, [(a) and (c)] become flat for most processes at around , which is much below the lifetimes of the metastable states, indicating that the interstate transitions are Markovian. This is not the case for the hydrogen-bond state definition, [(b) and (d)] whose time scales do not converge within .

(Color) The implied time scales of the processes associated with individual eigenvectors, depending on the lag time , computed as , where is the eigenvalue. The implied time scales for the torsion rotamer states, [(a) and (c)] become flat for most processes at around , which is much below the lifetimes of the metastable states, indicating that the interstate transitions are Markovian. This is not the case for the hydrogen-bond state definition, [(b) and (d)] whose time scales do not converge within .

(Color) Comparison of the molecular dynamics simulations (solid lines) with the Markov dynamics produced by the model network of metastable states (dashed lines) for a lag time of . The Markov dynamics is run times, each time initializing a single state with population 1 and the remaining states with population 0. The dashed curves show the relaxation of the selected state’s population towards equilibrium. For comparison, the relaxation dynamics directly computed from the molecular dynamics trajectory are shown. While the Markov model based H-bond definition fails for all states, the one based on the torsion rotamer definition fits well for many states and for all states in the high population range. The mismatch for some states in the low-population range [states and for and states , , , and for are likely to be due to insufficient sampling in the molecular dynamics trajectory (see number of long-time visits in Tables I and III]. The mismatch for states and which are frequently visited indicates the presence of considerable internal energy barriers, which means that these states should be further decomposed into substates (by increasing the number of metastable states in the model) in order to improve the fit.

(Color) Comparison of the molecular dynamics simulations (solid lines) with the Markov dynamics produced by the model network of metastable states (dashed lines) for a lag time of . The Markov dynamics is run times, each time initializing a single state with population 1 and the remaining states with population 0. The dashed curves show the relaxation of the selected state’s population towards equilibrium. For comparison, the relaxation dynamics directly computed from the molecular dynamics trajectory are shown. While the Markov model based H-bond definition fails for all states, the one based on the torsion rotamer definition fits well for many states and for all states in the high population range. The mismatch for some states in the low-population range [states and for and states , , , and for are likely to be due to insufficient sampling in the molecular dynamics trajectory (see number of long-time visits in Tables I and III]. The mismatch for states and which are frequently visited indicates the presence of considerable internal energy barriers, which means that these states should be further decomposed into substates (by increasing the number of metastable states in the model) in order to improve the fit.

## Tables

Quantitative description of the metastable states found by grouping backbone torsion rotamers in . , name of the state. , free energy difference with respect to the free energy minimum (state ) and its error. , potential energy difference. is the free energy difference due to entropy. All energies are given in kcal/mol, the uncertainties in are negligible, and thus the uncertainties in are identical to the uncertainties . , mean lifetime of the state in picoseconds and its error. , number of visits in the state long enough to be useful to fitting the mean lifetime. The lines are ordered so as to indicate how states split when going to a network with more metastable states, e.g., state for splits up into , , and for .

Quantitative description of the metastable states found by grouping backbone torsion rotamers in . , name of the state. , free energy difference with respect to the free energy minimum (state ) and its error. , potential energy difference. is the free energy difference due to entropy. All energies are given in kcal/mol, the uncertainties in are negligible, and thus the uncertainties in are identical to the uncertainties . , mean lifetime of the state in picoseconds and its error. , number of visits in the state long enough to be useful to fitting the mean lifetime. The lines are ordered so as to indicate how states split when going to a network with more metastable states, e.g., state for splits up into , , and for .

Quantitative description of the metastable states found by grouping hydrogen-bonding patterns in . See Table I for a complete description.

Quantitative description of the metastable states found by grouping hydrogen-bonding patterns in . See Table I for a complete description.

Quantitative description of the metastable states found by grouping torsion rotamers in . See Table I for a complete description.

Quantitative description of the metastable states found by grouping torsion rotamers in . See Table I for a complete description.

Quantitative description of the metastable states found by grouping hydrogen-bonding patterns in . See Table I for a complete description.

Article metrics loading...

Full text loading...

Commenting has been disabled for this content