and Abhijit Chatterjee

### Abstract

The kinetic Monte Carlo (KMC) method is a popular modeling approach for reaching large materials length and time scales. The KMC dynamics is erroneous when atomic processes that are relevant to the dynamics are missing from the KMC model. Recently, we developed the first error measure for KMC in Bhute and Chatterjee [J. Chem. Phys. 138, 084103 (2013); doi:10.1063/1.4792439]. The error measure, which is given in terms of the probability that a missing process will be selected in the correct dynamics, requires an estimate of the missing rate. In this work, we present an improved procedure for estimating the missing rate. The estimate found using the new procedure is within an order of magnitude of the correct missing rate, unlike our previous approach, where the estimate was larger by orders of magnitude. This enables one to find the error in the KMC model more accurately. In addition, we find the time for which the KMC model can be used before a maximum error in the dynamics has been reached.
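As a rough illustration of the quantities named in the abstract, the sketch below computes the probability that a missing process is selected, using the standard KMC selection rule (a process is chosen with probability equal to its rate divided by the total rate). The second function is only an assumed stand-in for the validity time: it supposes that missing events arrive as a Poisson process and returns the time at which the probability of at least one missing event reaches a chosen bound. The function names, the Poisson assumption, and the example rates are illustrative and are not taken from the paper.

```python
import math


def missing_selection_probability(k_known, k_missing):
    """Probability that the next event in the correct dynamics is a missing process.

    Uses the standard KMC selection rule: rate of the subset / total rate.
    """
    total = k_known + k_missing
    if total <= 0.0:
        raise ValueError("total rate must be positive")
    return k_missing / total


def time_to_reach_error(k_missing, delta):
    """Assumed Poisson model (not the paper's formula): time at which the
    probability of at least one missing event, 1 - exp(-k_missing * t),
    equals delta."""
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    if k_missing <= 0.0:
        return math.inf  # nothing is missing, so the bound is never reached
    return -math.log(1.0 - delta) / k_missing


# Hypothetical rates in s^-1, chosen only for illustration.
p = missing_selection_probability(k_known=9.0e9, k_missing=1.0e9)
t = time_to_reach_error(k_missing=1.0e9, delta=0.1)
print(p, t)
```

In practice the correct missing rate is unknown, which is why the abstract stresses that the error measure requires an estimate of it.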

We acknowledge helpful discussions with A. F. Voter. A.C. acknowledges support from the BRNS Young Scientist Award from the Department of Atomic Energy (DAE-BRNS), No. 2011/36/43-BRNS/1975.

I. INTRODUCTION

II. ERROR IN A KMC MODEL

A. Relevance of missing processes in the correct dynamics

B. Error in the KMC model

C. Building a KMC model using dynamical BEPS

III. IMPROVED ESTIMATE FOR THE MISSING RATE

A. Rate estimate for inaccessible timescales

B. Rate estimate for a spectral band

C. Extension to multiple spectral bands

IV. ALGORITHM FOR GENERATING A KMC MODEL OF CHOSEN ACCURACY

V. RATE ESTIMATE WHEN PROCESS TIME SCALES OVERLAP

VI. CONCLUSIONS

### Key Topics

- Monte Carlo methods
- Trajectory models
- Chaos
- Molecular dynamics
- Poisson's equation

## Figures

(a) Schematic of a basin denoted B in the potential energy surface (PES). Consider the case where four atomic processes from the basin B are found by performing basin escape pathway search (BEPS). (b) By repeating this procedure for other basins, we obtain a “kinetic map” of the PES, which we will term the KMC model. This KMC model consists of a list of states and atomic processes found. In addition, the process rates, the time for which the system resides in each basin, and the number of times each process has been witnessed in the BEPS calculations are also stored.

Schematic of the number of times processes are observed when BEPS calculations are performed in a particular basin. The x-axis denotes the time $t_B$ spent in the basin with BEPS. Numbers inside the colored vertical bands denote the process index; multiple sightings of a process are represented by multiple circles. It is likely that processes belonging to the inaccessible timescales and some of the processes from the accessible timescales are missing from the catalog.

Missing rate from a catalog $C_K$ for a basin that contains one process with rate constant $10^{9}$ s$^{-1}$. The catalog $C_K$ is initially empty. Processes from the basin are found by sampling escape pathways from the basin. The rate estimate for the inaccessible timescales (dashed line; Eq. (11)) decreases as the time $t_B$ spent in the basin increases. Circles denote the time at which the process was first observed (shown for 100 independent catalog generation calculations). The solid line denotes the correct unknown rate for one such catalog.

Missing rate for a basin with $N_p = 100$ processes. Each process has a rate constant of $10^{9}$ s$^{-1}$. Missing processes are observed as the time $t_B$ elapsed in the basin increases. The rate estimate from the inaccessible timescales (dashed line; Eq. (11)) is smaller than the correct unknown rate $k_U$. When contributions from the accessible timescale spectral band are added to the ones from the inaccessible timescales, the rate estimate (blue line) is very close to $k_U$. Symbols denote the times when the rates were computed. Here, $n_p$ denotes the number of known processes.
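The setup described in this caption can be mimicked with a small simulation. The sketch below assumes that each BEPS escape is an independent draw from the full set of equal-rate processes, with an exponentially distributed waiting time governed by the total escape rate, and that the system is returned to the basin after every escape. It tracks the time spent in the basin, the number of known processes, and the correct unknown rate, taken here as the summed rate of the processes not yet seen. The function name and the sampling scheme are assumptions made for illustration; the rate estimates of Eq. (11) and of the accessible spectral band are not reproduced.

```python
import random


def generate_catalog(n_processes, rate, n_escapes, seed=0):
    """Simulate catalog generation for a basin with equal-rate processes.

    Returns the sighting count of each process and a history of
    (time in basin, number of known processes, correct unknown rate).
    """
    rng = random.Random(seed)
    total_rate = n_processes * rate
    t_basin = 0.0
    sightings = [0] * n_processes
    history = []
    for _ in range(n_escapes):
        # Waiting time before the next escape from the basin.
        t_basin += rng.expovariate(total_rate)
        # Equal rates, so every process is equally likely to be selected.
        chosen = rng.randrange(n_processes)
        sightings[chosen] += 1
        n_known = sum(1 for s in sightings if s > 0)
        # Correct unknown rate: summed rate of the processes not yet observed.
        k_unknown = (n_processes - n_known) * rate
        history.append((t_basin, n_known, k_unknown))
    return sightings, history


sightings, history = generate_catalog(n_processes=100, rate=1.0e9, n_escapes=50)
t_final, n_known, k_unknown = history[-1]
print(f"t_B = {t_final:.3e} s, known processes = {n_known}, k_U = {k_unknown:.3e} s^-1")
```

Running it with more escapes shows the unknown rate falling to zero once every process has been seen at least once.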

Number of processes observed $m$ times when the catalog in Fig. 4 was generated, after a total of $m_t = 50$ escapes from the basin. The parameter value that results in the least sum of squared errors with respect to the histogram data is regarded as an estimate for the number of processes in a spectral band.

Sum of squared errors (SSE) obtained from a catalog being generated. Three complete catalogs with the number of processes $N_p$ being (a) 10, (b) 100, and (c) 1000 were considered ($N_p$ is indicated by the dashed vertical line). All processes have an identical rate constant, $k = 10^{9}$ s$^{-1}$. SSE is plotted for different values of the parameter in Eq. (17). The parameter value that gives the smallest SSE is the best estimate for the number of processes in the spectral band.
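The least-squares idea in this caption can be sketched as follows. Eq. (17) is not reproduced on this page, so the model used here is an assumption: for a band of N equal-rate processes and $m_t$ escapes, the number of times a given process is seen is taken to be binomial with success probability 1/N, which gives the expected number of processes seen exactly $m$ times. The candidate N with the smallest SSE against the observed histogram is returned. All function names and the example sighting counts are hypothetical.

```python
from math import comb


def expected_counts(n_total, m_t):
    """Expected number of processes seen exactly m times, for m = 1..m_t,
    assuming each escape picks one of n_total equal-rate processes uniformly."""
    p = 1.0 / n_total
    return [n_total * comb(m_t, m) * p**m * (1.0 - p) ** (m_t - m)
            for m in range(1, m_t + 1)]


def sse(observed, n_total, m_t):
    """Sum of squared errors between observed and expected histograms."""
    expected = expected_counts(n_total, m_t)
    return sum((observed.get(m, 0) - e) ** 2
               for m, e in zip(range(1, m_t + 1), expected))


def estimate_n_processes(sightings, n_max):
    """Scan candidate band sizes and return the one with the smallest SSE."""
    m_t = sum(sightings)
    observed = {}
    for s in sightings:
        if s > 0:  # processes never seen cannot appear in the histogram
            observed[s] = observed.get(s, 0) + 1
    n_known = sum(observed.values())
    candidates = range(max(n_known, 1), n_max + 1)
    return min(candidates, key=lambda n: sse(observed, n, m_t))


# Hypothetical sighting counts: 30 processes seen once, 7 twice, 2 three times.
sightings = [1] * 30 + [2] * 7 + [3] * 2
print(estimate_n_processes(sightings, n_max=2000))
```

The scan starts at the number of known processes because the band cannot contain fewer processes than have already been observed. The sketch is intended for a modest number of escapes; very large $m_t$ would need a numerically safer evaluation of the binomial terms.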

Histogram of the estimated number of processes in the spectral band, obtained from different catalogs generated for a basin that contains $N_p = 100$ processes, each having a rate constant $k = 10^{9}$ s$^{-1}$. The histogram was obtained after $m_t$ escapes had occurred.

Estimate for the missing rate for a catalog generated with BEPS. The catalog contains (a) $N_p = 100$ and (b) $N_p = 1000$ processes. Each process has a rate constant $k = 10^{9}$ s$^{-1}$. The black line in panels (a) and (b) denotes the correct unknown rate $k_U$. Red dashed and blue solid lines denote the rate estimates from Eqs. (11) and (22), respectively. Symbols denote the times when the rate estimate was obtained. The estimate is non-zero even though $k_U$ becomes zero after some time. (c) Validity times for the catalogs generated in panels (a) and (b) using $\delta = 0.1$.

Estimate for the missing rate from a catalog $C_K$ generated for a basin using BEPS. Two spectral bands are present. The first band contains $N_{p1} = 100$ processes with individual rate constant $k_1 = 10^{9}$ s$^{-1}$, while the second band contains $N_{p2} = 100$ processes with rate constant (a) $k_2 = 10^{6}$ s$^{-1}$, (b) $k_2 = 10^{7}$ s$^{-1}$, and (c) $k_2 = 10^{8}$ s$^{-1}$. The black line denotes the correct unknown rate. Dashed red and solid blue lines denote the estimates from Eqs. (11) and (23), respectively. Symbols denote the times at which the unknown rate was computed.

Estimate for the missing rate for a particular basin with (a) $N_p = 100$ and (b) $N_p = 1000$ processes. The rate constants and the three spectral bands to which the rates belong for panel (a) are shown in the inset by the vertical lines and the shaded area, respectively. The black line denotes the correct unknown rate. Dashed red and solid blue lines denote the estimates from Eqs. (11) and (23), respectively.

(a) Number of processes observed for the catalog in Fig. 10 as a function of the time $t_B$ spent in the basin. All processes are known once the times in the shaded area are reached. (b) Validity time for the catalog as it is being generated. (c) The average observed error $\delta_{\mathrm{obs}}$ is less than the target error $\delta = 0.1$. Averaging was performed over 100 independent KMC runs.

Histogram of the average observed error $\delta_{\mathrm{obs}}$ for different values of the target error $\delta$: (a) 0.001, (b) 0.01, and (c) 0.1 (shown as a percentage in the figure), for the catalog in Fig. 10. The shaded area shows cases where the error does not exceed $\delta$. Out of 100 different catalogs generated, the number of catalogs with an error greater than $\delta$ was 14, 7, and 6 for panels (a), (b), and (c), respectively.
