^{1,a),b)}, Mano Ram Maurya

^{2,a),c)}, Daniel M. Tartakovsky

^{1,d)}and Shankar Subramaniam

^{2,3,e)}

### Abstract

Many biochemical processes at the sub-cellular level involve a small number of molecules. The local numbers of these molecules vary in space and time, and exhibit random fluctuations that can only be captured with stochastic simulations. We present a novel stochastic operator-splitting algorithm to model such reaction-diffusion phenomena. The reaction and diffusion steps employ stochastic simulation algorithms and Brownian dynamics, respectively. Through theoretical analysis, we have developed an algorithm to identify if the system is reaction-controlled, diffusion-controlled or is in an intermediate regime. The time-step size is chosen accordingly at each step of the simulation. We have used three examples to demonstrate the accuracy and robustness of the proposed algorithm. The first example deals with diffusion of two chemical species undergoing an irreversible bimolecular reaction. It is used to validate our algorithm by comparing its results with the solution obtained from a corresponding deterministic partial differential equation at low and high number of molecules. In this example, we also compare the results from our method to those obtained using a Gillespie multi-particle (GMP) method. The second example, which models simplified RNA synthesis, is used to study the performance of our algorithm in reaction- and diffusion-controlled regimes and to investigate the effects of local inhomogeneity. The third example models reaction-diffusion of CheY molecules through the cytoplasm of Escherichia coli during chemotaxis. It is used to compare the algorithm's performance against the GMP method. Our analysis demonstrates that the proposed algorithm enables accurate simulation of the kinetics of complex and spatially heterogeneous systems. It is also computationally more efficient than commonly used alternatives, such as the GMP method.

We acknowledge the UCSD Triton Resource of San Diego Supercomputer Center (SDSC) used in this work. This research was supported by the National Heart, Lung and Blood Institute (NHLBI) Grant No. 5 R33 HL087375-02; National Science Foundation (NSF) Grant Nos. DBI-0641037, DBI-0835541, and STC-0939370; and by the Department of Energy (DOE) Office of Science, Advanced Scientific Computing Research (ASCR) program in Applied Mathematical Sciences.

I. INTRODUCTION

II. METHODS: NUMERICAL APPROACH

A. Operator-splitting method

B. Algorithm for the stochastic operator-splitting method

1. Dynamic identification of system's state

2. Algorithm

III. RESULTS: CASE STUDIES

A. Synthetic reaction-diffusion case study

1. Performance analysis

2. Effect of number of molecules

B. Gene expression case study

1. Reaction- vs. diffusion-limited processes

2. Time-step selection

3. Comparison with GMP method and stochastic effect

C. CheY diffusion case study

1. System description

2. Simulation results

IV. SUMMARY AND DISCUSSION

### Key Topics

- Diffusion
- 61.0
- Brownian dynamics
- 24.0
- Cellular automata
- 17.0
- Partial differential equations
- 16.0
- Biochemical reactions
- 14.0

##### C12

## Figures

(a) Schematic representation of the diffusion-reaction operator-splitting. The final value after diffusion process at time t + Δt is used as the initial value for the reaction process. Final value of reaction process is the final value at the end of diffusion-reaction process. (b) and (c) Cellular automata neighborhoods in d = 2 dimension: in the von Neumann automata the probability of staying in a cell or diffusing to its neighbors is 1/5 (b), in the Moore automata this probability is 1/9 (c).

(a) Schematic representation of the diffusion-reaction operator-splitting. The final value after diffusion process at time t + Δt is used as the initial value for the reaction process. Final value of reaction process is the final value at the end of diffusion-reaction process. (b) and (c) Cellular automata neighborhoods in d = 2 dimension: in the von Neumann automata the probability of staying in a cell or diffusing to its neighbors is 1/5 (b), in the Moore automata this probability is 1/9 (c).

(a) Histogram of ln (1/r), where r is a uniformly distributed random variable in [0,1]. (b) Cumulative fraction of counts out of total counts (one million). About 63% of the numbers have values less than 1 and 86% of the numbers are less than 2.

(a) Histogram of ln (1/r), where r is a uniformly distributed random variable in [0,1]. (b) Cumulative fraction of counts out of total counts (one million). About 63% of the numbers have values less than 1 and 86% of the numbers are less than 2.

A + B → C case study: (a) Initially, species A and B exist only in left-hand side. All A and B molecules and their product P diffuse with the same diffusion constant. (b) Comparison of results from analytical solution, cellular automata (CA) and Brownian dynamics (BD). The Brownian dynamics results agree with the analytical solution, while the cellular automata results do not. The increasing curves represent the number of molecules in the right half of the domain and the decreasing ones in the left half.

A + B → C case study: (a) Initially, species A and B exist only in left-hand side. All A and B molecules and their product P diffuse with the same diffusion constant. (b) Comparison of results from analytical solution, cellular automata (CA) and Brownian dynamics (BD). The Brownian dynamics results agree with the analytical solution, while the cellular automata results do not. The increasing curves represent the number of molecules in the right half of the domain and the decreasing ones in the left half.

A + B → C case study: Effect of diffusion constant, D (m ^{2}/s), on (a) τ D (or Δt) and (b) computational time for our method and the GMP method. As D increases, τ D (or Δt) decreases and computational time increases for both algorithms. For D = 10^{−14} m^{2}/s, the system transitions from diffusion-controlled (Δt = k 2τ D ; k 2 = 2) to reaction-controlled regime during the time-course. For D ⩾ 10^{−13} m^{2}/s, the system becomes reaction-controlled (Δt = 10τ D ), explaining the increase in the absolute value of the slope of Δt or computational time vs. D plots at D = 10^{−13} m^{2}/s for our method.

A + B → C case study: Effect of diffusion constant, D (m ^{2}/s), on (a) τ D (or Δt) and (b) computational time for our method and the GMP method. As D increases, τ D (or Δt) decreases and computational time increases for both algorithms. For D = 10^{−14} m^{2}/s, the system transitions from diffusion-controlled (Δt = k 2τ D ; k 2 = 2) to reaction-controlled regime during the time-course. For D ⩾ 10^{−13} m^{2}/s, the system becomes reaction-controlled (Δt = 10τ D ), explaining the increase in the absolute value of the slope of Δt or computational time vs. D plots at D = 10^{−13} m^{2}/s for our method.

A + B → C case study: (a) and (b) As cell size becomes finer, the results from our approach converge to the numerical solution. L is the number of cells along each direction (larger L represents finer cell size). The vertical axis shows the number of C molecules produced in the left half of the domain. The results are based on average of 8 realizations. (a) shows the results of reaction-controlled system, whereas (b) is for a diffusion-controlled system. (c) and (d) Dashed line is the result of reaction-first and diffusion-later order and dotted line is the reverse order. Diffusion-reaction ordering has better agreement with PDE solution than the reaction-diffusion order. It is because molecules cannot diffuse during the time-step if we treat reaction first. This ordering is effective for both diffusion-controlled and reaction-controlled systems. (e) Black line denotes the result of PDE solution. Three gray lines are for the GMP method. In the GMP method, time-step Δt is related to the cell size. So, it takes longer time to simulate the system with finer cell size. (f) As the initial number of molecules gets lower, the difference rate between deterministic solution and our stochastic solution increases. The fluctuations also increase. This result proves that the stochastic solution approaches the deterministic solution as the number of molecules (or concentration if volume is fixed) increases. In other words, randomness or stochasticity is less important at higher concentrations.

A + B → C case study: (a) and (b) As cell size becomes finer, the results from our approach converge to the numerical solution. L is the number of cells along each direction (larger L represents finer cell size). The vertical axis shows the number of C molecules produced in the left half of the domain. The results are based on average of 8 realizations. (a) shows the results of reaction-controlled system, whereas (b) is for a diffusion-controlled system. (c) and (d) Dashed line is the result of reaction-first and diffusion-later order and dotted line is the reverse order. Diffusion-reaction ordering has better agreement with PDE solution than the reaction-diffusion order. It is because molecules cannot diffuse during the time-step if we treat reaction first. This ordering is effective for both diffusion-controlled and reaction-controlled systems. (e) Black line denotes the result of PDE solution. Three gray lines are for the GMP method. In the GMP method, time-step Δt is related to the cell size. So, it takes longer time to simulate the system with finer cell size. (f) As the initial number of molecules gets lower, the difference rate between deterministic solution and our stochastic solution increases. The fluctuations also increase. This result proves that the stochastic solution approaches the deterministic solution as the number of molecules (or concentration if volume is fixed) increases. In other words, randomness or stochasticity is less important at higher concentrations.

Gene expression case study: (a) Dash-dotted line shows the result of Gillespie algorithm which deals with only reaction process. For D = 10^{−12} m^{2}/s with L = 5, our results are similar to those obtained with the Gillespie algorithm, because it is a reaction-limited process so that diffusion does not have serious impact on the system. On the contrary, the case of D = 10^{−15} m^{2}/s with L = 5 exhibits long time lag to reach the steady-state value due to diffusion effect and has much larger fluctuations. (b) In case of L = 20, the mesh is much finer than the above cases. The results are similar to those in (a) as our algorithm is able to adjust time-steps according to the system characteristics even though L is increased from 5 to 20.

Gene expression case study: (a) Dash-dotted line shows the result of Gillespie algorithm which deals with only reaction process. For D = 10^{−12} m^{2}/s with L = 5, our results are similar to those obtained with the Gillespie algorithm, because it is a reaction-limited process so that diffusion does not have serious impact on the system. On the contrary, the case of D = 10^{−15} m^{2}/s with L = 5 exhibits long time lag to reach the steady-state value due to diffusion effect and has much larger fluctuations. (b) In case of L = 20, the mesh is much finer than the above cases. The results are similar to those in (a) as our algorithm is able to adjust time-steps according to the system characteristics even though L is increased from 5 to 20.

Gene expression case study: (a) The result of GMP algorithm for various L and the corresponding Δt (=τ D ) values. Diffusion constant has a fixed value, D = 10^{−15} m^{2}/s. For L = 20 ( s and τ D = 0.417 s), the number of P molecules does not reach its steady-state value of around 1000 because (Table III ). This means reactions cannot fully take place during Δt. (b) Our algorithm performs well for both cases of L because it classifies the system as diffusion- or reaction-controlled and decides the appropriate time-steps accordingly. (c) Fluctuations become smaller as the number of realizations become larger. In comparison to the Gillespie algorithm, our method has much higher fluctuations because it considers spatial randomness as well as randomness due to the small number of molecules.

Gene expression case study: (a) The result of GMP algorithm for various L and the corresponding Δt (=τ D ) values. Diffusion constant has a fixed value, D = 10^{−15} m^{2}/s. For L = 20 ( s and τ D = 0.417 s), the number of P molecules does not reach its steady-state value of around 1000 because (Table III ). This means reactions cannot fully take place during Δt. (b) Our algorithm performs well for both cases of L because it classifies the system as diffusion- or reaction-controlled and decides the appropriate time-steps accordingly. (c) Fluctuations become smaller as the number of realizations become larger. In comparison to the Gillespie algorithm, our method has much higher fluctuations because it considers spatial randomness as well as randomness due to the small number of molecules.

CheY diffusion case study: (a) E. coli has length [2.48 0.88 0.88] μm along x, y, and z direction. R denotes receptor cluster located on the anterior wall and is represented by [x_min x_max; y_min y_max; z_min z_max] = [0 0.08; 0.16 0.64; 0.16 0.64] μm. CheY molecules are phosphorylated in the receptor cluster and diffuse into the cytoplasm. M1∼M4 show the location of the four flagellar motors on side walls and are located in M1 = [0.48 0.56; 0.40 0.48; 0 0.08] μm, M2 = [0.96 1.04; 0 0.08; 0.40 0.48] μm, M3 = [1.44 1.52; 0.40 0.48; 0.80 0.88] μm, and M4 = [1.92 2.00; 0.80 0.88; 0.40 0.48] μm. The remaining domain is considered as cytoplasm. (b) and (c) The simulation results from both the GMP method and our method are in good agreement although there are some differences in rise-time because the GMP method and our method use cellular automata and Brownian dynamics, respectively, to model the diffusion process (as explained in Fig. 3 ).

CheY diffusion case study: (a) E. coli has length [2.48 0.88 0.88] μm along x, y, and z direction. R denotes receptor cluster located on the anterior wall and is represented by [x_min x_max; y_min y_max; z_min z_max] = [0 0.08; 0.16 0.64; 0.16 0.64] μm. CheY molecules are phosphorylated in the receptor cluster and diffuse into the cytoplasm. M1∼M4 show the location of the four flagellar motors on side walls and are located in M1 = [0.48 0.56; 0.40 0.48; 0 0.08] μm, M2 = [0.96 1.04; 0 0.08; 0.40 0.48] μm, M3 = [1.44 1.52; 0.40 0.48; 0.80 0.88] μm, and M4 = [1.92 2.00; 0.80 0.88; 0.40 0.48] μm. The remaining domain is considered as cytoplasm. (b) and (c) The simulation results from both the GMP method and our method are in good agreement although there are some differences in rise-time because the GMP method and our method use cellular automata and Brownian dynamics, respectively, to model the diffusion process (as explained in Fig. 3 ).

## Tables

The tunable parameter k 1 is used as a criterion to decide if the system is diffusion- or reaction-controlled. As the probability of τ R being less than τ D increases, the system becomes more diffusion-controlled. The other parameter, k 2, is related to the probability of a reaction taking place during Δt. As k 2 increases, the probability of a reaction occurrence during Δt increases. In our algorithm, k 1 = 0.5, , k 2 = 2, and are used. Please refer to Fig. 2 .

The tunable parameter k 1 is used as a criterion to decide if the system is diffusion- or reaction-controlled. As the probability of τ R being less than τ D increases, the system becomes more diffusion-controlled. The other parameter, k 2, is related to the probability of a reaction taking place during Δt. As k 2 increases, the probability of a reaction occurrence during Δt increases. In our algorithm, k 1 = 0.5, , k 2 = 2, and are used. Please refer to Fig. 2 .

A synthetic reaction-diffusion system A + B → C with diffusion constant D = 10^{−12} m^{2}/s. The system is reaction-controlled for L = 4 and 8. In these cases, we set the average value of all Δt to . Computational time increases with smaller Δt or larger L (smaller cell size). For L = 16, the system undergoes transitions between the mixed and diffusion-controlled regimes. In this case, . As L increases, τ D (or Δt) decreases and computational time increases for both algorithms. However, for any L, our method is faster than the GMP method. The relative error-rate is also shown (see Eq. (12) and related text). As L increases, the relative error-rate decreases for both methods. However, our algorithm is more accurate than the GMP method. In a similar way, for a given L, as D increases, τ D (or Δt) decreases, and computational time increases for both algorithms.

A synthetic reaction-diffusion system A + B → C with diffusion constant D = 10^{−12} m^{2}/s. The system is reaction-controlled for L = 4 and 8. In these cases, we set the average value of all Δt to . Computational time increases with smaller Δt or larger L (smaller cell size). For L = 16, the system undergoes transitions between the mixed and diffusion-controlled regimes. In this case, . As L increases, τ D (or Δt) decreases and computational time increases for both algorithms. However, for any L, our method is faster than the GMP method. The relative error-rate is also shown (see Eq. (12) and related text). As L increases, the relative error-rate decreases for both methods. However, our algorithm is more accurate than the GMP method. In a similar way, for a given L, as D increases, τ D (or Δt) decreases, and computational time increases for both algorithms.

Gene expression case study: Reaction time is averaged over 256 realizations of a simplified gene expression process. As cell sizes become smaller reaction times increase, since fewer molecules in each cell imply lower probability for reactions to take place within a cell. The time-step Δt in the GMP method equals τ D , whereas Δt in our method can vary according to the system classification as reaction- or diffusion-controlled. Since the cases of L = 5 and 10 are diffusion-controlled, we set . For L = 20, the system changes from diffusion- to reaction-controlled as time progresses.

Gene expression case study: Reaction time is averaged over 256 realizations of a simplified gene expression process. As cell sizes become smaller reaction times increase, since fewer molecules in each cell imply lower probability for reactions to take place within a cell. The time-step Δt in the GMP method equals τ D , whereas Δt in our method can vary according to the system classification as reaction- or diffusion-controlled. Since the cases of L = 5 and 10 are diffusion-controlled, we set . For L = 20, the system changes from diffusion- to reaction-controlled as time progresses.

Article metrics loading...

Full text loading...

Commenting has been disabled for this content