^{1}, Eugene I. Shakhnovich^{2,a)} and Patrícia F. N. Faísca^{1,a)}

### Abstract

We performed extensive lattice Monte Carlo simulations of ribosome-bound stalled nascent chains (RNCs) to explore the relative roles of native topology and non-native interactions in co-translational folding of small proteins. We found that the formation of a substantial part of the native structure generally occurs towards the end of protein synthesis. However, multi-domain structures, which are rich in local interactions, are able to develop gradually during chain elongation, while those with proximate chain termini require full protein synthesis to fold. A detailed assessment of the conformational ensembles populated by RNCs of different lengths reveals that the directionality of protein synthesis has a fine-tuning effect on the probability of populating low-energy conformations. In particular, if the participation of non-native interactions in folding energetics is mild, the formation of native-like conformations is largely determined by the properties of the contact map around the tethering terminus. Likewise, a pair of RNCs differing by only 1-2 residues can populate structurally well-resolved low-energy conformations with significantly different probabilities. An interesting structural feature of these low-energy conformations is that, irrespective of native structure, their non-native interactions are always long-ranged and marginally stabilizing. A comparison between the conformational spectra of RNCs and of chain fragments folding freely in the bulk reveals drastic differences between the two setups, depending on the native structure. The comparison also shows that the ribosome may enhance (by up to 20%) the population of low-energy conformations for chains folding to native structures dominated by local interactions. In contrast, an RNC folding to a non-local topology is forced to remain largely unstructured, although the same chain can attain low-energy conformations in bulk.

P.F.N.F. and H.K. thank the Fundação para a Ciência e a Tecnologia for financial support through Grant No. PTDC/FIS/113638/2009 (to the author P.F.N.F.).

INTRODUCTION

MODELS AND METHODS

The simple lattice model

The Gō potential

The sequence-specific potential

Sequence design

Monte Carlo simulation

Simulation setup

Statistical error

RESULTS

Topological determinants of co-translational folding: The importance of modular structure and location of chain termini

Sequence specificity in co-translational folding: Role of non-native interactions in structure development

Directionality of protein synthesis and conformational search in co-translational folding

Structural characterization of conformations populated during co-translational folding: Stabilizing non-native interactions are typically long-ranged

Comparison with bulk experiments: How the ribosome affects conformational search during co-translational folding

CONCLUSIONS

### Key Topics

- Proteins
- Protein folding
- Topology
- Conformational dynamics
- Monte Carlo methods

## Figures

(a) Cartoon representation of the E. coli ribosome highlighting the exit tunnel and the vectorial character of protein synthesis, which proceeds from the N-terminus to the C-terminus; the C-terminus is tethered at the PTC (peptidyl transferase center) on the large ribosomal subunit. (b) Simulation setup used in this work. RNCs of increasing chain length, represented on the cubic lattice, are tethered to a planar surface modeling the ribosome by a linker one lattice spacing long. (c) RNC with chain length 44. The lattice model does not distinguish between the N- and C-termini; the MC simulations are therefore performed with either the first or the last bead free in the role of the N-terminus, which is the first residue to be extruded from the exit tunnel. A completely synthesized chain with residue 1 as N-terminus will be tethered at residue 48 and vice versa.

(a)–(d) Dependence of the average number of contacts (total, native, and non-native), normalized to the total number of native contacts (C_N = 57), on the length of the RNC (normalized to the length of the completely synthesized chain) for topology 1 in the two considered setups, i.e., bead 1 as N-terminus (circles) and bead 48 as N-terminus (triangles). The blue curve represents the averaged fraction of established native contacts, Q. Sequence 1 is not robustly folded into topology 1 at l = 1 because surface tethering enhances the population of compact intermediate states with both native- and non-native-like features.^{46} The thin lines shown in the plots indicate the maximum number of native interactions that may be established in each considered chain fragment, normalized to the total number of contacts in the native conformation; we term them theoretical lines. For the theoretical maximum to be attained, the incomplete chain must adopt the same conformation it adopts in the native structure. The dotted line indicates fragment length l = 0.92.
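
The theoretical lines described in this caption can be illustrated with a minimal Python sketch. It assumes that a fragment of n beads consists of the n beads nearest the N-terminus and that a native contact is available only when both of its beads lie inside the fragment. The function name `theoretical_fraction` and the 6-bead contact map are hypothetical and are not taken from the paper.

```python
def theoretical_fraction(native_contacts, chain_length, fragment_length,
                         first_bead_is_n_terminus=True):
    """Upper bound on the fraction of native contacts a fragment can form.

    native_contacts: iterable of (i, j) bead pairs, 1-indexed.
    Assumption: the fragment holds the beads synthesized first,
    i.e. the ones closest to the N-terminus.
    """
    contacts = list(native_contacts)
    if first_bead_is_n_terminus:
        low, high = 1, fragment_length
    else:
        low, high = chain_length - fragment_length + 1, chain_length
    # A contact is possible only if both partners are already in the fragment.
    possible = sum(1 for i, j in contacts
                   if low <= i <= high and low <= j <= high)
    return possible / len(contacts)


# Hypothetical 6-bead contact map, for illustration only (not from the paper).
toy_contacts = [(1, 4), (1, 6), (2, 5)]
from_first = theoretical_fraction(toy_contacts, 6, 5, True)   # beads 1..5
from_last = theoretical_fraction(toy_contacts, 6, 5, False)   # beads 2..6
print(from_first, from_last)

# Normalized length of a 44-bead fragment of the 48-bead chain.
normalized_length = round(44 / 48, 2)
print(normalized_length)
```

In this toy map the same fragment length gives different upper bounds for the two synthesis directions, which is the kind of asymmetry the circles and triangles in the figure compare.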

(a)–(d) Dependence of the average number of contacts (total, native, or non-native), normalized to the total number of native contacts (C_N = 57), on the length of the RNC (normalized to the length of the completely synthesized chain) for topology 2 in the two considered setups, i.e., bead 1 playing the role of N-terminus (circles) and bead 48 playing the role of N-terminus (triangles). The blue curve represents the averaged fraction of established native contacts, Q.

Probability distribution function of conformational energy E, illustrating the effect of sequence specificity on the conformational ensemble explored by an RNC with chain size l = 0.92 (corresponding to 44 beads) in the considered folding setups. (a) Strongly participating non-native interactions (as in sequence 1 folding to topology 1 (T1) and sequence 1 folding to topology 2 (T2)) broaden the distribution of accessible energies but impede sampling of low-energy conformations. (b) Low-energy conformations are populated only by the sequences for which non-native interactions are less conspicuous during folding (i.e., sequence 2 folding to T1 and sequence 2 folding to T2).

Characterization of low-energy conformational states populated by incomplete sequences with chain size l = 0.92 (corresponding to 44 beads). Panels (a) and (b) show the probability distribution function (PDF) for the number of native and non-native contacts in ensembles of conformations with energy E ∼ −20 populated by sequence 1 folding to topology 1 when bead 1 (a) or bead 48 (b) plays the role of the N-terminus (colored in blue). The conformation shown on the right-hand side of each PDF has a number of native interactions equal to the mode of the corresponding distribution and a high number of established non-native interactions. In particular, the conformation shown in (a) has a fraction of native contacts Q = 0.65 and eight non-native interactions (of absolute CO = 9.4) that represent 9% of the conformation's total energy. The conformation in (b) has Q = 0.52 and 19 non-native interactions (of absolute CO = 18), which represent 4% of the conformation's total energy. Panel (c) shows the unique conformation populated by (incomplete) sequence 2 folding to topology 1 at energy E = −20.5 when the N-terminus is represented by bead 1; this conformation has Q = 0.80 and three long-ranged non-native interactions (of absolute CO = 23.7) that are marginally stabilizing, representing 1% of the conformation's total energy. Panel (d) shows the PDF for native and non-native contacts (at E = −20.5) when the N-terminus is represented by bead 48. The representative conformation has Q = 0.81 and four stabilizing (5%) long-ranged non-native contacts (of absolute CO = 22.5). In the representative conformations, the beads that are in their native conformation are colored red or green according to the structural module to which they belong in the native structure (e).
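
The two descriptors quoted in this caption, Q and the absolute contact order (CO) of the non-native interactions, can be sketched as follows. The sketch assumes that Q is normalized to the total number of native contacts and that absolute CO is the mean sequence separation |i − j| over the contacts considered; the paper's exact definitions are not given in this excerpt. The function names and the 8-bead contact lists are hypothetical.

```python
def _normalize(contacts):
    """Store each contact as an ordered (i, j) pair with i < j."""
    return {tuple(sorted(pair)) for pair in contacts}


def fraction_native(observed_contacts, native_contacts):
    """Q: share of the native contacts present in a conformation."""
    native = _normalize(native_contacts)
    return len(_normalize(observed_contacts) & native) / len(native)


def non_native(observed_contacts, native_contacts):
    """Observed contacts that do not appear in the native contact map."""
    return sorted(_normalize(observed_contacts) - _normalize(native_contacts))


def absolute_contact_order(contacts):
    """Mean sequence separation |i - j| over a set of contacts."""
    pairs = list(contacts)
    if not pairs:
        return 0.0
    return sum(abs(i - j) for i, j in pairs) / len(pairs)


# Hypothetical 8-bead example, for illustration only (not from the paper).
toy_native = [(1, 4), (1, 6), (2, 5), (3, 8)]
toy_observed = [(1, 4), (2, 5), (1, 8), (3, 6)]

q = fraction_native(toy_observed, toy_native)
nn = non_native(toy_observed, toy_native)
co_nn = absolute_contact_order(nn)
print(q, nn, co_nn)
```

A larger absolute CO for the non-native set means that those contacts join beads far apart along the chain, which is the sense in which the caption calls them long-ranged.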

The probability distribution function for conformational energy, E, sampled by incomplete sequences (l = 0.92) folding to topology 1 (left) and topology 2 (right) in the bulk and in an RNC with bead 1 playing the role of N-terminus. The effect of a nearby surface on conformational sampling clearly depends on native structure. In particular, sequences folding to T1, (a) and (b), explore very similar ensembles of conformations in both bulk and surface-tethered setups. However, the probability that these sequences populate low-energy conformations increases (by 33% for S1 and 12% for S2) upon surface tethering. On the other hand, sequences folding to topology 2 exhibit remarkable changes in conformational sampling upon surface tethering. In this case, the presence of a tethering surface biases conformational sampling towards the unfolded state, while free chains folding in bulk are able to populate low-energy conformations with non-negligible (c) to high (d) probabilities. The presented distributions are based on one statistically representative dataset out of the five independent datasets used to compute the data reported in Table I.

## Tables

Comparison of the frequency of low-energy conformations in bulk and surface-tethered nascent chains at l_1 = 0.92 and l_2 = 0.98. R denotes the ratio of the number of surface-sampled configurations to the number of bulk-sampled configurations for a particular topology-sequence (T-S) pair. In all cases, the first residue plays the role of the N-terminus. The presented data result from averaging over five independent simulations.
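
As a minimal sketch of how a ratio such as R could be computed from five independent runs, the Python snippet below averages the counts in each setup and then divides. Whether the paper averages the counts first or the per-run ratios first is not stated in this caption, so the order used here is an assumption. The function name and all counts are hypothetical placeholders, not data from Table I.

```python
from statistics import mean


def surface_to_bulk_ratio(surface_counts, bulk_counts):
    """R: mean count of low-energy conformations sampled by the tethered
    chain divided by the mean count sampled by the free chain in bulk.

    Assumption: counts are averaged over runs before taking the ratio.
    """
    return mean(surface_counts) / mean(bulk_counts)


# Placeholder counts from five hypothetical runs (not data from the paper).
surface_runs = [120, 130, 110, 125, 115]
bulk_runs = [100, 95, 105, 100, 100]

ratio = surface_to_bulk_ratio(surface_runs, bulk_runs)
print(ratio)
```

Under this definition, R above 1 means that tethering increases the population of low-energy conformations, and R below 1 means that it depletes it.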
