A closer look into the α-helix basin

α-Helices are the most abundant structures found within proteins and play an important role in determining the global structure of proteins and their function. Representation of α-helical structures with the common (φ, ψ) dihedrals, as in Ramachandran maps, does not provide informative details regarding the helical structure apart from the abstract geometric meaning of the dihedrals. We present an alternative coordinate system that describes helical conformations in terms of residues per turn (ρ) and the angle (ϑ) between backbone carbonyls and the helix direction, obtained through an approximate linear transformation between the two coordinate systems (φ, ψ and ρ, ϑ). In this way, valuable information on the helical structure becomes directly available. Analysis of α-helical conformations acquired from the Protein Data Bank (PDB) demonstrates that a conformational energy function of the α-helix backbone can be harmonically approximated on the (ρ, ϑ) space, which is not possible on the (φ, ψ) space because of the diagonal distribution of the conformations. The observed trends of helical conformations obtained from the PDB are captured by four conceptual simulations that theoretically examine the effects of residue bulkiness, an external electric field, and externally applied mechanical forces. Flory’s isolated pair hypothesis is shown to be partially correct for α-helical conformations.

Scientific Reports | 6:38341 | DOI: 10.1038/srep38341

Following these advances, many efforts have been invested in the study of α-helices; however, little has been done to understand the conformations of α-helices within the α-helix basin. Thus, the purpose of this study is to provide a deeper look into the different conformations of α-helices.

Results
We wish to find a mathematical relation between the (ϕ, ψ) and (ρ, ϑ) coordinate systems, where (ϕ, ψ) are the commonly used dihedral angles, ρ is the number of residues per single turn of the α-helix, and ϑ is the angle between the backbone carbonyl (CO) normal of a given residue and the normal of the α-helix direction (ϑ is positive for a CO normal pointing outwards from the α-helix center), as illustrated in Fig. 1a. The dependence of ρ and ϑ on (ϕ, ψ) is shown in Fig. 1c and d, respectively. The α-helix basin was determined analytically by using the HB alignment score S, such that regions with S > 0 are treated as the α-helix basin in this study. Figure 1b illustrates the evaluated score S as a function of (ϕ, ψ) and shows the shape and location of the α-helix basin. Next, we derived a linear transformation from (ϕ, ψ) coordinates to (ρ, ϑ) coordinates (Equation 1), for which the errors of ρ and ϑ are Δρ < 0.2 [Res/Turn] and Δϑ < 2.4° for S > 0. Figure 1e presents the alignment score S on the resulting (ρ, ϑ) space.
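The coefficients of the linear transformation are not reproduced in this text. As a rough, independent cross-check of the ρ axis only, the sketch below uses the textbook relation for helices built from trans peptide units, 3 cos Ω = 1 − 4 cos²((ϕ + ψ)/2), where Ω is the rotation per residue about the helix axis. This relation is not the transformation derived in this study, and the function name `residues_per_turn` is hypothetical. It illustrates why conformations along a ϕ + ψ = const diagonal share nearly the same number of residues per turn.

```python
import math

def residues_per_turn(phi_deg, psi_deg):
    """Approximate residues per turn of a trans-peptide helix.

    Textbook relation (not Equation 1 of this study):
        3 * cos(Omega) = 1 - 4 * cos^2((phi + psi) / 2)
    Omega is the rotation per residue about the helix axis.
    """
    half_sum = math.radians((phi_deg + psi_deg) / 2.0)
    cos_omega = (1.0 - 4.0 * math.cos(half_sum) ** 2) / 3.0
    omega_deg = math.degrees(math.acos(cos_omega))
    return 360.0 / omega_deg

# Two points on the same phi + psi = -107.8 deg diagonal give the same value.
rho_a = residues_per_turn(-49.7, -58.1)
rho_b = residues_per_turn(-60.0, -47.8)
```

For the diagonal ϕ + ψ = −107.8° this approximation gives roughly 3.7 residues per turn, which lies within the stated Δρ < 0.2 [Res/Turn] of the 3.62 reported for the HB alignment optimum.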
Figure 1 (caption, panels c-e). (c) Calculated number of residues per single α-helix turn for regions with S > 0. (d) Calculated angle ϑ between a CO normal and the helix normal for regions with S > 0. The approximate linear dependence of (ρ, ϑ) on (ϕ, ψ) may be clearly seen in both (c) and (d). (e) The α-helix basin HB alignment score on the linearly approximated (ρ, ϑ) coordinates system. The optimal conformation for HB alignment on the (ρ, ϑ) space is found approximately at ρ = 3.62 ± 0.2 [Res/Turn] and ϑ = 1.1 ± 2.4°. PDB animations that demonstrate the change of ρ and ϑ in α-helices can be found in supplementary material SM1_rho.pdb and SM2_theta.pdb, respectively.

In this study we distinguish between 400 naturally occurring AA pairs, which we name transitions. Naming pairs as transitions emphasizes the importance of directionality when dealing with polypeptides. On the (ϕ, ψ) space, a transitional conformation is the (ϕ, ψ) pair describing the conformation of the transition from AA X to AA Y. For a transition advancing from the N to the C terminus with the backbone atoms N_X-Cα_X-C_X-N_Y-Cα_Y-C_Y, we define the transitional ψ = ψ_X→Y dihedral as N_X-Cα_X-C_X-N_Y and the transitional ϕ = ϕ_X→Y dihedral as C_X-N_Y-Cα_Y-C_Y.
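The transitional dihedrals can be computed directly from backbone coordinates. The following is a minimal sketch, assuming NumPy and Cartesian coordinates in Å; the function names are hypothetical, and the ϕ atom quadruple (C_X, N_Y, Cα_Y, C_Y) follows the standard backbone convention, which is an assumption wherever the text above is incomplete.

```python
import numpy as np

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four points."""
    p0, p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p0, p1, p2, p3))
    b0 = p0 - p1
    b1 = p2 - p1
    b2 = p3 - p2
    b1 /= np.linalg.norm(b1)
    # Components of b0 and b2 perpendicular to the central bond b1.
    v = b0 - np.dot(b0, b1) * b1
    w = b2 - np.dot(b2, b1) * b1
    x = np.dot(v, w)
    y = np.dot(np.cross(b1, v), w)
    return float(np.degrees(np.arctan2(y, x)))

def transition_dihedrals(n_x, ca_x, c_x, n_y, ca_y, c_y):
    """Return (phi, psi) of the X->Y transition from six backbone atoms."""
    psi = dihedral_deg(n_x, ca_x, c_x, n_y)   # N_X-CA_X-C_X-N_Y
    phi = dihedral_deg(c_x, n_y, ca_y, c_y)   # C_X-N_Y-CA_Y-C_Y (assumed)
    return phi, psi
```

Taking the six atoms of one residue pair as input keeps the directionality explicit: swapping X and Y selects different atoms and therefore a different (ϕ, ψ) pair.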
This naming convention is used to describe the conformation of the α-helix backbone. Figure 2a presents the distribution of all the sampled ALA→ALA transitions on the (ϕ, ψ) coordinates system. The (ϕ, ψ) distribution clearly lies along the ϕ + ψ = const diagonal, which is a result of the HBs along the α-helix backbone 14,21 and the sterically inaccessible regions near the α-helix basin 13,14. Figure 2b presents the typical distribution of all the sampled ALA→ALA transitions on the (ρ, ϑ) coordinates system. From Fig. 2b we conclude that ALA→ALA transitions found in PDB helices have an average of 3.6 [Res/Turn] and a ϑ angle of about 12° relative to the helix direction normal, in agreement with previous reports 22–25. This agreement validates the (ϕ, ψ) to (ρ, ϑ) transformation of Equation 1 as applied to the PDB data. All other transitions demonstrate distributions very similar to those of ALA→ALA, with the exception of PRO and GLY, which are discussed later in this study. The transitions differ from one another in the mean value of their distributions.

An interesting result presented in Fig. 2b is the symmetrical Gaussian-like fluctuation pattern on the alternative coordinates system. The fluctuations of the measured values result from measurement, thermal, and other sources of noise. The symmetrical fluctuation pattern allows the following conclusion: by focusing on a circular contour in Fig. 2b, the green one for instance, we deduce that the energy required to cause a shift of ~0.8 [Res/Turn] is the same as the energy required to cause a shift of ~15° of the ϑ angle. The implication of this observation is discussed further in the Heterogeneous transitions section.

Figure 2 (caption, partially recovered). (c) … on the (ϕ, ψ) coordinates system. The inset shows the linear migration towards high ϕ values and low ψ values with increasing filtering level (points in the inset stand for the overall mean value at the given filtering level). (d) The mean values of the 400 transitional pairs from B on the (ρ, ϑ) coordinates system at four filtering levels. Green contours present the hydrogen bond alignment score boundaries for S = 0, 0.2, 0.4, 0.6, and 0.8. The inset shows the migration towards low ϑ values and almost no change of ρ with increasing filtering level (points in the inset stand for the overall mean value at the given filtering level). The samples for (c) and (d) are marked according to the filtering level: green circles for Level 0, blue diamonds for Level 1, black asterisks for Level 2, and red stars for Level 3.
The mean value of each of the 400 naturally occurring transitions is displayed in Fig. 2c for each of the four filtering levels. The inset of Fig. 2c shows the mean value of all of these transitions for the given filtering level. We clearly observe the migration of the transitional (ϕ, ψ) pairs towards higher ϕ values and lower ψ values along the ϕ + ψ = const diagonal, which suggests that helices with better aligned HBs have higher ϕ values and lower ψ values. This raises the question of how far ϕ can be increased and ψ decreased along the ϕ + ψ = const diagonal while the polypeptide remains α-helical. The answer is found in Equation 3 (in Methods), which shows that the optimum of the HB alignment score lies on the line ϕ + ψ = −107.8° at (ϕ_M, ψ_M) = (−49.7°, −58.1°). Figure 2d presents the same transitions as Fig. 2c but on the (ρ, ϑ) coordinates system. A clear migration towards regions of high alignment score is observed with increasing filtering level. Furthermore, only a very weak change is observed between filtering Level 2 and Level 3, which suggests that the maximally aligned HB conformation at (ρ, ϑ) ~ (3.62 [Res/Turn], 0°) is hard to reach. In Fig. 3 we show that the conformational optimum of (ρ, ϑ) = …

It is clearly observed that GLY transitions are always the leftmost, with the lowest number of residues per turn, and PRO transitions are always at the bottom, with the lowest ϑ. GLY and PRO have a clear tendency to stay away from the overall mean helical behavior (denoted by the + symbol on every plot), possibly because of the high energetic cost of including GLY and PRO within α-helices 19,20. However, GLY and PRO differ in their local effect on helix geometry. GLY strongly decreases the number of residues per turn and keeps ϑ above the overall average, as observed in all filtering levels. PRO, on the other hand, keeps the number of residues per turn above the overall mean and strongly decreases ϑ, as observed in all filtering levels.
Interestingly, the basic AAs ARG (R) and LYS (K) demonstrate a conformational behavior that is very close to the overall mean conformation in all the filtering levels. The basic AA HIS demonstrates the highest number of residues per turn among the basic AAs and stays above the overall mean number of residues per turn in all the filtering levels. Similarly, the polar uncharged AAs SER, THR, and GLN (Q) always have fewer residues per turn than the overall mean, while the number of residues per turn for ASN (N) is always above the overall mean.
Both ILE and VAL demonstrate a strong propensity to lower ϑ. These AAs are the only residues with two carbons at the γ position, which raises the question of whether this is the reason for the observed lower ϑ angle. To test this premise we performed Conceptual Simulations 1 and 2, as described in the Methods section and depicted in Fig. 5. The results for the inflated virtual residue in Fig. 5a show that bulkiness near the helix backbone strongly decreases ϑ and also decreases ρ. Thus, we may deduce that since VAL is less bulky than ILE it decreases ϑ less, as predicted by the conceptual simulations.

Figure 4 (caption, partially recovered). Single letter amino acid (AA) codes are placed on the mean calculated conformation. Letters are colored according to the AA group: green for uncharged polar, red for acidic, blue for basic, black for hydrophobic, magenta for special. The black + symbols are placed on the mean overall conformation for every filtering level. The four filtering levels are of increasing levels of order, where Level 0 filtering is the least ordered and might even include non-helical conformations. Level 1 filtering includes only α-helical regions but with possible kinks and other types of deformations. Level 2 filtering includes a subset of conformations that must satisfy the hydrogen bond (HB) alignment criterion with a weak threshold. Level 3 filtering is the most ordered filtering criterion with a strong HB alignment threshold. MALEK are the AAs with the highest helix propensity and are closely spread around the center of each filtering level, especially in filtering Level 0 and Level 1. G and P are known for their low helix propensity and show a clear tendency to stay away from the overall mean helical behavior.
THR is a polar, uncharged AA that differs from VAL by the oxygen atom at the γ position; otherwise VAL and THR are sterically very similar and might be expected to behave similarly. As may be observed in Fig. 4, THR demonstrates a considerably lower ϑ angle than the other uncharged AAs (SER, ASN (N), and GLN (Q)) in all the filtering levels, with the exception of GLN (Q) in Level 0. This exception may arise because Level 0 may contain non-α-helical conformations, and because of the polar nature of the THR and GLN (Q) residues. Previous reports suggest that bulky residues stabilize the α-helix HBs and shield them from surrounding water molecules 23,25. Our observations of bulkiness in the vicinity of the backbone for ILE, VAL, and in some cases THR confirm that the shielding of the α-helix backbone occurs via a decrease of ϑ, since increased values of ϑ suggest the existence of nearby water molecules that destabilize the α-helical HBs 22. The exceptions of THR at the lower filtering levels might be explained by its polar nature.
The results of Conceptual Simulation 2, which focuses on the effect of increasing the distance of the virtual residue and is presented in Fig. 5b, suggest that almost no conformational changes take place when the residue is inflated up to the critical value σ_C = 6 Å. Above σ_C we observe a decrease in the number of residues per turn while ϑ remains nearly constant. By focusing on the bulkiest AAs, PHE (F), TYR (Y), and TRP (W) (in increasing order of bulkiness), in Fig. 4 we indeed observe a decrease in the number of residues per turn with increasing bulkiness, as predicted by the conceptual simulation. In addition, two mismatches are evident: (1) FYW are expected to be to the left of MALEK but are observed with an increased ρ (shifted rightwards in Fig. 4), and (2) FYW are expected to demonstrate a slight decrease of ϑ with increasing bulkiness but instead demonstrate an increase of ϑ. The reason for the observed mismatches might be the naïve nature of the conceptual simulation, which does not take into account interactions other than steric ones.
Since α-helices are not isolated structures and probably interact with the surrounding environment, we performed two additional simulations to understand the dependence of the helical structure on the surrounding forces. On the atomic scale, charged particles may induce local effects similar to those applied by an electric field 26–28. Thus, we tested the effect of an applied electric field, as illustrated in Fig. 5c. As clearly observed from the resulting plot, the electric field promotes a change of ϑ and has a negligible effect on the number of residues per turn; that is, the change in ϑ is proportional to the magnitude of the applied electric field. The small but still present change along the ρ axis is due to the unequal magnitudes of the partial charges of the carbonyl oxygens (−0.5) and amide hydrogens (+0.33). In the hypothetical case where both partial charges had the same absolute magnitude, we would expect the electric potential to show no dependence on ρ.
In the last simulation, shown in Fig. 5d, we measure the contribution of an external stretching force applied only on the Cα atoms of the helix backbone. The plot shows the contribution of the external stretching force to the total conformational energy of the α-helix backbone. As clearly observed, stretching forces change the number of residues per turn with a negligible change of ϑ; that is, the change in ρ is proportional to the magnitude of the externally applied force. The observed behavior demonstrates practically no change in ϑ, since a change in ϑ results in an energetically unfavorable misalignment of the HBs. In Conceptual Simulations 3 and 4, reversing the sign of the electric field or the direction of the applied force, respectively, results in exactly the opposite change of ϑ or ρ.

Heterogeneous transitions.
A heterogeneous transition is a transition AA1→AA2 where AA1 ≠ AA2, giving a total of 380 possible heterogeneous transitions for the 20 common AAs. Flory's isolated pair hypothesis (IPH) 29, which was shown to be generally incorrect 30–33, states that the (ϕ, ψ) pair is independent of its neighbors. Interestingly, IPH was never discussed in detail specifically for the α-helix basin, probably because of the difficulty of distinguishing between one helix and another. Our approach of analyzing conformations on the (ϑ, ρ) space allows studying the differences even between very similar α-helical conformations and will be used to test the validity of IPH for the α-helix basin. Before approaching IPH we focus on another important question that is directly related to it: can we predict the conformational behavior of a heterogeneous transition AA1→AA2 from the arithmetic mean of the homogeneous transitions AA1→AA1 and AA2→AA2? To compare a measured conformation with a predicted conformation, we calculate the energetic cost of the predicted conformation relative to the measured conformation. We deduced earlier that the energy required to shift ~0.8 [Res/Turn] is the same as the energy required to shift ~15° of the ϑ angle for the green contour presented in Fig. 2b. To find more precise values of the shift, we repeated this calculation and found that 50% of all the measured transitions are confined within (Δϑ, Δρ) ≈ (13.5°, 0.8), (10.5°, 0.6), (9.6°, 0.6), and (8.7°, 0.5) [°, Res/Turn] for the four filtering levels 0-3, respectively, with an average of (⟨Δϑ⟩, ⟨Δρ⟩) ≈ (10.6°, 0.63 [Res/Turn]). The rationale behind such averaging is to give more weight to the more aligned helical structures (higher filtering levels).
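The averaging of the four confinement values can be checked with a few lines of arithmetic. This is only a worked check of the numbers quoted above; the variable names are hypothetical.

```python
# 50% confinement widths (delta_theta [deg], delta_rho [Res/Turn]) per filtering level,
# as quoted in the text.
confinement = {
    0: (13.5, 0.8),
    1: (10.5, 0.6),
    2: (9.6, 0.6),
    3: (8.7, 0.5),
}

mean_dtheta = sum(v[0] for v in confinement.values()) / len(confinement)
mean_drho = sum(v[1] for v in confinement.values()) / len(confinement)

# Ratio of the two mean widths, in [deg * Turn / Res].
ratio_unrounded = mean_dtheta / mean_drho
ratio_from_rounded = 10.6 / 0.63
```

The unrounded means are 10.575° and 0.625 [Res/Turn]. Their ratio is about 16.9, while the rounded means (10.6°, 0.63) give about 16.8 [°Turn/Res].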
If we define a single Energy Unit (EU) as the energy required for a shift of Δϑ = 1°, and a conversion ratio K = ⟨Δϑ⟩/⟨Δρ⟩ ≈ 16.8 [°Turn/Res], we may use the following harmonic energy expression to approximate the energy of a given conformation:

$$E(\vartheta, \rho) = (\vartheta - \vartheta_0)^2 + K^2 (\rho - \rho_0)^2 \quad (2)$$

where ϑ_0 and ρ_0 represent the minimum energy conformation for a given transition. Since both ⟨Δϑ⟩ and ⟨Δρ⟩ are the average confinement values for half of the measured transitions, we deduce that the border energy of the confined transitions is E_0.5 ≈ 28 [EU], which is approximately the energy of the green contour in Fig. 2b. Table 1 presents the energy difference between the measured conformations and the predicted conformations for the heterogeneous transitions for filtering Level 1 (tables for other levels may be found in Supporting Information). The conformations were predicted by calculating the arithmetic mean of the homogeneous conformations. The energy differences were calculated using Equation 2, by defining the measured conformation as the minimum energy conformation. The results suggest that the mean energy difference over all cases is ~3 [EU], that the highest energy differences are observed for transitions from PRO (PRO→AA_X) and to PRO (AA_X→PRO), and that in most cases the transitions are asymmetric, i.e. ΔE_AA1→AA2 ≠ ΔE_AA2→AA1. Because the homogeneous average is the same for both directions, this asymmetry is enough to conclude that in most cases predicted conformations will differ from the real conformations. Nevertheless, if energy differences are tolerable, the prediction of heterogeneous conformations may sometimes be useful, especially when excluding PRO. If the tolerance is set to a very small value of 1 [EU], we find that 45% (171 transitions out of 380) of the heterogeneous transitions are predictable by homogeneous averaging, and if the tolerance is set to the half population boundary energy E_0.5 = 28 [EU], 97% (375 out of 380) of the heterogeneous transitions are predictable by homogeneous averaging.
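The energy comparison described above can be sketched as follows. The harmonic form used here, (ϑ − ϑ_0)² + K²(ρ − ρ_0)², is an assumption chosen to be consistent with the stated EU definition (1 EU for a 1° shift), the ratio K, and the 28 [EU] border energy for a half-width of about 5.3°. All function names and the example conformations are hypothetical and serve only as illustration.

```python
K = 16.8  # conversion ratio quoted in the text, [deg * Turn / Res]

def harmonic_energy(theta, rho, theta0, rho0, k=K):
    """Energy in EU of (theta, rho) relative to the minimum (theta0, rho0).

    Assumed form: (theta - theta0)^2 + (k * (rho - rho0))^2.
    """
    return (theta - theta0) ** 2 + (k * (rho - rho0)) ** 2

def homogeneous_average(conf_a, conf_b):
    """Arithmetic mean of two (theta, rho) conformations."""
    return ((conf_a[0] + conf_b[0]) / 2.0, (conf_a[1] + conf_b[1]) / 2.0)

def prediction_cost(measured, conf_a, conf_b):
    """Cost of the homogeneous-average prediction, measured conformation as minimum."""
    theta_p, rho_p = homogeneous_average(conf_a, conf_b)
    return harmonic_energy(theta_p, rho_p, measured[0], measured[1])

# Hypothetical (theta [deg], rho [Res/Turn]) mean conformations, illustration only.
aa1_aa1 = (12.0, 3.60)
aa2_aa2 = (8.0, 3.70)
measured_12 = (11.0, 3.62)   # AA1 -> AA2
measured_21 = (9.5, 3.70)    # AA2 -> AA1

cost_12 = prediction_cost(measured_12, aa1_aa1, aa2_aa2)
cost_21 = prediction_cost(measured_21, aa1_aa1, aa2_aa2)
```

Because the prediction is identical for AA1→AA2 and AA2→AA1, any difference between the two measured conformations shows up directly as an asymmetry between `cost_12` and `cost_21`.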
If IPH were absolutely correct, then we would expect all the values presented in Table 1 to equal 0. Table 1 therefore shows that IPH is incorrect when no tolerance in energy difference is allowed. However, when such a tolerance is introduced, IPH becomes 45% correct for a 1 [EU] tolerance and 97% correct for a 28 [EU] tolerance (approximately the energy of the green contour in Fig. 2), as explained above for Level 1 filtering (the percentages increase with increased filtering). In addition, the asymmetric nature of the transitions justifies the transitional analysis approach used in this study and stresses that previous efforts of analyzing α-helical conformations lack important transitional information.
A possible explanation for the observed deviation of heterogeneous transitions from their homogeneous average is the interaction between residues. Residues that do not interact chemically, sterically, or in any other way are expected to demonstrate stochastic conformational behavior, i.e. the average conformation of two non-interacting AAs is expected to equal the observed heterogeneous conformation. Thus, we can conclude that the higher the energy difference between the heterogeneous conformation and the homogeneous average, the stronger the interaction between the residues, with the exception of PRO, which may produce high deviations because of its limited degrees of freedom.

Discussion
By representing helical structures on the (ρ, ϑ) space we attribute a meaning to the 2-dimensional representation of the α-helix in terms of residues per turn (ρ) and the angle of the backbone carbonyls relative to the helix direction vector (ϑ). It was shown that a simple linear transformation allows switching between the (ϕ, ψ) and (ρ, ϑ) spaces, giving freedom of choice of the desired representation space. The transformation was validated by comparing our observations with those found in the literature. By using the new (ρ, ϑ) space we were able to deduce that: (1) 50% of the α-helix conformations found in PDB are confined on average within (⟨Δϑ⟩, ⟨Δρ⟩) ≈ (10.6°, 0.63 [Res/Turn]). (2) The energy required to shift the conformation by Δϑ = 16.8° is the same as the energy required to shift the conformation by Δρ = 1 [Res/Turn] within the α-helix basin. (3) Residues with bulkiness near the helix backbone (the case of VAL, ILE, and THR) decrease ϑ more strongly than other residues. (4) Residues with bulkiness far from the helix backbone (the case of PHE (F), TYR (Y), and TRP (W)) demonstrate a decrease in ρ with increased bulkiness. (5) An environment with charged particles affects primarily ϑ. (6) External stretching/squeezing forces affect ρ. Furthermore, it was shown that representing the helical structure on the (ρ, ϑ) space has the advantage of allowing an easy calculation of the conformational energy of any given α-helix, which is not possible on the (ϕ, ψ) space. This allowed us to approach Flory's IPH problem and to draw relevant conclusions. This study presented the α-helix basin from a novel perspective and at resolutions that were not previously available.

Methods
Conformational sweep. Conformational sweep is an important method used in this study to evaluate properties of the α-helical conformation. If P is some property of interest that is a function of the α-helical conformation, then by using a conformational sweep over a binned space ((ϕ, ψ) space, for instance) we calculate P at every binned point within the space of interest and obtain a resulting map P(space of interest).
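A conformational sweep of this kind can be sketched as a loop over a binned grid. The sketch below assumes NumPy, a 5° bin size, and the Level 0 window as the default swept region; these defaults and the function name are illustrative assumptions, not the settings used in this study.

```python
import numpy as np

def conformational_sweep(prop, phi_range=(-100.0, -20.0),
                         psi_range=(-80.0, 0.0), step=5.0):
    """Evaluate prop(phi, psi) on every binned point of a (phi, psi) window.

    Returns the phi bins, the psi bins, and the resulting map P[phi, psi].
    """
    phis = np.arange(phi_range[0], phi_range[1] + step / 2.0, step)
    psis = np.arange(psi_range[0], psi_range[1] + step / 2.0, step)
    grid = np.empty((len(phis), len(psis)))
    for i, phi in enumerate(phis):
        for j, psi in enumerate(psis):
            grid[i, j] = prop(phi, psi)
    return phis, psis, grid

# Example property: the diagonal coordinate phi + psi.
phis, psis, diagonal_map = conformational_sweep(lambda phi, psi: phi + psi)
```

Any scalar property, such as residues per turn or an alignment score, can be passed as `prop` to obtain the corresponding map.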
Scoring the α-helical hydrogen bond alignment. HBs play a key role in the formation of the α-helical shape 1,21,35–39. Thus, the alignment score of HBs is used to analytically determine the quality of a given α-helical polypeptide segment by using the scoring function S (Equation 3).

PDB data sampling at four filtering levels. PDB is a rapidly growing worldwide database of biological structures that is open to public access 43,44. It includes proteins acquired by different techniques, with X-ray and NMR structure acquisition techniques used in 99% of the cases. To date, more than 100 K empirically determined structures are available in PDB. Transitional Ramachandran (ϕ, ψ) pairs of the common 400 AA transitions found in PDB were sampled and filtered according to four levels. The Level 0 filter checks whether the (ϕ, ψ) pair is found within a predefined window where −100° < ϕ < −20° and −80° < ψ < 0°. The Level 1 filter checks whether the given residue pair is found within α-helical regions as determined within the PDB file, usually with PROMOTIF based on DSSP 37. The Level 2 filter is the custom filter based on Equation 3 that checks whether the HB alignment satisfies a weak threshold of S ≥ 0.01. The Level 3 filter is the same custom filter with a stronger threshold of S ≥ 0.5. The four filtering levels are of increasing levels of order, where Level 0 filtering is the least ordered and might even include non-helical conformations. Level 1 filtering includes only α-helical regions but with possible kinks and other types of deformations. Level 2 filtering includes a subset of conformations that must satisfy the HB alignment criterion with a weak threshold. Level 3 filtering is the most ordered filtering criterion with a strong HB alignment threshold. The transitional conformation (ϕ, ψ) pairs were sampled into 400 2D histograms.
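The four filtering levels can be summarized as a small classification routine. This is a minimal sketch that assumes the levels are nested (each level also requires the previous one) and that the helix assignment and the alignment score S are supplied by the caller; the function name and arguments are hypothetical.

```python
def filter_level(phi, psi, in_helix, score):
    """Return the highest filtering level (0-3) a transition satisfies, or None.

    phi, psi : transitional dihedrals in degrees
    in_helix : True if the PDB file annotates the residue pair as alpha-helical
    score    : HB alignment score S (computed elsewhere)
    Assumption: levels are nested, so Level 1 also requires the Level 0 window.
    """
    if not (-100.0 < phi < -20.0 and -80.0 < psi < 0.0):
        return None  # outside the Level 0 window
    level = 0
    if in_helix:
        level = 1
        if score >= 0.01:
            level = 2   # weak HB alignment threshold
        if score >= 0.5:
            level = 3   # strong HB alignment threshold
    return level
```

A transition at level n would then be counted in the histograms of all levels up to and including n.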
PDB dataset resolution and redundancy treatment. In this study we sampled all of the available conformational data found in PDB for α-helices. We did not introduce any resolution limit because of the stochastic nature of the sampled data in PDB, and because we believe that important conformational data might be found even in low resolution measurements 45. To reduce the effects of PDB redundancy we applied a logarithmic function to the resulting distributions. In addition, the distribution of every transition was normalized by its area, such that the sum over all the possible conformations for a given transition equals 1.
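The redundancy treatment can be illustrated on a single 2D histogram. The exact logarithmic function and the order of the two steps are not specified here, so the sketch below assumes log(1 + count), which keeps empty bins at zero, followed by normalization to unit sum. The function name is hypothetical.

```python
import numpy as np

def damp_and_normalize(counts):
    """Log-damp a histogram of counts and normalize it to sum to 1.

    Assumption: the damping is log(1 + count); the study's exact function may differ.
    """
    counts = np.asarray(counts, dtype=float)
    damped = np.log1p(counts)
    total = damped.sum()
    if total == 0.0:
        return damped  # empty histogram, nothing to normalize
    return damped / total

# Toy histogram: one heavily over-represented bin (illustrative values only).
hist = damp_and_normalize([[0, 10], [1000, 1]])
```

After damping, a bin sampled 1000 times outweighs a bin sampled 10 times by a factor of about 3 rather than 100, which limits the influence of repeated, near-identical structures.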
Conceptual simulation 1: residues with near bulkiness. The goal of this simulation is to demonstrate how bulky groups that are close to the α-helix backbone affect the conformation of the α-helical structure. We introduced a virtual residue at every Cα on the backbone of the base model. The virtual residue is similar to an ALA residue but is attached perpendicular to the α-helix backbone, ignoring the hydrogen at the Cα position that is found in real residues. This was done to simulate the average space covered by the many possible conformations of any given residue and to allow a conceptual analysis of what happens to the α-helix when the residue is inflated. In this simulation the virtual residue was maintained at a constant distance of 1.54 Å from the Cα while the vdW radius (σ) of the virtual residue was changed from 3.5 Å to 9.5 Å in steps of 0.5 Å. The rationale behind keeping a constant distance between the virtual residue and the Cα is to test how bulkiness at positions close to the α-helix backbone affects the conformation of the α-helix, as in the case of ILE, VAL, and THR.
Conceptual simulation 2: residues with far bulkiness. Here we repeat the simulation described in Conceptual Simulation 1 but with an increased bond length of the virtual residue. We increased the distance of the virtual residue from the Cα and maintained it at σ − 1.96 Å. The rationale behind increasing the distance between the virtual residue and the Cα is to test how bulky residues like PHE (F), TYR (Y), and TRP (W) affect the conformation of the α-helix.
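The parameter grids of Conceptual Simulations 1 and 2 follow directly from the values given above and can be enumerated as below. This only lists the (σ, distance) pairs and does not perform the simulations; the function name is hypothetical.

```python
def virtual_residue_grids():
    """Return (sigma, distance) pairs in Angstrom for simulations 1 and 2."""
    sigmas = [3.5 + 0.5 * i for i in range(13)]        # 3.5, 4.0, ..., 9.5
    near = [(s, 1.54) for s in sigmas]                 # simulation 1: fixed distance
    far = [(s, s - 1.96) for s in sigmas]              # simulation 2: sigma - 1.96
    return near, far

near_grid, far_grid = virtual_residue_grids()
```

At the smallest radius, σ = 3.5 Å, the rule σ − 1.96 Å reproduces the 1.54 Å distance of Conceptual Simulation 1, so both sweeps start from the same geometry and diverge only as the residue is inflated.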
Conceptual simulation 3: external electric field. The goal of this simulation is to demonstrate the dependence of the α-helical conformation on an external electric field with cylindrical symmetry. To achieve this goal we applied an electric field with its zero set at the center of the α-helix and with a linearly increasing potential extending outwards. The contribution of the electric potential energy was calculated as E_Electric = E_CO + E_NH, where E_CO = −0.5·D_Helix-O and E_NH = 0.33·D_Helix-H. D_Helix-O is the distance of the oxygen atom of the backbone carbonyl group from the α-helix center, and D_Helix-H is the distance of the estimated hydrogen atom of the backbone amide group from the α-helix center. The estimation of the hydrogen position assumed an N-H distance of 0.98 Å (sampled from PDB), that the backbone atoms C, N, H, and Cα lie in the same plane, and equal bend angles C-N-H = Cα-N-H. The coefficients −0.5 and 0.33 are the partial electric charges of the backbone carbonyl oxygen and of the backbone amide hydrogen, respectively, according to OPLS.
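The electric energy term can be sketched as below. The sketch assumes that the helix axis is a straight line given by a point and a direction, that the energy is summed over all supplied atoms, and that the field magnitude enters as a simple multiplicative factor; these are illustrative assumptions, and the function names and the `field_strength` parameter are hypothetical.

```python
import math

Q_O = -0.5   # partial charge of the backbone carbonyl oxygen (OPLS, from the text)
Q_H = 0.33   # partial charge of the backbone amide hydrogen (OPLS, from the text)

def distance_from_axis(atom, axis_point=(0.0, 0.0, 0.0), axis_dir=(0.0, 0.0, 1.0)):
    """Perpendicular distance of an atom from a straight helix axis."""
    norm = math.sqrt(sum(c * c for c in axis_dir))
    unit = [c / norm for c in axis_dir]
    rel = [a - p for a, p in zip(atom, axis_point)]
    along = sum(r * u for r, u in zip(rel, unit))
    perp_sq = sum(r * r for r in rel) - along * along
    return math.sqrt(max(perp_sq, 0.0))

def electric_energy(oxygens, hydrogens, field_strength=1.0):
    """E_Electric = E_CO + E_NH for a linear potential that is zero on the axis.

    field_strength is a hypothetical scale factor for the potential slope.
    """
    e_co = sum(Q_O * distance_from_axis(o) for o in oxygens)
    e_nh = sum(Q_H * distance_from_axis(h) for h in hydrogens)
    return field_strength * (e_co + e_nh)

# Toy coordinates in Angstrom (illustrative values only).
energy = electric_energy([(3.0, 4.0, 10.0)], [(0.0, 2.0, -1.0)])
```

Because the two partial charges differ in magnitude, moving an oxygen and a hydrogen outwards by the same distance does not leave the energy unchanged, which is consistent with the small residual dependence on ρ noted in the Results.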