Introduction

Nature effectively uses combinations of weak noncovalent interactions in the functional forms of various biologically important molecules such as nucleic acids and proteins1,2,3. Intermolecular noncovalent interactions of varying magnitude are also responsible for the existence of different states of matter4. Carbonyl-carbonyl (C═O···C═O) n→π* interactions where one of the lone pairs (n) on the oxygen atom of a carbonyl group is delocalized over the antibonding π* orbital of a nearby carbonyl C═O bond (π*C═O) along the Bürgi-Dunitz trajectory5 (O···C ═ O ~ 109°) have attracted a great deal of attention in recent years6,7,8,9,10,11. Previous studies have shown that C═O···C═O n→π* interactions not only influence geometries of important small molecules12,13,14,15 but also play crucial roles in determining the three dimensional structures of polyesters16, peptides17, peptoids18,19,20,21 and proteins22,23,24,25. C═O···C═O interactions between the side-chain and backbone carbonyl groups of Asp, Asn, Glu, and Gln were also observed in the high-resolution crystal structures of proteins26, 27. C═O···C═O n→π* interaction is characterized by a short O···C═O distance (d) of less than 3.22 Å [the sum of van der Waals radii of carbon and oxygen atom28], bond angle O···C═O (θ) of ~109° and the pyramidality (Δ, Θ) of the acceptor carbon atom towards the donor oxygen atom9, 14, 17, 25, 29. Direct spectroscopic evidence for n→π* interaction was recently reported by using gas-phase infrared spectroscopy30.
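The geometric criteria above can be illustrated with a minimal sketch that computes the O···C distance (d) and the O···C═O angle (θ) from Cartesian coordinates. The function names and the toy coordinates are hypothetical and are not taken from this work; only the 3.22 Å cutoff comes from the text.

```python
import math

VDW_SUM_C_O = 3.22  # Å, sum of van der Waals radii of C and O quoted in the text


def _sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return tuple(x - y for x, y in zip(a, b))


def _norm(v):
    """Euclidean length of a 3D vector."""
    return math.sqrt(sum(x * x for x in v))


def contact_geometry(o_donor, c_acc, o_acc):
    """Return (d, theta): the O...C distance in Å and the O...C=O angle in degrees.

    o_donor: donor carbonyl oxygen; c_acc / o_acc: acceptor carbonyl C and O.
    """
    v_don = _sub(o_donor, c_acc)  # acceptor C -> donor O
    v_co = _sub(o_acc, c_acc)     # acceptor C -> acceptor O
    d = _norm(v_don)
    cos_t = sum(a * b for a, b in zip(v_don, v_co)) / (d * _norm(v_co))
    cos_t = max(-1.0, min(1.0, cos_t))  # guard against rounding outside [-1, 1]
    return d, math.degrees(math.acos(cos_t))


def is_short_contact(d, cutoff=VDW_SUM_C_O):
    """True when the O...C distance is below the van der Waals sum."""
    return d < cutoff


# Toy example (hypothetical coordinates, in Å)
d, theta = contact_geometry((0.0, 0.0, 3.0), (0.0, 0.0, 0.0), (1.2, 0.0, 0.0))
```

In this toy geometry the donor oxygen sits directly above the acceptor carbon, so θ is 90° rather than the ~109° expected along the Bürgi-Dunitz trajectory.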

We anticipated that due to n→π* interaction both donor and acceptor C═O bonds will be polarized, which will make the acceptor carbonyl oxygen atom a better electron donor and the donor carbonyl carbon atom a better electron acceptor. The acceptor carbonyl oxygen, therefore, can donate electrons to another nearby carbonyl carbon to form a sequential chain of O···C contacts (Fig. 1a), or it can donate electrons back to the original donor carbonyl carbon atom, forming “reciprocal” n→π* interactions (Fig. 1b). Although sequential n→π* interactions were previously observed in poly(lactic acid)16 and proteins22, reciprocal n→π* interactions have remained unexplored. Allen and coworkers reported anti-parallel arrangements of carbonyl groups in ketone dimers that were bound together by two intermolecular C═O···C═O short contacts of dipolar nature31. Maccallum et al. reported a similar geometrical arrangement of carbonyl groups in right-twisted β-strands and observed two chemically distinct dipolar C═O···C═O short contacts32. However, these C═O···C═O short contacts were considerably longer than the sum of van der Waals radii of C and O atoms.

Fig. 1

Schematic illustration of a one-sided and b reciprocal n→π* interactions. Curved dotted arrows indicate n→π* interactions

In this paper, we hypothesized that the polarization of the carbonyl groups by n→π* interactions should lead to back and forth donations between the carbonyl pairs. Based on our hypothesis, we discovered the presence of “reciprocal C═O···C═O interactions” both in small molecules and proteins. To establish the existence of reciprocal C═O···C═O interactions, we designed and synthesized model compounds and carried out X-ray crystallographic and theoretical studies. Further, we carried out Cambridge Structural Database (CSD)33 and Protein Data Bank (PDB)34 analyses to show that these interactions are widely present in small molecules and proteins. In proteins, these interactions are primarily found in random coils and turn regions. Based on our observations we propose that reciprocal C═O···C═O interactions may be a key local interaction that restricts the number of conformers of unfolded proteins and may have a role in protein folding.

Results

Reciprocal carbonyl-carbonyl interactions in N,N′-diacylhydrazines

To test our hypothesis of reciprocal n→π* interactions, we synthesized N,Nʹ-diacylhydrazines 1–8 having various substituents on either side of the carbonyl groups (Fig. 2a). In N,Nʹ-diacylhydrazines 1–8, the two amide carbonyl groups [CO-I and CO-II; Fig. 2b] are separated by three covalent bonds and 1,5-type n→π* interactions are feasible from both sides. We propose that due to the repulsion between the nitrogen lone pairs, the N,Nʹ-diacylhydrazines should be nonplanar with the carbonyl groups oriented favorably for reciprocal n→π* interactions. Incorporation of electron donating and withdrawing substituents near the carbonyl groups in 1–8 should help us to tune these interactions.

Fig. 2

Model compounds synthesized to study reciprocal n→π* interactions. a Chemical structures of N,Nʹ-diacylhydrazines (1–8). b Definition of different structural parameters in N,Nʹ-diacylhydrazines 1–8; d1 = O1···C2; d2 = O2···C1; θ 1 = O1···C2═O2; θ 2 = O2···C1═O1. c NBO orbital overlap between oxygen lone pair (n O) of CO-I and π*C═O orbital of CO-II of compound 6. d NBO orbital overlap between oxygen lone pair (n O) of CO-II and π*C═O orbital of CO-I of compound 6. e Plot showing correlation between O···C distances (d1 and d2) in compounds 1–8 [Linear fitting; Pearson correlation coefficient = 0.9906]. f Plot showing correlation between reciprocal n→π* interaction energies [E 1 (n→π*) and E 2 (n→π*)] in compounds 1–8 [Linear fitting; Pearson correlation coefficient = 0.938]. Curved dotted arrows indicate n→π* interactions

As anticipated, the N,Nʹ-diacylhydrazines (1–8) crystallized in nonplanar form with the carbonyl groups oriented almost orthogonal to each other (C═O···C═O dihedral angle = +70° to +85° or −70° to −85°) (Supplementary Table 1 and Supplementary Fig. 1). We observed that in compounds 1–3, which lack any strong electron donating or withdrawing substituent near the carbonyl groups (i.e., X = CH3, CH3CH2; Y = H, CH3), the two carbonyl groups stay far apart. The crystallographic distances d1 and d2 (Fig. 2b) in 1–3 are longer than 3.22 Å (Table 1) and natural bond orbital (NBO)35 calculations carried out on the high-resolution crystal geometries of 1–3 (Supplementary Table 2) show no evidence of n→π* interaction (Table 1 and Supplementary Table 3). In compound 4, when X is an electron withdrawing group (e.g., X = CH2Cl), an increase in the acceptor ability of the carbonyl CO-II is expected, which should increase n→π* interaction from the oxygen atom of CO-I to the π*C═O orbital of CO-II. However, the inductive electron withdrawal of Cl can be negated by the electron donation from the Cl lone pairs into the antibonding orbitals (σ* and π*) of the adjacent carbonyl group (CO-II). In compound 4, we observed such electron delocalizations from the Cl lone pairs to both σ* and π* orbitals of the C═O bond of the CO-II group, which contributed 0.67 kcal mol−1 to the stabilization (Supplementary Table 4). Such electron donation should enhance the donor ability of CO-II in 4. A short crystallographic O2···C1 distance (d2 = 3.037 Å, shorter than the sum of van der Waals radii of C and O) and the presence of NBO second order perturbation energy [E 2 (n→π*) = 0.34 kcal mol−1] for n→π* interaction from CO-II to CO-I in 4 support this assumption (Table 1). We anticipate that due to this electron donation from CO-II to CO-I, the CO-I group in 4 will be polarized and the carbonyl oxygen of CO-I will become a better donor. 
We clearly observed back donation of electrons from CO-I to CO-II in 4 as evidenced by a short crystallographic O1···C2 distance d1 of 3.103 Å and n→π* interaction energy [E 1 (n→π*)] of 0.10 kcal mol−1 obtained by NBO analysis. This is in accordance with our hypothesis that donation from CO-II to CO-I increases the donor ability of CO-I and acceptor ability of CO-II thereby inducing a back donation of electrons from CO-I to CO-II. Similarly, when Y is an electron donating group (e.g., Y = OCH3), CO-I is expected to be a better donor and, accordingly, we observed short distance d1 in compound 5 (Table 1). We also observed back donation from CO-II to CO-I in 5 (Table 1).

Table 1 X-ray crystallographic structural parameters and NBO data for compounds 1–8

Among the synthetic compounds 1–8, significantly shorter d1 and d2 are observed in 6–8 where electron donating or withdrawing groups are present on both sides of the carbonyl groups. The second order perturbation energies obtained by NBO calculations are also relatively higher for these compounds (Table 1). The NBO orbital overlaps between the oxygen lone pairs (n O) and π*C═O orbitals in compound 6 are shown in Figs. 2c, d. Note that in compounds 4 and 6–8 where X = CH2Cl or CH2Br, d2 is shorter than d1 and stronger n→π* interactions from CO-II to CO-I are observed by NBO analysis. This is due to the electron donation from the Cl or Br atom to the σ*C═O and π*C═O orbitals of CO-II, which increases the donor ability of the CO-II oxygen atom (Supplementary Table 4). Such electron donations from α-halogens to carbonyl groups and their effect on n→π* interactions were previously reported in the literature36, 37.

Interestingly, the values of the O···C═O angles θ 1 and θ 2 are much smaller (~82°) in compounds 4–8, where reciprocal n→π* interactions are observed, than in 1–3, which lack n→π* interactions (Table 1). In fact, the values of θ 1 and θ 2 are much smaller than what is expected for one-sided n→π* interactions (O···C═O ~ 109°) reported previously6. This may be due to the geometrical arrangement required for reciprocal n→π* interactions, which forces θ 1 and θ 2 away from the Bürgi-Dunitz trajectory.

Another important signature of n→π* interactions is the pyramidality of the acceptor carbonyl carbon atom measured by parameters Δ and Θ9, 14, 17, 25, 29. Positive values of Δ and Θ indicate pyramidalization of the acceptor carbonyl carbon towards the donor oxygen atom whereas negative values of Δ and Θ indicate pyramidalization of the acceptor carbon away from the donor oxygen atom. In compounds 1–8, however, we have not observed a correlation of pyramidality (Θ) with O···C distance and the strength of n→π* interactions. One reason for this could be the stronger donation from the α-halogen atoms to the nearby carbonyl, which would force the acceptor carbonyl carbons towards the halogen atoms away from the donor oxygen atoms. Also, the crystal packing forces may have some influence on the observed geometries and the pyramidalization of the two nitrogen atoms between the carbonyl groups may influence the pyramidalization of the acceptor carbonyl carbons. Moreover, the individual n→π* interactions in compounds 1–8 may not be strong enough to exert a significant effect on pyramidalization of the carbonyl carbons.
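The sign convention for Δ described above can be sketched as follows. This section does not define Δ explicitly; the sketch assumes the definition commonly used in the cited literature, namely the displacement of the acceptor carbonyl carbon from the plane defined by its three substituents, taken as positive when the carbon is displaced towards the donor oxygen. All function names and coordinates are hypothetical.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def pyramidality_delta(c_acc, sub1, sub2, sub3, o_donor):
    """Signed out-of-plane displacement (Å) of the acceptor carbonyl carbon.

    ASSUMPTION: Delta is the distance of c_acc from the plane through its
    three substituents (sub1, sub2, sub3); positive means the carbon lies on
    the same side of that plane as the donor oxygen.
    """
    n = _cross(_sub(sub2, sub1), _sub(sub3, sub1))  # plane normal
    length = math.sqrt(_dot(n, n))
    n = tuple(x / length for x in n)
    # Orient the normal so that it points to the donor-oxygen side
    if _dot(_sub(o_donor, sub1), n) < 0:
        n = tuple(-x for x in n)
    return _dot(_sub(c_acc, sub1), n)


# Toy example: substituents in the z = 0 plane, carbon lifted by 0.05 Å
subs = [(1.0, 0.0, 0.0), (-0.5, 0.866, 0.0), (-0.5, -0.866, 0.0)]
delta_towards = pyramidality_delta((0.0, 0.0, 0.05), *subs, o_donor=(0.0, 0.0, 3.0))
delta_away = pyramidality_delta((0.0, 0.0, 0.05), *subs, o_donor=(0.0, 0.0, -3.0))
```

With the donor above the plane the displacement is positive, and with the donor below it the same geometry gives a negative value, matching the sign convention in the text.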

Overall, these data suggest that, in compounds 1–8, the geometrical constraints imposed by the repulsion between the nitrogen lone pairs orient the two carbonyl groups favorably for reciprocal n→π* interactions. We could tune these interactions by introducing electron donating or withdrawing substituents near the carbonyl groups. Interestingly, we observed that an increase in n→π* interaction from one side also leads to an increase in the n→π* interaction from the other side in compounds 1–8. This correlation suggests that n→π* interactions in these compounds could be synergistic (Figs. 2e, f). For example, shorter d1 and higher E 1 (n→π*) values are observed in 4 compared to 3 although 3 and 4 have the same substituent (4–CH3–Ph) attached to CO-I. Similarly, higher donation from CO-I to CO-II is observed in 6 compared to 5 although 5 and 6 have the same substituent (4–OCH3–Ph) attached to CO-I.

To find out if geometry optimization has any effect on the computed n→π* interactions in comparison to the unrelaxed X-ray geometries, we also carried out geometry optimizations of compounds 1–8 by freezing the dihedral angles of the side chains involved in reciprocal interactions to their X-ray values and freely optimizing the remaining degrees of freedom (bond lengths, angles, and dihedrals) (Supplementary Fig. 2). We observed that reciprocal n→π* interactions were retained after geometry optimizations but they became slightly weaker than those observed from the NBO calculations on the crystal geometries (Supplementary Table 5). The coordinates of the optimized geometries of 1–8 are provided in Supplementary Data 1. We also observed that, during gas phase geometry optimization, in the absence of any packing and intermolecular forces that are present in the X-ray geometries, the Cl or Br atoms attached to the methylene carbons in 4 and 6–8 moved to an anti-periplanar geometry (trans) with respect to the oxygen atom of the nearby carbonyl group (CO-II). This is probably due to higher hyperconjugative delocalization between the halogen lone pairs and carbonyl π* orbital in the anti-periplanar geometry that would provide more stability to the isolated gas phase molecule. Note that such elongation of carbonyl-carbonyl (O···C) short contacts (weakening of n→π* interactions) in gas phase optimized geometries relative to the X-ray geometries is well known9, 13, 14.

Reciprocal carbonyl-carbonyl interactions in small organic molecules

To probe whether intramolecular reciprocal C═O···C═O interactions are also present in other small molecules we carried out a CSD search. In our search, we looked for organic molecules having at least two carbonyl groups with intramolecular O2···C5 (d1) and O6···C1 (d2) distances ≤ 3.2 Å (Supplementary Fig. 3). The search was carried out for cases where the two carbonyl groups are separated by at least three covalent bonds (1,5-type interaction). No restriction was imposed on the O···C = O angles (θ 1 and θ 2) during the search. The CSD search provided 1432 molecules which fulfilled our search criteria (Supplementary Table 6).

The plots showing the distribution of O···C distances (d1 and d2) and O···C═O angles (θ 1 and θ 2) of all the molecules obtained from the CSD search are shown in Figs. 3a, b, respectively. As can be seen from Fig. 3a, in most cases d1 and d2 fall in the 2.90–3.20 Å range, indicating that reciprocal interactions are in general weak. The values of θ 1 and θ 2 are mainly concentrated in the 70–100° range with the majority of the molecules having θ 1 and θ 2 in the range 80–90°. Interestingly, we also observed similar values for O···C distances (d1 and d2) and O···C═O angles (θ 1 and θ 2) in compounds 4–8 that showed reciprocal n→π* interactions. Therefore, it is quite clear that the O···C═O (θ) angle deviates significantly from the Bürgi-Dunitz trajectory in reciprocal C═O···C═O short contacts. The d1 vs. θ 1 and d2 vs. θ 2 plots (Figs. 3c, d, respectively) show that when the angle of approach of donor oxygen atoms to the acceptor carbonyl C═O bonds deviates from the Bürgi-Dunitz trajectory, the O···C distances (d1 and d2) increase, suggesting weakening of interactions. NBO analyses of crystal geometries of 30 randomly chosen molecules (Supplementary Fig. 4) having d1 and d2 ≤ 3.20 Å and covering the range of observed O···C═O angle (θ) values (70–100°) showed the presence of reciprocal interactions in them (Table 2). The NBO orbital overlaps between the oxygen lone pairs (n O) and π*C═O orbitals in one such molecule (Fig. 3e) (CCDC ref. code: JUHQEK) are shown in Figs. 3f, g.

Fig. 3

X-ray crystallographic data and NBO overlap diagrams for CSD molecules. a Plot showing the distribution of O···C distances (d1 and d2) in molecules obtained from the CSD search. b Plot showing the distribution of O···C = O angles (θ 1 and θ 2) in molecules obtained from the CSD search. c Plot of distance d1 vs. angle θ 1 in molecules obtained from the CSD search. d Plot of distance d2 vs. angle θ 2 in molecules obtained from the CSD search. e Chemical structure of a molecule (CCDC reference code: JUHQEK) obtained from the CSD search. The amide carbonyl group is taken as CO-I and the ester carbonyl group is taken as CO-II here. f NBO orbital overlap between oxygen lone pair (n O) of CO-I and π*C=O orbital of CO-II of JUHQEK. g NBO orbital overlap between oxygen lone pair (n O) of CO-II and π*C=O orbital of CO-I of JUHQEK. [d1 = O2···C5; d2 = O6···C1; θ 1 = O2···C5 = O6; θ 2 = O6···C1 = O2 (Supplementary Fig. 3)]. Curved dotted arrows indicate n→π* interactions

Table 2 X-ray crystallographic structural and NBO data of CSD molecules

In most of the molecules obtained from the CSD search, reciprocal C═O···C═O interactions were stabilized by both n→π* and π→π* interactions between the carbonyl groups (Table 2 and Supplementary Table 7). We observed substantial C═O···C═O π→π* interactions in molecules having θ 1 and θ 2 values > 90° (Supplementary Table 7). In some cases, π→π* interactions are even stronger than n→π* interactions. When θ 1 and θ 2 values were <90°, π→π* interactions were observed for molecules having relatively shorter O···C distances (both d1 and d2 <2.90 Å) and stronger n→π* interactions. We propose that although the contribution of individual orbital interaction is small, the overall contribution of two n→π* and two π→π* interactions to the stabilization of molecules having reciprocal C═O···C═O interactions could be significant. Based on the NBO calculations at B3LYP/6-311 + G(2d,p) level, we observed that reciprocal C═O···C═O interactions contribute 0.11–3.37 kcal mol−1 (with an average value of 0.98 kcal mol−1) to the stabilization of small molecules (see the last column in Supplementary Table 7).

We observed positive values of Δ and Θ for the carbonyl carbons in most of the molecules from the CSD listed in Table 2, which indicate their pyramidalization towards the donor oxygen atoms. The plots of Θ with O···C distances and the strength of the reciprocal interactions in compounds obtained from the CSD search are shown in Supplementary Fig. 5. Although the correlation between pyramidality of second carbonyl (CO-II) carbon (Θ2) and d1 looks better than the correlation between pyramidality of first carbonyl (CO-I) carbon (Θ1) and d2, the CO-I and CO-II are chosen completely randomly in these molecules. As the pyramidalization also depends on other factors like θ and the elasticity of the carbonyl group, a strong correlation between pyramidalization and the O···C distance and strength of n→π* interactions may not be observed in these molecules having different types of carbonyl groups as well as different θ values.

To gain insight into the structures of the small molecules having reciprocal C═O···C═O interactions, we manually analyzed small molecules from the CSD having 1,5-type reciprocal interactions with both d1 and d2 ≤ 3.00 Å. A total of 249 molecules fulfill these criteria [1,5-interaction; both d1 and d2 ≤ 3.00 Å]. As can be anticipated, the nature of the two atoms/groups between the interacting carbonyl groups plays a key role in keeping the two carbonyl groups non-coplanar and provides them the conformation required for reciprocal interactions (Supplementary Table 8). Interestingly, the largest group of these molecules (117, ~47%) have one heteroatom and one chiral carbon between the two interacting carbonyl pairs, a feature that resembles peptides and proteins.

Reciprocal carbonyl-carbonyl interactions in proteins

The presence of reciprocal C═O···C═O interactions in the X-ray crystal geometries of small organic molecules inspired us to look for their presence in protein crystal structures. To probe the presence of reciprocal C═O···C═O interactions in proteins, we analyzed a total of 2269 protein crystal structures with resolution ≤ 1.6 Å from the PDB with redundancy (pairwise sequence identity) less than 10%, out of which 2184 showed the presence of reciprocal interactions. The PDB protein structures ranked by the number of reciprocal C═O···C═O interactions present in them are included in Supplementary Data 2. For the PDB search, the distance between the carbonyl oxygen of the ith amino acid residue and the carbonyl carbon of the (i + 1)th amino acid residue is defined as d1. The distance between the carbonyl oxygen of the (i + 1)th residue and the carbonyl carbon of the ith residue is defined as d2. The corresponding O···C═O angles are defined as θ 1 and θ 2, respectively (Supplementary Fig. 6). During the search, both d1 and d2 were kept ≤ 3.20 Å but no restriction was imposed on θ 1 and θ 2. The plot of d1 and d2 values obtained from the search shows that most of them fall in the 2.90–3.20 Å range (Fig. 4a). The angles θ 1 and θ 2 (~85 ± 15°) deviate significantly from the Bürgi-Dunitz trajectory (Fig. 4b). These observations are consistent with the trend that was observed for small molecules discussed above. Analyses of d1 and d2 for all amino acid residues in all proteins (2184) studied here show that shorter distances d1 and d2 ≤ 3.2 Å fall within the tail of the full distribution (Supplementary Fig. 7).
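The search criterion defined above can be sketched as a scan over consecutive residues. This is only an illustration of the d1/d2 definition and the 3.20 Å cutoff from the text; the data layout (a list of dictionaries holding backbone carbonyl C and O coordinates) and the function name are hypothetical and do not represent the actual search procedure used here.

```python
import math


def reciprocal_pairs(residues, cutoff=3.20):
    """Return (i, d1, d2) for consecutive residue pairs with d1 and d2 <= cutoff.

    residues: ordered list of dicts with keys "C" and "O" holding the
    backbone carbonyl carbon and oxygen coordinates (Å) of each residue.
    d1 = O(i)...C(i+1); d2 = O(i+1)...C(i). No restriction is placed on the
    O...C=O angles, as in the text.
    """
    hits = []
    for i in range(len(residues) - 1):
        res_i, res_next = residues[i], residues[i + 1]
        d1 = math.dist(res_i["O"], res_next["C"])
        d2 = math.dist(res_next["O"], res_i["C"])
        if d1 <= cutoff and d2 <= cutoff:
            hits.append((i, d1, d2))
    return hits


# Toy chain (hypothetical coordinates): residues 0 and 1 form a reciprocal
# contact, residue 2 is far away from both.
chain = [
    {"C": (0.0, 0.0, 0.0), "O": (1.2, 0.0, 0.0)},
    {"C": (1.2, 0.0, 3.0), "O": (0.0, 0.0, 3.0)},
    {"C": (10.0, 10.0, 10.0), "O": (11.2, 10.0, 10.0)},
]
hits = reciprocal_pairs(chain)
```

A real analysis would additionally need to parse coordinates from PDB files and handle chain breaks and alternate conformations, which this sketch omits.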

Fig. 4

X-ray crystallographic data and NBO overlap diagrams for amino acid pairs. a Plot showing the distribution of O···C distances (d1 and d2) in amino acid pairs in proteins having reciprocal C═O···C═O interactions. b Plot showing the distribution of O···C═O angles θ 1 and θ 2 in amino acid pairs in proteins having reciprocal C═O···C═O interactions. The vertical and horizontal blue lines are drawn at θ 1 = 99° and θ 2 = 99°, respectively. c NBO orbital overlap between oxygen lone pair (n O) of CO-I and π*C═O orbital of CO-II of Leu-Pro (141–142) [PDB: 2x5o]. d NBO orbital overlap between oxygen lone pair (n O) of CO-II and π*C═O orbital of CO-I of Leu-Pro (141–142) [PDB: 2x5o]. e NBO orbital overlap between the π orbital of the C═O bond of CO-I and π*C═O orbital of CO-II of Leu-Pro (141–142) [PDB: 2x5o]. f NBO orbital overlap between the π orbital of the C═O bond of CO-II and π*C═O orbital of CO-I of Leu-Pro (141–142) [PDB: 2x5o]

In a previous study22, Bartlett et al. reported one-sided n→π* interactions with d ≤ 3.20 Å and 99° ≤ θ ≤ 119°. As we have applied the same distance (d ≤ 3.20 Å) and resolution (<1.6 Å) criteria, the reciprocal interactions observed here for angles 99° ≤ θ 1, θ 2 ≤ 119° would be observed as one-sided n→π* interactions by using the criteria of Bartlett et al. As can be seen from Fig. 4b, the distribution of θ 1 and θ 2 in the range of 99°–119° (regions II, III, and IV) is a very small percentage (6.5%) of the total number of reciprocal C═O···C═O interactions that are being reported here. This indicates that reciprocal C═O···C═O interactions are novel and distinct from one-sided n→π* interactions reported previously.
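The comparison with the one-sided criteria can be illustrated by sorting (θ 1, θ 2) pairs according to whether each angle falls in the 99°–119° window. The sketch below assumes that a pair would be picked up by the earlier criteria when at least one of its two angles lies in that window; this reading, the category labels, and the example angles are assumptions made for illustration and are not values from this study.

```python
def in_window(theta, lo=99.0, hi=119.0):
    """True when an O...C=O angle (degrees) lies in the 99-119 degree window."""
    return lo <= theta <= hi


def classify(theta1, theta2):
    """Label a reciprocal contact by which of its angles fall in the window."""
    first, second = in_window(theta1), in_window(theta2)
    if first and second:
        return "both"
    if first:
        return "theta1_only"
    if second:
        return "theta2_only"
    return "neither"


def fraction_in_window(pairs):
    """Fraction of (theta1, theta2) pairs with at least one angle in the window."""
    if not pairs:
        return 0.0
    hits = sum(1 for t1, t2 in pairs if classify(t1, t2) != "neither")
    return hits / len(pairs)


# Hypothetical angle pairs in degrees, for illustration only
example_pairs = [(85.0, 85.0), (105.0, 85.0), (80.0, 90.0), (82.0, 100.0)]
share = fraction_in_window(example_pairs)
```

The labels deliberately avoid the region numbering of Fig. 4b, since the mapping between regions and angle ranges is not spelled out in the text.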

NBO analysis of 30 amino acid pairs (Supplementary Fig. 8) with short O···C distances (both d1 and d2 ≤ 3.20 Å) that cover the complete range of observed O···C═O angles (θ) (70–110°) clearly showed the presence of reciprocal n→π* interactions (Table 3, Figs. 4c, d). As in the CSD molecules, in proteins we also observed substantial C═O···C═O π→π* interactions between the amino acid pairs having θ 1 and θ 2 values > 90° (Supplementary Table 9). The π→π* NBO orbital overlap between the two carbonyl groups in an amino acid pair is shown in Figs. 4e, f [Leu-Pro (141–142); PDB: 2x5o]. For amino acid pairs having relatively stronger n→π* interactions (both d1 and d2 < 2.90 Å), π→π* interactions were also observed for θ 1 and θ 2 values < 90° (Table 3). This indicates that the overall contribution of reciprocal interactions (two n→π* and two π→π* interactions) to protein stabilization could be substantial. Based on the NBO calculations at the B3LYP/6-311 + G(2d,p) level, we observed that reciprocal C═O···C═O interactions contribute 0.27–4.41 kcal mol−1 (with an average value of 1.34 kcal mol−1) to the stabilization of proteins per amino acid pair (see the last column in Supplementary Table 9).

Table 3 X-ray crystallographic structural and NBO data for amino acid pairs from the PDB

The plot of torsion angles (φ, ψ) (Supplementary Fig. 6) of the residue between the two interacting carbonyl groups along with other residues in the proteins shows that the reciprocal interactions are mainly concentrated in the polyproline II (PPII), β-turn and right-twisted β-strand regions (Fig. 5a). Unlike the one-sided n→π* interactions reported previously22, 23 that are abundant in proteins, the abundance of these newly discovered reciprocal C═O···C═O interactions is low (~7.2%). Secondary structure analyses using Stride38 show that reciprocal C═O···C═O interactions have considerable abundance in random coils (~20%) and turn regions (10%) of proteins but negligible presence in α-helices (0.35%) (Table 4). This is in contrast to the one-sided n→π* interactions that are most abundant in α-helices22, 23. As the PPII helix is not included as an independent secondary structure in most secondary structure prediction programs, many PPII helices remain unassigned even though they are present in the experimentally solved structures. We observed that the coil regions having reciprocal C═O···C═O interactions are dominated by PPII structures [(φ, ψ) : (−75°, 145°)]. We have confirmed this by plotting the φ, ψ angles of residues in the random coil regions having reciprocal interactions (Fig. 5b). This is not surprising given that PPII conformations are known to dominate coil regions of folded proteins39.

Fig. 5

Ramachandran plots and analyses of reciprocal interactions in proteins. a Ramachandran plot generated by plotting torsion angles (φ, ψ) of all residues in 2184 protein structures (blue) and torsion angles (φ, ψ) of the residue between the two interacting carbonyl groups involved in reciprocal C═O···C═O interactions (yellow). b Ramachandran plot generated by plotting torsion angles (φ, ψ) of all residues in 2184 protein structures (blue) and torsion angles (φ, ψ) of the residue between the two interacting carbonyl groups involved in reciprocal C═O···C═O interactions present only in the coil regions (yellow). c Plot showing percentage distribution of amino acids involved in reciprocal C═O···C═O interactions. d Plot showing percentage distribution of amino acid pairs involved in reciprocal C═O···C═O interactions

Table 4 Distribution of reciprocal carbonyl-carbonyl interactions in various secondary structures

We also manually analyzed 789 reciprocal C═O···C═O interactions in 10 proteins having the highest numbers of reciprocal C═O···C═O interactions (Supplementary Table 10). In agreement with Stride prediction, manual inspection revealed that reciprocal C═O···C═O interactions are mostly present in coil/PPII and turn regions of these proteins. α-helices that have reciprocal C═O···C═O interactions are distorted, while the β-sheets having reciprocal n→π* interactions are twisted (Fig. 6). We also observed reciprocal C═O···C═O interactions between amino acid pairs at the interfaces of different secondary structure types (Fig. 6e and Supplementary Table 11).

Fig. 6

Reciprocal carbonyl-carbonyl interactions in various secondary structures. a PPII-helix; b β-turn; c Right-twisted β-strand; d α-helix; e interface of α-helix and β-sheet. The figures were generated using PyMOL

The other secondary structure that has significant abundance of reciprocal C═O···C═O interactions is the β-turn. In a β-turn, the peptide groups (NH and C═O) of the central two amino acids do not participate in any inter-residue hydrogen bonding. Therefore, we assume that these residues may participate in local reciprocal C═O···C═O interactions either between themselves or with their other neighbors, which should compensate for the lack of hydrogen bonding interactions in them. A careful examination of the orientations of the carbonyl groups in various common β-turns indicated that reciprocal C═O···C═O interaction may be feasible between the first and the second residues of type Iʹ and II β-turns due to the favorable orientations of the two carbonyl groups but is likely to be unfavoured in type I and IIʹ β-turns. In fact, analysis of the 10 protein crystal structures discussed above shows that most of the reciprocal C═O···C═O interaction pairs found in β-turns were in type II turns, followed by type IV (Supplementary Table 12). In 41 cases, the reciprocal n→π* interactions were present between the first and the second amino acid residues, while in the other 43 cases they were between the third and the fourth amino acid residues of β-turns in these 10 proteins. However, in no case were the second and the third residues of the β-turn involved in reciprocal C═O···C═O interactions with each other (Supplementary Fig. 9).

Analysis of the distribution of reciprocal C═O···C═O interactions among various amino acids suggests that proline is involved in the largest number of reciprocal C═O···C═O interactions in various proteins, followed by glutamic acid and serine (Fig. 5c). This trend is different from what was previously observed for one-sided n→π* interactions in α-helices and β-sheets22 (Pro > Gly > Ala). Analysis of the distribution of reciprocal C═O···C═O interactions among the amino acid pairs in various proteins reveals that Pro–Pro is the most abundant pair (Fig. 5d). The 10 most prominent amino acid pairs that participate in reciprocal C═O···C═O interactions all contain a proline residue (Fig. 5d). These results may be expected given the abundance of reciprocal interactions in PPII regions.

Possible role of reciprocal carbonyl-carbonyl interactions in protein folding

PPII helices and turns are the major secondary structures where reciprocal C═O···C═O interactions are observed. PPII is the major well-defined backbone structure present in denatured, unfolded, and natively unfolded proteins40 and random coil regions of folded proteins39. As PPII lacks stable non-local amino acid interactions such as hydrogen bonding, we propose that local reciprocal interactions could possibly contribute to its stability. Levinthal proposed that protein folding is accelerated and guided by the rapid formation of local interactions in the unfolded state, which then determine the further folding of the peptide41. Because local reciprocal interactions could contribute to the stabilization of PPII conformations, which are abundant in unfolded proteins, they could play a role in protein folding. Also, turn regions that are stabilized by reciprocal interactions are known to act as nucleation sites for protein folding. Therefore, an open question is how important such reciprocal interactions might be for protein folding.

Nature of reciprocal carbonyl-carbonyl interactions

The nature of C═O···C═O interactions has been debated in the literature. While some consider them n→π* orbital interactions9, 11, others believe them to be dipolar in nature7, 8, 10. We have so far discussed reciprocal C═O···C═O interactions as n→π* and π→π* orbital interactions for the following reasons. Firstly, the plots of the n→π* and sum of n→π* and π→π* orbital interaction energies against the O···C distances (d) show a strong correlation (Figs. 7a, b). In Fig. 7a, we have plotted the distances (d1 and d2 values) against the stabilization energies due to n→π* interactions [NBO second order perturbation energies E 1 (n→π*) and E 2 (n→π*)] reported in Tables 1–3. The plot suggests that the stabilization energies E (n→π*) for n→π* interactions decrease with an increase in the O···C distance (d) in synthetic molecules 1–8, molecules taken from the CSD and interacting amino acid pairs obtained from the PDB (Tables 1–3). The overall orbital interaction energies (sum of n→π* [E (n→π*)] and π→π* [E (π→π*)] interaction energies reported in Tables 1–3) plotted in Fig. 7b also show a similar correlation with O···C (d) distances. These correlations indicate that orbital interaction is the major mechanism for the stabilization of these reciprocal C═O···C═O short contacts. Secondly, we carried out NBO deletion analysis on all the molecules reported in Tables 1–3 (Supplementary Table 13) and observed that deletion of n→π* interactions increases charge on the donor oxygen lone pair (n O) and depletes it on the acceptor carbonyl π* C═O orbital; these changes correlate well with the O···C distances (Supplementary Fig. 10a, b). Similarly, deletion of π→π* interactions increases charge on the π C═O orbital of the donor carbonyl and depletes it on the π* C═O orbital of the acceptor carbonyl (Supplementary Table 14), which can also be correlated to the strength of C═O···C═O short contacts (Supplementary Fig. 11a, b). 
The overall accumulation of charge on the acceptor carbonyl π* C═O orbitals due to donation from the oxygen lone pairs and the π C═O orbital of the donor carbonyl is shown in Figs. 7c–d and correlates well with the strength of the C═O···C═O short contacts. This also suggests that electron delocalization is a major contributor to reciprocal C═O···C═O interactions. Finally, the C═O···C═O torsion angles of the carbonyl groups involved in reciprocal interactions indicate a net zero dipole-dipole interaction, which argues against these interactions being dipolar in nature. To emphasize this point, in Figs. 7e–f we have plotted the values of the C═O···C═O torsion angles of the 1432 molecules obtained from the CSD search. The torsion angle (T) between two dipoles can be used to understand the dipolar nature of the interaction between them. Antiparallel dipoles (T ~ 180°) attract and parallel dipoles (T ~ 0°) repel each other, whereas two orthogonal dipoles (T ~ 90°) have net zero dipolar interaction. In the case of reciprocal interactions, the C═O···C═O torsion angles show an orientational preference [the C═O···C═O torsion angle falls in the 60° to 90° (or −60° to −90°) range] as a consequence of the simultaneous restrictions on d1 and d2 (≤3.2 Å). The values of the C═O···C═O torsion angles (~90°) therefore suggest that there would be almost net zero interaction between the dipoles, which makes strong dipolar interactions unlikely. We conclude that orbital delocalization is the major driving force for the stabilization of reciprocal C═O···C═O interactions. An elaborate energy decomposition analysis may be required for the accurate deconvolution of the various factors contributing to the stabilization of reciprocal C═O···C═O short contacts.
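The torsion-angle argument can be made explicit with the textbook point-dipole expression. This is an idealized sketch, not the analysis performed in this work: it assumes two point dipoles μ1 and μ2 separated by a vector r, and the second line additionally assumes that both dipoles lie perpendicular to r, so that the angle between them equals the torsion angle T.

```latex
% General point-dipole interaction energy (textbook form)
E_{\mathrm{dd}} = \frac{1}{4\pi\varepsilon_0 r^{3}}
  \left[ \boldsymbol{\mu}_1 \cdot \boldsymbol{\mu}_2
  - 3\,(\boldsymbol{\mu}_1 \cdot \hat{\mathbf{r}})(\boldsymbol{\mu}_2 \cdot \hat{\mathbf{r}}) \right]

% Assumption: both dipoles perpendicular to r, so the second term vanishes
E_{\mathrm{dd}} = \frac{\mu_1 \mu_2 \cos T}{4\pi\varepsilon_0 r^{3}}
```

Under these assumptions the energy is negative (attractive) at T = 180°, positive (repulsive) at T = 0° and zero at T = 90°, which is the behaviour invoked above. Real carbonyl pairs at short O···C separations are not ideal point dipoles, so this form is only a qualitative guide.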

Fig. 7

Delocalization energies, charge redistribution and torsion angles. a Plot of n→π* interaction energies between the interacting carbonyl pairs against crystallographic O···C distances (d1 and d2) in the molecules shown in Tables 1–3. When the x-axis is d1, E 1 (n→π*) is plotted on the y-axis and when the x-axis is d2, E 2 (n→π*) is plotted on the y-axis. The d1, d2, E 1 (n→π*) and E 2 (n→π*) values are taken from Tables 1–3. The n→π* interaction energies were computed at the B3LYP/6-311+G(2d,p) level of theory. b Plot of overall orbital interaction energy (sum of n→π* and π→π* interaction energies) between the interacting carbonyl pairs against crystallographic O···C distances (d1 and d2) in the molecules shown in Tables 1–3. When the x-axis is d1, E 1 (n→π*) + E 1 (π→π*) is plotted on the y-axis and when the x-axis is d2, E 2 (n→π*) + E 2 (π→π*) is plotted on the y-axis. The d1, d2, E 1 (n→π*) and E 2 (n→π*) values are taken from Tables 1–3. E 1 (π→π*) and E 2 (π→π*) values are taken from Supplementary Table 3, Supplementary Table 7 and Supplementary Table 9. The orbital interaction energies were computed at the B3LYP/6-311+G(2d,p) level of theory. c Plot of accumulation of charge on the π* C═O orbital of CO-II due to donation from the oxygen lone pairs and the π C═O orbital of CO-I against d1. d Plot of accumulation of charge on the π* C═O orbital of CO-I due to donation from the oxygen lone pairs and the π C═O orbital of CO-II against d2. The solid curves in a–d are drawn for convenience. e Histogram showing the frequency of the C1═O2···C5═O6 dihedral angles (see Supplementary Fig. 3 for atom numbers) for the 1432 molecules obtained from the CSD search. f Histogram showing the frequency of the C5═O6···C1═O2 dihedral angles (see Supplementary Fig. 3 for atom numbers) for the 1432 molecules obtained from the CSD search

We conclude that reciprocal carbonyl-carbonyl interactions exist both in small organic molecules and in proteins. However, due to the geometrical constraints associated with such interactions, the approach of the donor oxygen atoms to the acceptor carbon atoms deviates significantly from the Bürgi-Dunitz trajectory, and therefore electron delocalization between the oxygen lone pair (n O) and the π* C═O orbital is weak. This weak donation from the first carbonyl group to the second is compensated by a back donation from the second carbonyl group to the first. In many cases, reciprocal π→π* interactions were also observed along with reciprocal n→π* interactions, and their overall contribution to the stabilization of molecules having reciprocal C═O···C═O short contacts could be significant. In proteins, C═O···C═O n→π* interactions are present in all types of secondary structures. While one-sided n→π* interactions are prevalent in α-helices22, 23, reciprocal interactions are abundant in PPII helices and turn regions. The prevalence of reciprocal C═O···C═O interactions in PPII helices and turn regions of proteins suggests a possible role for these interactions in protein folding. Further, the presence of reciprocal C═O···C═O interactions in distorted α-helices and twisted β-sheets suggests that these interactions could stabilize secondary structures that deviate from their regular geometries. The reciprocal C═O···C═O interactions present at the interface of two different types of secondary structures could also help to stabilize the strained amino acid residues found at these interfaces. In future, it would be interesting to investigate whether amino acid pairs with a high propensity for reciprocal C═O···C═O interactions can stabilize PPII helices and β-turns.
It would also be interesting to investigate whether some of the non-peptidic fragments obtained from the CSD search that have strong reciprocal C═O···C═O interactions could be used to stabilize the PPII conformation or to design peptide turns. Finally, an energy decomposition analysis would provide a better understanding of the forces that contribute to the stabilization of reciprocal C═O···C═O interactions.

Methods

Crystallization method

Single crystals of compounds 1–8 were grown by slow evaporation. Various solvent combinations were used to crystallize the compounds either at room temperature or low temperature (4 °C). Details of the crystallization conditions are given in Supplementary Table 1.

X-ray crystal structure determination method

Single crystal structures of compounds 1–8 were determined by measuring X-ray intensity data. A Bruker D8 Venture APEX 3 42 single-crystal home-source X-ray diffractometer equipped with a CMOS PHOTON 100 detector and a monochromated microfocus Mo Kα radiation source (λ = 0.71073 Å) was used for data collection with a phi (ϕ) and omega (ω) scan strategy at room temperature (298 K). The data were processed using SAINT 43 and absorption correction was done using SADABS 44 as implemented in APEX 3. For structure solution, the XSHELL program based on SHELX45 was used. The non-hydrogen atoms were located in successive difference Fourier syntheses and refined anisotropically. The hydrogen atoms were fixed to neutron bond lengths using appropriate HFIX commands. ORTEP diagrams of compounds 1–8 (CCDC 1486577–1486584) are provided in Supplementary Fig. 1. Compound 5 crystallized with a water molecule in the asymmetric unit. However, for clarity we have not shown the water molecule in its ORTEP diagram. Compound 7 has disorder at the chlorine atom; the occupancies of the disordered chlorine atom positions, Cl1A and Cl1B, were refined using the PART command. The similar-ADP restraint SIMU46 and the rigid-bond restraint DELU46 were applied to stabilize the anisotropic refinement. The SADI46 instruction was used to restrain the distances to be equal. The anisotropic displacement parameters of the disordered chlorine atom were constrained using the EADP46 constraint.

CSD analysis

Intramolecular C═O···C═O noncovalent interactions were searched for, and structural data were retrieved, from the Cambridge Structural Database33 (CSD version 5.21 Nov. 2015) using the Conquest47 (version 1.18) program. The fragment chosen for the search is shown in Supplementary Fig. 3, where X denotes any atom. Only unique matching fragments were taken, and the fragment was chosen so that it contains at least two carbonyl groups irrespective of their nature. Distances d1 (O2–C5) and d2 (O6–C1) were restricted to ≤3.2 Å. Angles [O2–C5–O6 (θ 1) and O6–C1–O2 (θ 2)] and dihedral angles (C1–O2–C5–O6 and C5–O6–C1–O2) were recorded without any restriction. Only crystalline, non-ionic and non-polymeric organic molecules with no disorder, no errors, an R factor ≤ 5% and at least three covalent bonds separating the carbonyl groups were considered in this search.
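The geometric parameters defined above can be illustrated with a short Python sketch. It is not the Conquest query used for the search; it only assumes that Cartesian coordinates (in Å) of the four atoms C1, O2, C5 and O6 are available, and the function names and the `d_max` parameter are hypothetical.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _norm(a):
    return math.sqrt(_dot(a, a))

def angle(a, b, c):
    """Angle a-b-c in degrees, measured at the vertex b."""
    u, v = _sub(a, b), _sub(c, b)
    cos_t = _dot(u, v) / (_norm(u) * _norm(v))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle p0-p1-p2-p3 in degrees (atan2 form)."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = _dot(b2, _cross(n1, n2))
    x = _norm(b2) * _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def reciprocal_contact(c1, o2, c5, o6, d_max=3.2):
    """Geometry of a C1=O2...C5=O6 fragment; d_max is the distance cut-off in angstroms."""
    d1 = _norm(_sub(o2, c5))  # O2...C5
    d2 = _norm(_sub(o6, c1))  # O6...C1
    return {
        "d1": d1,
        "d2": d2,
        "theta1": angle(o2, c5, o6),       # O2-C5-O6
        "theta2": angle(o6, c1, o2),       # O6-C1-O2
        "tau1": dihedral(c1, o2, c5, o6),  # C1-O2-C5-O6
        "tau2": dihedral(c5, o6, c1, o2),  # C5-O6-C1-O2
        "reciprocal": d1 <= d_max and d2 <= d_max,
    }
```

Only the two distances enter the selection criterion; the angles and dihedral angles are reported without restriction, as in the search described above.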

PDB analysis

A subset of 2269 proteins was culled from the RCSB PDB34 using the search criteria of resolution <1.6 Å and redundancy (pairwise sequence identity) less than 10%; the structures were downloaded on 19 January 2016. Out of the 2269 proteins, 2184 showed the reciprocal n→π* interaction. For proteins existing in polymeric form or for proteins containing amino acids in more than one conformation, chain A and conformation A were chosen, except for 57 proteins where chain A is absent. Distance d1 is defined as the distance from the ith amide oxygen to the subsequent (i+1)th amide carbon, while d2 is defined as the distance from the (i+1)th amide oxygen to the ith amide carbon (Supplementary Fig. 6). We used the criteria d1 ≤ 3.2 Å and d2 ≤ 3.2 Å for selecting amino acid pairs participating in reciprocal n→π* interactions. Secondary structure assignment was done using the Stride code39. Ramachandran plots were generated for the proteins using Gnuplot (http://www.gnuplot.info/).
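The selection of residue pairs described above can be sketched as follows. This is a minimal illustration and not the script used in this study: it assumes the backbone carbonyl C and O coordinates (in Å) have already been extracted for consecutive residues of a single chain with no gaps, and the function name `reciprocal_pairs` and the input layout are hypothetical.

```python
import math

def reciprocal_pairs(residues, d_max=3.2):
    """Return (i, i+1, d1, d2) for adjacent residues meeting both distance criteria.

    residues: ordered list of dicts holding the backbone carbonyl coordinates,
              e.g. {"C": (x, y, z), "O": (x, y, z)}; assumed consecutive in sequence.
    d1: O(i)...C(i+1) distance; d2: O(i+1)...C(i) distance.
    """
    hits = []
    for i in range(len(residues) - 1):
        first, second = residues[i], residues[i + 1]
        d1 = math.dist(first["O"], second["C"])
        d2 = math.dist(second["O"], first["C"])
        if d1 <= d_max and d2 <= d_max:
            hits.append((i, i + 1, d1, d2))
    return hits
```

Requiring both distances to satisfy the cut-off is what distinguishes a reciprocal contact from a one-sided one, where only d1 or only d2 is short.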

Computational methods

All the calculations were performed using the Gaussian09 suite of quantum chemistry programs48. The Hartree-Fock (HF)49 method and the hybrid Becke 3-Lee-Yang-Parr (B3LYP)50, 51 exchange correlation functional with the 6-311+G(2d,p) basis set were used for the calculations. Natural bond orbital (NBO)35 analyses were performed on the crystal geometries of the synthetic molecules and of the small organic molecules obtained from the CSD search. For proteins, the coordinates of the interacting amino acid residue pair were extracted using PyMOL52. The α-carbons of the amino acid residues adjacent to the N and C termini of the amino acid pair were also included, so as to mimic a dipeptide with the N and C termini capped with N(CO)Me and (CO)NMe, respectively. Finally, hydrogen atoms were added to the structure using PyMOL (Supplementary Fig. 8). NBO analyses were carried out on crystal geometries at the B3LYP/6-311+G(2d,p) and HF/6-311+G(2d,p) levels of theory. The NBO second order perturbative energies E (n→π*) and E (π→π*) obtained from the NBO calculations were taken as the stabilization energies due to n→π* and π→π* interactions. NBO deletion analysis was carried out on crystal geometries at the HF/6-311+G(2d,p) level of theory.
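For reference, the second order perturbative energy reported by NBO analysis has the standard form below. The expression is the general NBO definition and is not quoted from this work; the symbols are assumptions of this sketch, with i the donor orbital (n O or π C═O) and j the acceptor orbital (π* C═O).

```latex
% q_i: occupancy of the donor NBO i
% F(i,j): off-diagonal NBO Fock matrix element between i and j
% \varepsilon_i, \varepsilon_j: diagonal elements (orbital energies) of donor and acceptor
E^{(2)}_{i \to j} = q_i \, \frac{F(i,j)^{2}}{\varepsilon_j - \varepsilon_i}
```

A larger donor-acceptor coupling F(i,j) or a smaller energy gap gives a larger stabilization energy, which is consistent with the decrease of E (n→π*) at longer O···C distances where orbital overlap is reduced.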

Data availability

The authors declare that the data supporting the findings of this study are available within the paper and its Supplementary Information files, and are also available from the corresponding author upon reasonable request. X-ray crystallographic data for the structures reported in this study have been deposited at the Cambridge Crystallographic Data Centre (CCDC) under deposition numbers CCDC 1486577–1486584. These data can be obtained free of charge from the CCDC via www.ccdc.cam.ac.uk/.