Introduction

Calcium signalling plays a pivotal role in diverse physiological processes, including lymphocyte activation, neuronal firing, muscle contraction and apoptosis1. Under resting conditions, cells maintain a 20,000-fold gradient in Ca2+ concentration across the plasma membrane that is rapidly dissipated and restored after stimulation2. As a consequence, hundreds of Ca2+-binding proteins have evolved whose affinity for Ca2+ collectively spans the nM to mM range2,3. Broadly, these can be classified into proteins that act as Ca2+ sinks to buffer the intracellular environment and adaptor proteins that act as Ca2+ sensors, undergoing large conformational changes upon binding to carry out distinct effector functions in the absence and presence of Ca2+ (ref. 4).

Calmodulin (CaM) is the prototypical eukaryotic Ca2+ sensor and regulates a diverse set of proteins, including G-protein-coupled receptors, small conductance potassium (SK) channels and CaM kinases4,5. Structurally, CaM comprises two lobes connected by a short linker6. Each lobe, in turn, contains a pair of EF hand motifs that bind two Ca2+ ions cooperatively7. Extensive NMR and small-angle X-ray scattering studies have shown that CaM is a dynamic protein, with its inter-domain linker exhibiting considerable flexibility and the C-lobe undergoing conformational exchange in the μs regime8,9. Comparison of high-resolution structures of apo (Ca2+-unbound) and holo (Ca2+-bound) CaM clarified the functional role of these dynamics by indicating that Ca2+ binding induces large-scale structural rearrangements that expose each lobe’s hydrophobic interface to CaM-binding partners6,10 (Fig. 1). These binding partners commonly feature a pair of bulky hydrophobic residues that serve as anchors to the hydrophobic interface presented by each lobe of Ca2+-bound CaM, which can accommodate variable spacing between these anchor residues due to the flexibility of the inter-lobe linker11.

Figure 1: C-CaM conformational change upon Ca2+ binding.

(a) Comparison of the C-terminal domain from published structures of apo (1CFD) and holo (1CLL) CaM indicates the large-scale structural rearrangements that occur when Ca2+ binds. (b,c) In particular, repositioning of the Ca2+-ligating residues from solvent-exposed rotamers renders each EF hand competent to bind Ca2+. (d,e) Additionally, binding of Ca2+ induces repacking in the hydrophobic core and binding interface, enabling compact clustering of the aromatic residues outlined by the mesh.

However, this prevailing paradigm of CaM function has been challenged by the discovery of proteins that preferentially bind to apo CaM and tune its affinity for Ca2+ (refs 5, 12). Furthermore, CaM-binding partners share no sequence homology4, and attempts to categorize them based on structural features are riddled with exceptions11,13. Thus, a holistic framework that accounts for CaM’s binding diversity is lacking. Prior computational work has focused on CaM equilibrium dynamics, with particular emphasis on dissecting the temperature- and Ca2+-dependence of the transition rates among the unfolded, apo- and holo-states and their relative populations14,15. While these studies have helped resolve conflicting interpretations of NMR and single-molecule force spectroscopy studies by distinguishing between states that are difficult to resolve experimentally14,16,17, they have been too coarse-grained to address the mechanism of these transitions at atomistic detail. Furthermore, only the question of Ca2+-binding has been addressed, while the structural basis underlying CaM’s ability to interact specifically with a diverse set of binding partners has not been broached computationally.

Molecular dynamics (MD) simulations have been harnessed to study the dynamics of complex biological systems at atomic resolution and predict the mechanisms that underlie transitions among biologically important states18,19,20. Two challenges faced by MD simulations are the need for a framework to deal with the high dimensionality of the resulting data set and the requirement of highly specialized hardware to obtain atomistic dynamics of proteins with timescales comparable to the conformational process under investigation21. Markov state models (MSMs) are a powerful analytical tool that overcomes both obstacles by merging massively parallel short simulations into a unified, more accessible model that captures rare events whose timescales exceed those of the individual simulations21,22.

In this study, we constructed MSMs from MD simulations of the C-lobe of CaM (C-CaM) in the Ca2+-bound and unbound regimes to explore the contributions of a single domain of CaM to binding diversity. In particular, we performed MD for a total aggregate simulation time of 700 μs with an all-atom description of protein and an explicit representation of water. This study is in contrast to previous computational investigations of CaM conformational dynamics, which relied either on coarse-grained models14 or short (ns) simulations23,24. Our MSM-based approach resolved the experimentally observed apo to holo and unfolding transitions as orthogonal processes, while comparison with available high-resolution structures indicated that the folded ensemble of C-CaM populates multiple, topologically distinct binding interfaces that may account for binding specificity and diversity by enabling conformational selection. In addition, our models predict novel binding interfaces that could be stabilized by structurally matched targets, thus disrupting downstream signalling by sequestering CaM from the active pool of signalling molecules. Given that CaM is a pivotal hub in signalling pathways that are frequently deregulated in disease25,26, a sophisticated framework for understanding target recognition by CaM could help make it a tractable candidate for pharmacological activation or inhibition.

Results

Distinct exchange processes characterize apo- and holo-C-CaM

To investigate the contributions of C-CaM to CaM’s binding diversity, we ran MD simulations of this domain in the absence and presence of Ca2+. Inspection of the corresponding high-resolution structures of apo- and holo-CaM from which these simulations were started demonstrates the gross structural rearrangements that occur upon Ca2+ binding (Fig. 1a)27,28. In addition to repositioning of the residues that ligate the Ca2+ ions in each of the Ca2+-binding sites (Fig. 1b,c), Ca2+ binding is accompanied by reorientation of the EF hand α-helices to achieve a more compact clustering of the four aromatic residues in the C-CaM hydrophobic core (Fig. 1d,e). However, these are static depictions that offer atomic-level structural information but minimal insight into the dynamics that govern each state or the mechanism underlying their exchange. MD simulations allowed us to bridge this gap, preserving the atomic-level detail afforded by high-resolution structures while providing extensive kinetic information that can be unified and interpreted through the framework of MSMs.

Constructing MSMs that accurately recapitulate MD data requires a valid decomposition of phase space. We partitioned the conformational landscape explored by apo- and holo-C-CaM using time-structure-based independent component analysis (tICA), a method that generates a distance metric by projecting high-dimensional data onto degrees of freedom (in this case, distances in the contact map) that decorrelate slowly29. MSMs were then built from the clustered data at a Markovian lag-time (Supplementary Fig. 1) to analyse the rates and dominant pathways of conformational change that govern apo- and holo-C-CaM dynamics. The first eigenvector of the transition probability matrix defining each MSM provides an estimate of the equilibrium populations. In the absence of bound Ca2+, conformations similar to the solution structure of apo-C-CaM dominate, with the holo-like state accounting for a minor but significant population (P ≈ 0.14), similar to the minor 5–10% population observed in NMR studies9,27,30. These values are within 1 kcal mol−1 of each other and thus difficult to distinguish experimentally or computationally (Supplementary Fig. 2a). By contrast, most of the population in the Ca2+-bound regime is concentrated in a state that resembles the crystal structure of holo-C-CaM (P ≈ 0.93) (Supplementary Fig. 2b). To further validate these computational models, we estimated NMR order parameters from each MSM ensemble and found good agreement with experimental values (Supplementary Fig. 3 and Supplementary Tables 1–2)30,31,32. These order parameters measure local flexibility, so the high correlation suggests that the MSM ensembles accurately reproduce solution-state dynamics. We additionally verified that the conformational landscape explored by apo-C-CaM was invariant to force field choice by re-running our simulations in a different force field. The population distributions along order parameters characteristic of the apo- to holo-like transition and local unfolding were similar between apo-C-CaM simulations run in distinct force fields, and equally distant from the population distribution of the holo system (Supplementary Fig. 4 and Supplementary Table 3).
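As a rough illustration of how equilibrium populations follow from the first eigenvector of the transition probability matrix, the sketch below estimates a toy MSM from a discretized trajectory. The function names, the two-state toy trajectory and the simple count symmetrization are assumptions made for illustration only; they are not the estimator or software used in this study.

```python
import numpy as np

def estimate_transition_matrix(dtrajs, n_states, lag):
    """Row-normalized transition matrix from discrete (clustered) trajectories."""
    counts = np.zeros((n_states, n_states))
    for traj in dtrajs:
        # Count transitions between frames separated by the lag time.
        for i, j in zip(traj[:-lag], traj[lag:]):
            counts[i, j] += 1.0
    # Assumption: enforce detailed balance by crude count symmetrization.
    counts = 0.5 * (counts + counts.T)
    return counts / counts.sum(axis=1, keepdims=True)

def stationary_distribution(T):
    """Equilibrium populations from the leading left eigenvector of T."""
    evals, evecs = np.linalg.eig(T.T)
    lead = evecs[:, np.argmax(evals.real)].real
    return lead / lead.sum()

# Hypothetical two-state trajectory (0 = apo-like, 1 = holo-like).
toy_dtrajs = [[0, 0, 1, 0, 0, 0, 1, 1, 0, 0]]
T_toy = estimate_transition_matrix(toy_dtrajs, n_states=2, lag=1)
pi_toy = stationary_distribution(T_toy)
```

In this toy case the populations satisfy `pi_toy @ T_toy == pi_toy` up to rounding, which is the defining property of the equilibrium distribution.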

Sampling from this equilibrium description recovers the dynamic processes and the timescales at which they occur. In the absence of Ca2+, exchange with the holo-like state occurs on a timescale (1/kex) of 26 μs, within the range of 18–150 μs that has been experimentally determined for the isolated C-terminal domain (Fig. 2a)30,33,34,35,36,37. Transitions to the holo-like state coincide with aromatic stacking and formation of the second Ca2+-binding site, as assessed by inter-residue distances between Phe92 and Phe141 and between the Ca2+ ligands Asp129 and Asp133, respectively. On the other hand, the first Ca2+-binding site remains poorly formed even upon transition to the holo-like state, which likely accounts for its lower affinity for Ca2+ (ref. 30) and structurally distinguishes the holo-like and holo states. Exchange along these order parameters correlates well with the second eigenvector (Supplementary Fig. 5a), indicating that the apo- to holo-like transition is the slowest dynamic process in the absence of Ca2+.

Figure 2: Kinetics of conformational change.

Sampling from the MSM captures conformational changes and the timescales at which they occur. (a) In apo-C-CaM, transition to a holo-like state coincides with aromatic stacking and formation of the second, but not the first Ca2+-binding site. (b) Holo-C-CaM experiences fluctuations in the inter-EF hand loop region but does not show sustained fluctuations in the aromatic cluster or either Ca2+-binding site. All distances are shown in Å.

By contrast, motions in the loop connecting the EF hands account for the slowest dynamic process of the holo system, with a relaxation timescale of 10 μs (Fig. 2b). Specifically, twisting of this region reorients Leu116 away from solvent, allowing it to pack against Met124 and Met109, while adjacent Thr117 flips away from the hydrophobic cluster in this region (Supplementary Figs 5b, 6). These dynamics are not characteristic of the apo- to holo-like transition. Similarly, the conformational change processes that distinguish the apo MSM do not typify the excursions from the dominant population in the holo MSM, indicating that the dynamics in the Ca2+-bound and unbound regimes are distinct (Fig. 2).
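Relaxation timescales such as those quoted above are conventionally obtained from the non-unit eigenvalues of the MSM transition matrix via t_i = −τ / ln λ_i, where τ is the lag time. The following is a minimal sketch of that standard relation; the function name and the two-state matrix are hypothetical and the numbers are not taken from our models.

```python
import numpy as np

def implied_timescales(T, lag_time):
    """Relaxation timescales t_i = -lag_time / ln(lambda_i) for i >= 2.

    Assumes the non-unit eigenvalues are real and positive, as expected
    for a reversible MSM built at a Markovian lag time.
    """
    evals = np.sort(np.linalg.eigvals(T).real)[::-1]
    return -lag_time / np.log(evals[1:])

# Hypothetical two-state transition matrix; lag_time is in arbitrary units.
T_two_state = np.array([[0.99, 0.01],
                        [0.02, 0.98]])
timescales = implied_timescales(T_two_state, lag_time=1.0)
```

The slowest process corresponds to the second eigenvalue, whose eigenvector identifies which states exchange on that timescale.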

Computationally discovered metrics reveal complex dynamics

The ability of our MSMs to recover experimentally determined populations and rates indicates that tICA is a robust method for discretizing the phase space explored by apo- and holo-C-CaM. The dominant tICs—that is, those with the largest eigenvalues in the time-lag covariance matrix—represent the slowest decorrelating orthogonal linear combinations of distances in the contact map29. Thus, deducing the molecular correlates of the tICs provides insight into the most important degrees of freedom in the conformational landscape of each system. Projecting the conformations visited by apo- and holo-C-CaM onto their respective tICs confirmed that within each system, the top tICs capture distinct dynamic processes (Fig. 3a,b). In the apo system, the first and second tICs correspond to exchange with a holo-like form and a partially unfolded state, respectively (Supplementary Fig. 7a), consistent with prior studies that identified these as the dominant dynamics in the unliganded regime14,17,30. Projecting the conformational space explored by holo-C-CaM onto these same tICs resulted in a density map that overlaid well with the holo-like state explored by apo-C-CaM, but there was no indication that holo-C-CaM populates an apo-like state or undergoes the cracking experienced by the Ca2+-unbound state (Fig. 3c).

Figure 3: The top tICs capture distinct dynamic processes.

The energy landscapes of (a) apo- and (b) holo-C-CaM were generated by projecting the system's conformations, weighted by their MSM probabilities, onto their respective tICs. Comparison was made to available experimental structures by projecting the C-lobe of these published structures onto the relevant tICs (white dots). (c) Projection of the holo conformations on the apo tICs and (d) the holo-like basin in the apo system (boxed in red in a) on the holo tICs underscores their different dynamics. Free energy values are reported in kcal mol−1. The 25 inter-residue distances whose dynamics most contribute to the top (e) apo and (f) holo tIC are visualized on cartoon representations of the published structures of apo- and holo-C-CaM. Red and blue lines denote positive and negative tIC values, respectively.

Instead, the dominant tICs that characterize holo-C-CaM correspond to different rearrangements of the hydrophobic core (Supplementary Fig. 7b). However, these tICs are specific to the holo system, and the first tIC in particular fails to capture the dynamics of the holo-like state of apo-C-CaM (Fig. 3d), again underscoring that dynamics of C-CaM in the Ca2+-bound and unbound regimes are distinct. Visualizing the dominant tIC of each system as a contact map confirms this: the region that most contributes to the top tIC is localized to the G helix and inter-EF hand loop region for holo-C-CaM but more distributed throughout the protein for apo-C-CaM (Fig. 3e,f and Supplementary Fig. 8). The complexity of these contact maps underscores the need for order parameters that can capture such intricate and non-uniform dynamics. Indeed, the commonly used order parameter of root-mean-square deviation (RMSD) provides much poorer separation of the apo- and holo-like basins in the Ca2+-unbound regime, and fails to resolve dynamic exchange in the Ca2+-bound regime entirely (Supplementary Fig. 9).
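For readers unfamiliar with tICA, the sketch below shows one common way to obtain slowly decorrelating linear combinations of input features: whiten the data with the instantaneous covariance matrix, then diagonalize the symmetrized time-lagged covariance matrix. The function `tica`, the synthetic two-feature data set and the choice of lag are illustrative assumptions, not the implementation or parameters used in this study.

```python
import numpy as np

def tica(X, lag, n_components=2):
    """Return tICA eigenvalues and projection vectors for features X (frames x features)."""
    X = X - X.mean(axis=0)
    X0, Xt = X[:-lag], X[lag:]
    n = len(X0)
    # Symmetrized instantaneous and time-lagged covariance matrices.
    C0 = (X0.T @ X0 + Xt.T @ Xt) / (2.0 * n)
    Ct = (X0.T @ Xt + Xt.T @ X0) / (2.0 * n)
    # Whitening transform C0^(-1/2), discarding near-singular directions.
    s, U = np.linalg.eigh(C0)
    keep = s > 1e-10 * s.max()
    W = U[:, keep] / np.sqrt(s[keep])
    # Eigenvalues are autocorrelations at the chosen lag; largest = slowest.
    evals, V = np.linalg.eigh(W.T @ Ct @ W)
    order = np.argsort(evals)[::-1][:n_components]
    return evals[order], W @ V[:, order]

# Synthetic data: one slow oscillation plus noise, and one pure-noise feature.
rng = np.random.default_rng(0)
t = np.arange(5000)
slow = np.sin(2.0 * np.pi * t / 1000.0)
features = np.column_stack([slow + 0.05 * rng.standard_normal(t.size),
                            rng.standard_normal(t.size)])
tic_evals, tic_vecs = tica(features, lag=10, n_components=2)
projection = (features - features.mean(axis=0)) @ tic_vecs
```

In this synthetic example the first tIC tracks the slow oscillation while the noise feature is relegated to the second component, mirroring how the top tICs isolate the slowest structural rearrangements.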

Hydrophobic repack dominates the apo to holo-like transition

Applying transition path theory to MSMs predicts the ensemble of pathways between states and estimates their associated fluxes. In the apo MSM, a structural rearrangement characteristic of the transition to the holo-like state is compact stacking of four aromatic residues in the hydrophobic core, which is impeded in the apo state by packing of Val108 between Phe89 and Phe92 (Fig. 1d,e). Over 60% of the flux passes through an intermediate, in which packing of Val108 against Phe89 (but not Phe92) is disrupted, but only in the holo-like state does the helix containing Val108 shift sufficiently to accommodate the compact aromatic cluster (Fig. 4a). Repacking of the core orients Phe89 and Phe141 in the bottom of a hydrophobic cavity, an arrangement that mimics the surface topology of Ca2+-bound C-CaM and is late to form along the principal transition paths (Fig. 4b). Interestingly, a common structural motif in CaM’s diverse binding partners is the presence of a bulky aromatic feature that accesses this hydrophobic pocket and stacks against the Phe residues that line one side of the cavity (Fig. 4c)38,39. Thus, rearrangements in the hydrophobic core facilitate the presentation of a specific surface topology that contributes to binding specificity.
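The flux analysis described above can be sketched for a reversible MSM as follows: solve for the forward committor between a source set A and a sink set B, form the reactive flux, and ask what fraction of it passes through a given intermediate. The four-state matrix, the state labels and the function names are hypothetical and chosen only so that the arithmetic can be checked by hand; they are not the apo MSM.

```python
import numpy as np

def committor(T, A, B):
    """Forward committor: probability of reaching B before A from each state."""
    n = T.shape[0]
    q = np.zeros(n)
    q[B] = 1.0
    inter = [i for i in range(n) if i not in A and i not in B]
    # Solve (I - T_II) q_I = sum over b in B of T_Ib for the intermediate states.
    lhs = np.eye(len(inter)) - T[np.ix_(inter, inter)]
    rhs = T[np.ix_(inter, B)].sum(axis=1)
    q[inter] = np.linalg.solve(lhs, rhs)
    return q

def net_flux(T, pi, q):
    """Net reactive flux for a reversible chain (backward committor = 1 - q)."""
    f = pi[:, None] * (1.0 - q)[:, None] * T * q[None, :]
    np.fill_diagonal(f, 0.0)
    return np.maximum(f - f.T, 0.0)

# Hypothetical states: 0 = apo, 1 and 2 = intermediates, 3 = holo-like.
T4 = np.array([[0.85, 0.10, 0.05, 0.00],
               [0.10, 0.80, 0.00, 0.10],
               [0.05, 0.00, 0.90, 0.05],
               [0.00, 0.10, 0.05, 0.85]])
pi4 = np.full(4, 0.25)  # T4 is symmetric, so the populations are uniform
q4 = committor(T4, A=[0], B=[3])
flux4 = net_flux(T4, pi4, q4)
total_flux = flux4[0].sum()
fraction_via_state1 = flux4[1].sum() / total_flux
```

In this toy network two-thirds of the reactive flux passes through state 1, which is the kind of quantity reported above for the Val108-displaced intermediate.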

Figure 4: Hydrophobic repack determines the topology of the binding interface.

(a) Packing of Val108 between Phe92 and Phe89 is disrupted during the apo- to holo-like transition, which enables the formation of a compact aromatic cluster in the C-CaM core. (b) Surface representations of the apo, intermediate, and holo-like states show that repack exposes distinct surface topologies. Phe, all other hydrophobic, and polar residues coloured red, yellow and grey, respectively. (c) The Phe-lined cavity formed in the holo-like state is commonly accessed by bulky aromatic rings of binding partners, as shown in the crystal structure of holo CaM in complex with trifluoperazine. Only the C-terminal domain of this structure (1CTR) is shown.

Another intriguing question about the mechanism underlying this transition is whether unfolding plays a role. An unfolding process on the μs timescale has been observed in T-jump spectroscopic studies17, and although exchange with the holo-like state rather than the unfolded state is predicted to dominate at low temperatures, the unfolded population is still estimated to be 1% in this regime14. Our MSM of apo-C-CaM populates a landscape that is qualitatively similar to the one explored by the coarse-grained Gō model of Chen et al.14 (Supplementary Fig. 10). However, the higher resolution of our simulations reveals that unfolding is not global, but rather a localized phenomenon that primarily involves unwinding of helix G or deformation of the second Ca2+-binding site (Fig. 5a). Less than 3% of the flux between the apo- and holo-like states visits a partially unfolded intermediate (using lower cutoffs of 11 and 15 Å for the Met124-Ala128 and Ala128-Gly134 inter-Cα distances, respectively, to categorize states as partially unfolded). This indicates that unfolding is largely irrelevant for the apo- to holo-like transition, despite prior speculation that intrinsic disorder within C-CaM might be a mechanism underlying its diverse binding repertoire14. Furthermore, we find no evidence of partial unfolding in the restricted dynamics of holo-C-CaM (Fig. 5b), consistent with the observation of increased stability in the Ca2+-bound regime40. Because force field choice has been shown to impact secondary structure stability, we compared simulations run in different force fields to ensure that the observed local unfolding was invariant to this choice. As has been previously observed41, median helical content is slightly higher in the CHARMM36 simulations than the Amber99sb-ildn trajectories (Supplementary Fig. 4c). However, these distributions are broad and overlap between the different force fields. Furthermore, the conclusions drawn about local unfolding are made based on specific inter-residue distances, which show similar distributions between force fields, rather than helical content (Supplementary Fig. 4a,b).

Figure 5: Local unfolding occurs in apo but not holo-C-CaM.

The energy landscapes of (a) apo and (b) holo-C-CaM as a function of the Met124-Ala128 and Ala128-Gly134 inter-Cα distances, which correlate with distortions in helix G and the second Ca2+-binding site, respectively. The regions that span these residues are highlighted in yellow in the cartoon representations below and to the side of the plot in (a). Conformations were weighted by their MSM probabilities, and free energy values are reported in kcal mol−1. In a, the five paths with the greatest flux between the apo- and holo-like states are plotted as grey lines.
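Energy landscapes of this kind are typically computed by histogramming conformations along two order parameters, weighting each conformation by its MSM probability, and converting the resulting probabilities to free energies with F = −kT ln P. The sketch below assumes a temperature of 298 K (the simulation temperature given in Methods); the function name, bin count and toy inputs are illustrative assumptions.

```python
import numpy as np

# Gas constant in kcal mol^-1 K^-1 times the simulation temperature (298 K).
KT_KCAL = 0.0019872 * 298.0

def free_energy_surface(x, y, weights, bins=50):
    """Weighted 2D histogram converted to free energies (kcal/mol), minimum set to 0."""
    hist, xedges, yedges = np.histogram2d(x, y, bins=bins, weights=weights)
    prob = hist / hist.sum()
    with np.errstate(divide="ignore"):
        F = -KT_KCAL * np.log(prob)  # unvisited bins become +inf
    F -= F[np.isfinite(F)].min()
    return F, xedges, yedges

# Toy input: three equally weighted frames in one basin, one frame in another.
x_toy = np.array([0.0, 0.0, 0.0, 1.0])
y_toy = np.array([0.0, 0.0, 0.0, 1.0])
w_toy = np.full(4, 0.25)
F_toy, _, _ = free_energy_surface(x_toy, y_toy, w_toy, bins=2)
```

A 3:1 population ratio corresponds to a free energy difference of kT ln 3, roughly 0.65 kcal mol−1 at 298 K, which illustrates why states whose populations differ by less than an order of magnitude lie within about 1 kcal mol−1 of each other.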

Apo- and holo-C-CaM populate topologically distinct ensembles

Given the unimportance of unfolding to the ability of C-CaM to bind structurally dissimilar substrates, we sought to identify a mechanism that could explain this binding diversity by comparing published structures of CaM-target complexes to our MSMs. Mapping 25 of these high-resolution structures to tIC space revealed that C-CaM explores the configurations stabilized by its peptide and small molecule-binding partners even in their absence (Fig. 6). Because our simulations started only from the published apo and holo structures, recovering these other conformations adopted in the presence of CaM-binding partners was not guaranteed. Instead, this pre-existing heterogeneity suggests that conformational selection may facilitate CaM’s binding diversity. As expected, most of the structures mapped to the holo-like and central basins in apo- and holo-tIC space, respectively (Fig. 3a,b). The two outliers were the structures of CaM in complex with myosin and an SK channel splice variant (4BYF and 1G4Y, respectively); in both crystal structures, the C-lobes were found to be apo, consistent with their physiological role42,43.

Figure 6: C-CaM presents an ensemble of hydrophobic interfaces in the absence of binding partners.

(a) Energy landscape of apo-C-CaM as a function of the Phe92-Phe141 and Asp133-Glu140 inter-residue distances, which correspond to aromatic clustering and formation of the second Ca2+-binding site, respectively. White and grey dots represent these values for published structures and topological cluster centres, respectively. (b) Surface representations of the cluster centres indicated by number in a, with Phe, Met, all other hydrophobic, and polar residues coloured red, cyan, yellow and grey, respectively. (c,d) As in (a,b), except for holo-C-CaM, with inter-residue distances between hydrophobic residues at the binding interface (specifically, Leu112-Met145 and Val108-Phe141) used as order parameters. Free energy values are reported in kcal mol−1.

Curiously, in apo-tIC space these outliers mapped to an intermediary position between the apo- and holo-like basins rather than fully in the former basin despite the absence of Ca2+ (Fig. 6a). As described above, transition between these basins involves extensive structural reorganization, including formation of the second Ca2+-binding site and hydrophobic repack. Because the first tIC encompasses both rearrangements, we examined the extent to which these features are coupled in the dynamics of apo C-CaM. Mutual information calculations provide a measure of long-range coupling between residues by computing the correlation between the motions of their backbone and side-chain torsion angles44. Applying this analysis to apo-C-CaM indicated extensive allosteric communication between the second Ca2+-binding site and the hydrophobic residues that constitute the binding interface (Supplementary Fig. 11). In particular, the aromatic residues Phe89 and Tyr138 (Supplementary Fig. 12) show a strong coupling with the charged residues (Asp129, Asp131 and Asp133) that ligate the Ca2+ ion. By contrast, the first Ca2+-binding site, which has lower affinity for Ca2+ (ref. 30), does not show a strong dynamical coupling with any region of the protein (Supplementary Fig. 10).
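To make the coupling measure concrete, the sketch below estimates the mutual information between two torsion-angle time series from a joint histogram. The binning scheme, the function name and the synthetic angles are assumptions for illustration; the published method (ref. 44) may use a different estimator and corrections for finite sampling.

```python
import numpy as np

def mutual_information(a, b, bins=12):
    """Histogram estimate of mutual information (in nats) between two angle series."""
    angle_range = [[-np.pi, np.pi], [-np.pi, np.pi]]
    joint, _, _ = np.histogram2d(a, b, bins=bins, range=angle_range)
    p = joint / joint.sum()
    pa = p.sum(axis=1, keepdims=True)  # marginal of the first angle
    pb = p.sum(axis=0, keepdims=True)  # marginal of the second angle
    nz = p > 0
    return float(np.sum(p[nz] * np.log(p[nz] / (pa @ pb)[nz])))

# Synthetic torsions: one angle compared with itself and with an independent angle.
rng = np.random.default_rng(1)
chi_a = rng.uniform(-np.pi, np.pi, 20000)
chi_b = rng.uniform(-np.pi, np.pi, 20000)
mi_coupled = mutual_information(chi_a, chi_a)
mi_independent = mutual_information(chi_a, chi_b)
```

Perfectly coupled angles give a value close to ln(bins), whereas independent angles give a value near zero, so a high score between a Ca2+-ligating residue and a core aromatic residue indicates correlated motion.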

The majority of flux between the apo- and holo-like states passes through a reaction tube, in which these features change in a concerted fashion, while configurations in which the Ca2+-binding sites are fully formed but the aromatic cluster is not (and vice versa) are energetically destabilized by several kT (Supplementary Fig. 13). Consistent with this, splice variants of an SK channel that are known to differentially tune the affinity of the C-lobe for Ca2+ stabilize distinct hydrophobic surface topologies. While the splice variant that enhances Ca2+ binding interacts with the canonical Phe-lined cavity, the SK2-a splice variant that abolishes Ca2+ binding stabilizes a less compact Phe-lined groove that is also observed in the structure of apo CaM in complex with myosin (Supplementary Fig. 14)5,42,43. An exception to the concerted rearrangements in the hydrophobic network and Ca2+-binding sites was observed in the NMR structure of CaM in complex with phosphorylated nitric oxide synthase, which features the canonical holo-pocket topology but has poorly formed Ca2+-binding sites (Supplementary Fig. 14). However, in this case it has been suggested that proximity of the phosphate to the second Ca2+-binding site may prevent closure of the charged residues that ligate Ca2+, thus decoupling the allosteric network by an unusual mechanism12.

Despite the sequence dissimilarity between the regions of myosin and SK2-a that bind C-CaM, the Phe-lined grooves they stabilize are strikingly similar (Supplementary Fig. 14)42,43. To predict which other surface topologies are well-represented and thus amenable to conformational selection, we clustered configurations from equilibrium sampling of the apo and holo MSMs based on RMSD of the 10 residues that constitute the hydrophobic network in C-CaM. In the case of the apo system, cluster centres that resembled the canonical apo and holo topologies were recovered, with similar populations (P ≈ 0.58 and 0.19, respectively) to the apo- and holo-like states in the original MSM. Additionally, a Phe-lined groove similar to the one stabilized by myosin and SK2-a was observed at lower frequency (P ≈ 0.07) (Fig. 6a,b and Supplementary Fig. 14).

In the case of the holo system, the canonical Phe-lined cavity is most commonly observed (P ≈ 0.56), while a significant alternative topology features a Met-lined pocket (P ≈ 0.38) (Fig. 6c,d). Although this latter topology is not observed in the repertoire of currently published CaM structures, we predict that it could be readily stabilized by conformational selection based on its distinctive topology and high population in the Ca2+-bound regime. This prediction is bolstered by the experimental observation of a binding interface that, in terms of the relative positioning and exposure of the Phe and Met clusters, is intermediary between the canonical Phe-lined cavity and our predicted Met-lined pocket (Supplementary Fig. 15). In this case, an anti-microtubular agent stabilizes an unusual topology featuring an exposed aromatic cluster adjacent to a Met-bordered depression45, supporting the potential for other small molecules to sequester alternative topologies presented by holo-C-CaM. Interestingly, the Phe-lined cavity and Met-lined pocket we observe in our simulations are not only sterically but also chemically distinct. As noted above, the Phe cavity provides the opportunity for aromatic stacking and thus commonly accommodates the indole or benzyl rings of CaM-binding partners38,39. Similarly, the greater polarizability of Met compared with more canonical hydrophobic residues is expected to endow the Met pocket with distinct binding opportunities.
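The topological clustering described above relies on a pairwise RMSD between conformations of the hydrophobic network. A minimal sketch of that distance, using optimal superposition by the Kabsch algorithm, is given below; the function name and the random ten-point coordinate set are hypothetical stand-ins, and the clustering algorithm itself is not shown.

```python
import numpy as np

def kabsch_rmsd(P, Q):
    """RMSD between coordinate sets P and Q (N x 3) after optimal superposition."""
    P = P - P.mean(axis=0)
    Q = Q - Q.mean(axis=0)
    # Optimal rotation from the SVD of the covariance matrix.
    U, _, Vt = np.linalg.svd(P.T @ Q)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against improper rotations
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    P_rot = P @ R.T
    return float(np.sqrt(((P_rot - Q) ** 2).sum() / len(P)))

# Hypothetical coordinates for ten sites, then a rotated and translated copy.
rng = np.random.default_rng(2)
coords = rng.standard_normal((10, 3))
theta = np.deg2rad(30.0)
rot_z = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta),  np.cos(theta), 0.0],
                  [0.0,            0.0,           1.0]])
moved = coords @ rot_z.T + np.array([5.0, -2.0, 1.0])
rmsd_same_shape = kabsch_rmsd(coords, moved)
rmsd_perturbed = kabsch_rmsd(coords, moved + 0.5 * rng.standard_normal((10, 3)))
```

Rigid-body motion alone gives an RMSD of essentially zero, so only genuine differences in the arrangement of the selected residues contribute to the distance used for clustering.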

Discussion

Despite the wealth of structural information of CaM in complex with a variety of its binding partners, a unified framework that accounts for its ability to interact specifically with a diverse repertoire of proteins and small molecules has been lacking. In this study, we use MSMs to join massively parallel, short MD simulations of the isolated C-lobe in the Ca2+-bound and unbound regimes to arrive at a more holistic understanding of the dynamics that give rise to CaM’s functional plasticity.

Constructing MSMs requires partitioning phase space with distance metrics that distinguish important degrees of freedom in a protein’s dynamic landscape. The complexity of the structural features captured by tICA suggests that even for a single domain of a small protein, simple order parameters are insufficient to describe the intricate dynamics of conformational change. Linear combinations of several simple order parameters have already been used to study conformational change; however, the order parameters in these studies were chosen based on already available structural information18,19,46,47. In this study, we show that computationally discovered order parameters can readily distinguish between the conformational space explored by apo- and holo-C-CaM, despite being agnostic to available structural information. Furthermore, the tICA method, which identifies slowly decorrelating degrees of freedom in the contact map, provides a superior separation of dynamic processes that are too subtle or complex for more global metrics like RMSD.

Using this unbiased approach to identify the dominant dynamics of C-CaM, we investigated the origins of CaM’s ability to bind structurally diverse protein targets. Recently, intrinsic disorder has gained considerable attention as a mechanism for generating binding diversity: initial contacts formed between an intrinsically disordered protein and different substrates seed the formation of distinct secondary structures48. However, we find that the unfolding process of apo-C-CaM previously observed14,17 is irrelevant to the apo- to holo-like transition and thus off-pathway for target binding. In contrast to the induced fit mechanism of intrinsic disorder, our simulations indicate that conformational sub-states within the folded ensemble of apo-C-CaM present distinct surface topologies that may facilitate binding of diverse targets. Although a direct comparison of target binding to different sub-states of CaM would be required to determine whether induced fit or conformational selection dominates, our simulations indicate that the pre-requisite heterogeneity for the latter mechanism exists. Furthermore, the sub-states’ relative populations predicted by our models can be used to estimate the fold-increase in binding rate required for one mechanism to outcompete the other. Prior studies investigating the process of peptide binding have primarily focused on dynamics of the holo state31,49,50, but our models indicate that the apo ensemble also presents binding interfaces stabilized in crystal structures of CaM-peptide complexes. Furthermore, allosteric communication between the hydrophobic network and the Ca2+-binding sites provides an opportunity for binding of a target protein to tune the affinity of CaM for Ca2+ and vice versa5,12,51.

Hydrophobic interactions are commonly viewed as relatively non-specific compared with salt bridges and hydrogen bond networks, but the hydrophobic binding interfaces presented by the apo- and holo/holo-like structures are sterically and chemically distinct and thus can provide specificity. Given the observed conformational selection, we used our MSMs to identify alternative binding interfaces that are well-populated by apo- and holo-C-CaM that we predict could be readily stabilized by a structurally matched substrate. As expected, we recovered the Phe-lined cavity and groove topologies that have been observed in crystal structures of CaM-target complexes at high frequency5,52,53. In addition, we identified a well-populated topology featuring a distinctive Met-lined cavity in the holo MSM that, to our knowledge, is not represented in the current repertoire of published CaM structures.

A necessary consideration in the face of the observed conformational heterogeneity is the choice of force field. The Amber99sb-ildn force field used in this study has been shown to have complex effects on stability, tending to destabilize helical structures but predicting overly compact non-native conformations on folding pathways because of poor solvation41,54,55. However, these studies also suggest that such systematic errors primarily affect simulations of small peptide systems, while good agreement is found between experiment and the ensembles of larger model systems41,56. Given that our models likewise reproduce experimental structural information, predict substantial heterogeneity in non-helical regions shown previously to be highly flexible, and are similar between different force fields, we do not expect force field choice to affect the conclusions drawn in this study.

Given the distinct kinetic and thermal properties of the two lobes of CaM9,40, it will be of interest to assess the extent to which conformational heterogeneity in N-CaM contributes to full CaM’s binding diversity. NMR experiments indicate that C-CaM is more dynamic than N-CaM9, so it is possible that the ensemble of binding interfaces of the former contributes most to binding diversity, while binding of the latter primarily enhances the affinity of this interaction. Another possibility is that N-CaM features a distinct ensemble of binding interfaces that could, in combination with the ensemble populated by C-CaM, be used as an opportunity to effect functional bifurcation, a phenomenon that has already been reported in CaM’s physiological function57,58. CaM plays a pivotal role in signal transduction pathways that are frequently perturbed in disease states25,26; therefore, a nuanced understanding of its binding modes could facilitate the rational targeting of this signalling hub for pharmacological intervention.

Methods

Simulation details for apo CaM

The starting structures were taken from the crystal structure of apo CaM (PDB: 1CFD (ref. 27)). The all-atom structures were then solvated in a cubic box with box length of 65 Å with TIP3P (ref. 59) water molecules such that water extended at least 10 Å away from the surface of the protein; 24 Na+ ions and 12 Cl− ions were added to the system to neutralize the charge. Covalent bonds involving hydrogen atoms were constrained with the LINear Constraint Solver (LINCS)60 and particle mesh Ewald61 was used to treat long-range electrostatic interactions. The structures obtained after an initial equilibration for 1 ns at constant temperature and pressure, with constraints on the heavy atom positions, were used as the starting conformations for the distributed MD simulations. Production MD simulations were carried out at constant temperature and pressure (298 K and 1 atm, respectively) with a time step of 2 fs. The Amber99sb-ildn (ref. 62) force field was used for protein and ions. Distributed MD simulations were performed using GROMACS63 on the Folding@home64 computing platform. For the first set of simulations, 5,000 simulations were started from the two structures (2,500 each) for a total aggregate simulation time of 200 μs. Adaptive sampling algorithms65 based on an MSM built from the initial round of sampling were used to improve the sampling of the conformational landscape of apo CaM. Specifically, the initial structures for the adaptive sampling rounds (the 500 states with the lowest populations) were chosen from a 2,000-state MSM built using an RMSD metric and the simulation data from the first round. In all, we performed a total of 12,184 simulations with a total duration of 455 μs. Trajectory snapshots were recorded every 100 ps. The distribution of the total number of simulations versus simulation length for apo CaM is shown in Supplementary Fig. 16a.
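
The seed-selection step of adaptive sampling described above can be illustrated with a minimal sketch. It assumes that each simulation frame has already been assigned to a discrete state and, as a simplification, approximates state populations by raw frame counts rather than by MSM equilibrium populations. The function name `pick_adaptive_seeds` and the toy assignments are hypothetical and are not taken from the study.

```python
from collections import Counter


def pick_adaptive_seeds(state_assignments, n_seeds):
    """Return the n_seeds least-populated visited states.

    state_assignments: iterable of integer state labels, one per frame.
    Population is approximated by raw frame counts (an assumption of this
    sketch); unvisited states are ignored because they have no structure
    from which a new simulation could be started.
    """
    counts = Counter(state_assignments)
    # Rank by population, breaking ties by state index for reproducibility.
    ranked = sorted(counts, key=lambda s: (counts[s], s))
    return ranked[:n_seeds]


# Toy example: four visited states with populations 4, 2, 1 and 4 frames.
assignments = [0, 0, 0, 1, 1, 2, 0, 3, 3, 3, 3]
seeds = pick_adaptive_seeds(assignments, n_seeds=2)
print(seeds)
```

In the setting described above, the analogous call would select 500 seeds from a 2,000-state model; a representative conformation from each selected state would then start a new simulation.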

Simulation details for holo-CaM

The starting structure for holo-CaM was taken from PDB: 1CLL. The force field parameters for the Ca2+ ions were taken from the Amber99sb-ildn (ref. 62) force field. The all-atom structures were then solvated in a cubic box with box length of 60 Å with TIP3P water molecules, such that water extended at least 10 Å away from the surface of the protein; 12 Na+ ions and 4 Cl− ions were added to the system to neutralize the charge. A total of 9,995 distributed MD simulations were performed using GROMACS on the Folding@home platform for a total duration of 256 μs. Simulation protocols employed for holo-CaM were similar to the protocols described above for apo CaM. The distribution of the total number of simulations versus simulation length for holo-CaM is shown in Supplementary Fig. 16b. Preservation of Ca2+-binding site geometry was confirmed by calculating the RMSD of the Ca2+-ligating residues between the holo-crystal structure and each conformation weighted by its MSM probability (Supplementary Fig. 17).

MSM construction

The theoretical framework underlying MSMs has been detailed extensively elsewhere22 and is only briefly summarized here. MSMs are an application of discrete-space master equations and describe the kinetic network that underlies a particular partitioning of phase space. The phase space explored by biomolecular systems can be discretized by dimensionality reduction along specific order parameters. In this study, we discretized phase space using tICA29, which is a variant of principal component analysis that defines a kinetically motivated, projection-based distance metric. The tICA method computes the time-lag correlation matrix, whose eigenvectors represent linear combinations of the most slowly decorrelating degrees of freedom in a system. To apply the tICA method to C-CaM, we chose the degrees of freedom to correspond to the minimum distances in the contact map between the heavy atoms of all residue pairs separated by at least three residues. Representing each conformation as a vector of its pairwise residue distances and projecting these vector representations onto tIC components (eigenvectors of the time-lag correlation matrix) mapped the system to a reduced dimensionality space. The distance between conformations |A〉 and |B〉 in this reduced tIC space was then calculated by

$$d_{\mathrm{tIC}}(A, B) = \left\lVert P^{T}|A\rangle - P^{T}|B\rangle \right\rVert_{2}$$

where P^T|X〉 denotes the projection of the vector representation of conformation X onto a d × N matrix P, whose N columns are the slowest tIC components and whose number of rows, d, is determined by the size of the contact map, and ||·||2 is the N-dimensional Euclidean norm. For apo- and holo-C-CaM, the time-lag correlation matrices were calculated with a delta time of 40 ns, and the reduced phase space was computed by projection onto the slowest 20 tICs and clustered into 100 states using the k-centres algorithm.
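
As an illustration of the projected distance and the clustering step, the following is a minimal NumPy sketch rather than the MSMBuilder implementation used in the study. The random feature matrix and the orthonormal matrix `P` are stand-ins (assumptions) for real contact-map features and tIC components, and the function names `tic_distance` and `k_centres` are hypothetical.

```python
import numpy as np


def tic_distance(a, b, P):
    """Euclidean distance between feature vectors a and b after projection
    onto the columns of P (shape d x N)."""
    return float(np.linalg.norm(P.T @ a - P.T @ b))


def k_centres(points, k, first=0):
    """Greedy k-centres: repeatedly add the point farthest from all centres
    chosen so far, then assign every point to its nearest centre."""
    centres = [first]
    dist = np.linalg.norm(points - points[first], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(dist))  # farthest point from the current centres
        centres.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(points - points[nxt], axis=1))
    # Distance of every point to every centre, shape (n_points, k).
    d = np.linalg.norm(points[:, None, :] - points[centres][None, :, :], axis=2)
    return centres, np.argmin(d, axis=1)


rng = np.random.default_rng(0)
features = rng.normal(size=(200, 30))          # stand-in for contact distances
P = np.linalg.qr(rng.normal(size=(30, 4)))[0]  # stand-in for 4 tIC components
projected = features @ P                       # each row is P^T |X>
centres, labels = k_centres(projected, k=5)
```

Clustering the projected coordinates with a plain Euclidean metric is equivalent to clustering the original feature vectors under the projected distance defined above, which is why the sketch projects once and then clusters in the reduced space.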

A kinetic network describing the rates of transitions in this partitioned space was then constructed by computing the transition probability matrix at variable lag-times. The transition probability matrix, T(τ), contains the probability of transitioning from state i to state j in the time interval τ, obtained by counting the number of transitions, nij, observed between time t and t+τ and then normalizing by the total number of transitions out of state i. Detailed balance is enforced by symmetrizing the transition probability matrix using a maximum likelihood estimate. The first eigenvector of the transition probability matrix corresponds to the equilibrium distribution; subsequent eigenvectors correspond to dynamic processes in the system. Their eigenvalues, μ, are related to the implied timescales, 1/k, of these dynamic processes:

$$\frac{1}{k} = -\frac{\tau}{\ln \mu(\tau)}$$

Examining the implied timescales of the slowest processes as a function of lag-time for convergence behaviour identifies a Markovian (memoryless) lag-time. MSMs were constructed at these lag-times (20 and 15 ns for apo- and holo-C-CaM, respectively). Transition Path Theory, which has been outlined elsewhere66, was used to determine the pathways and their associated fluxes between specific states. Conformational landscapes were generated by summing over the microstates of each model using the following equation:

$$P(x, y) = \sum_{i=1}^{N} \pi_{i}\, h_{i}(x, y)$$

where N is the number of microstates, πi is the equilibrium probability of state i and hi(x,y) is the normalized histogram of the order parameters x and y restricted to microstate i. MSMs were built using the MSMBuilder67 software.
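
The estimation steps outlined in this section can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the MSMBuilder procedure: detailed balance is imposed here by simply averaging the count matrix with its transpose instead of using the maximum likelihood estimate, and the toy discrete trajectory and all function names are hypothetical.

```python
import numpy as np


def count_matrix(dtraj, n_states, lag):
    """Count transitions i -> j observed between frames t and t + lag."""
    C = np.zeros((n_states, n_states))
    for i, j in zip(dtraj[:-lag], dtraj[lag:]):
        C[i, j] += 1
    return C


def transition_matrix(C):
    """Row-normalize symmetrized counts (a simple stand-in for the MLE)."""
    C_sym = 0.5 * (C + C.T)
    return C_sym / C_sym.sum(axis=1, keepdims=True)


def stationary_distribution(T):
    """Left eigenvector of T with eigenvalue 1, normalized to sum to 1."""
    w, v = np.linalg.eig(T.T)
    pi = v[:, np.argmax(w.real)].real
    return pi / pi.sum()


def implied_timescales(T, lag):
    """Timescales -lag / ln(mu) for eigenvalues mu in (0, 1), slowest first."""
    mu = np.sort(np.linalg.eigvals(T).real)[::-1][1:]  # drop the eigenvalue 1
    mu = mu[mu > 0]  # non-positive eigenvalues have no defined timescale
    return -lag / np.log(mu)


# Toy three-state discrete trajectory; lag is given in frames.
dtraj = [0, 0, 0, 1, 1, 1, 0, 0, 2, 2, 2, 2, 1, 1, 0, 0, 0, 2, 2, 1]
T = transition_matrix(count_matrix(dtraj, n_states=3, lag=1))
pi = stationary_distribution(T)
timescales = implied_timescales(T, lag=1)
```

Multiplying the returned timescales by the spacing between saved frames converts them to physical time. Repeating the calculation over a range of lag values and checking where the timescales level off corresponds to the convergence test used to choose a Markovian lag-time.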

Generation and analysis of MSM trajectories

Representative trajectories that recapitulate the MSM can be generated by applying a kinetic Monte Carlo sampling scheme to the transition probability matrix19. Given an MSM trajectory in state i at time t, the subsequent state visited at t+τ (a period of one lag-time) was determined by sampling from the multinomial distribution given by Ti, the ith row of the transition probability matrix. For each frame of the resulting trajectory, a random conformation belonging to the relevant state was selected, and from these conformations a time-series of the observables was calculated. For both apo- and holo-C-CaM, the trajectories were started from the most populated state at equilibrium.
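
The kinetic Monte Carlo scheme described above amounts to repeatedly drawing the next state from the row of the transition probability matrix that corresponds to the current state. A minimal standard-library sketch follows; the three-state matrix is invented for illustration and the function name `sample_msm_trajectory` is hypothetical.

```python
import random


def sample_msm_trajectory(T, start, n_steps, rng=None):
    """Generate a state sequence in which each step spans one lag-time.

    T: row-stochastic transition matrix as a list of lists.
    start: index of the initial state.
    rng: optional random.Random instance for reproducible sampling.
    """
    if rng is None:
        rng = random.Random()
    states = list(range(len(T)))
    traj = [start]
    for _ in range(n_steps):
        current = traj[-1]
        # Draw the next state from row `current` of T.
        traj.append(rng.choices(states, weights=T[current])[0])
    return traj


# Illustrative matrix: states 0 and 2 interconvert only via state 1.
T = [
    [0.90, 0.10, 0.00],
    [0.05, 0.90, 0.05],
    [0.00, 0.20, 0.80],
]
traj = sample_msm_trajectory(T, start=0, n_steps=1000, rng=random.Random(42))
```

To obtain a time-series of an observable, each entry of `traj` would then be replaced by a randomly chosen conformation from that state, as described in the text.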

Comparison with the following structures from the Protein Data Bank was made in Figs 3 and 6: 1CKK, 1CM1, 1G4Y, 1IQ5, 1IWQ, 1K90, 1NIW, 1NID, 1XA5, 1YRT, 2BBM, 2F3Y, 2HQW, 2KDU, 2M55, 2MG5, 2MGU, 2W73, 2YGG, 3BXL, 3G43, 3SJQ, 4BYF and 4EHQ. In each case, only residues E82-A147 from a single chain of CaM were analysed, and missing C-terminal residues were modelled in using the Crystallographic Object-Oriented Toolkit (Coot, v 0.7.1) (ref. 68) or Modeller, v 9.14 (ref. 69) where necessary.

Macrostate models of the most populated binding interfaces were built by using hierarchical clustering70 to partition each MSM trajectory into a five-state model based on the RMSD of the following residues that comprise the hydrophobic binding interface: Phe89, Phe92, Ile100, Leu105, Val108, Met109, Leu112, Leu116, Met124, Ala128, Val136, Phe141, Met144 and Met145.

Data availability

Simulation data for apo and holo C-CaM are available at the Stanford Data Repository: https://purl.stanford.edu/zw177zm0384.

Additional information

How to cite this article: Shukla, D. et al. Conformational heterogeneity of the calmodulin binding interface. Nat. Commun. 7:10910 doi: 10.1038/ncomms10910 (2016).