An optimal distance cutoff for contact-based Protein Structure Networks using side-chain centers of mass

Salamanca Viloria, Juan; Allega, Maria Francesca; Lambrughi, Matteo; Papaleo, Elena

doi:10.1038/s41598-017-01498-6

Article
Open access
Published: 06 June 2017

Scientific Reports volume 7, Article number: 2838 (2017)

Abstract

Proteins are highly dynamic entities attaining a myriad of different conformations. Protein side chains change their states during dynamics, causing clashes that are propagated to distal sites. A convenient formalism to analyze protein dynamics is based on network theory using Protein Structure Networks (PSNs). Despite their broad applicability, few efforts have been devoted to benchmarking PSN methods and to providing the community with best practices. In many applications, it is convenient to use the centers of mass of the side chains as nodes. It thus becomes critical to evaluate the minimal distance cutoff between the centers of mass that provides stable network properties. Moreover, when the PSN is derived from a structural ensemble collected by molecular dynamics (MD), the impact of the MD force field has to be evaluated. We selected a dataset of proteins with different folds and sizes and assessed the two fundamental properties of the PSN, i.e. hubs and connected components. We identified an optimal cutoff of 5 Å that is robust to changes in the force field and the proteins. Our study builds solid foundations for the harmonization and standardization of the PSN approach.

Introduction

Proteins are complex and highly dynamic entities attaining a myriad of different conformations in solution^1,2,3,4,5 that are often related to the protein function. Indeed, these conformations can resemble states bound to a biological partner^6,7,8,9,10, active states of enzymes^11,12,13,14, or conformations that are stabilized by a post-translational modification (PTM)^{6, 11} or altered by a disease-related mutation¹⁵.

An interesting property of proteins is that a perturbation (e.g. a binding event, a mutation or a PTM) occurring at a certain site of the structure can be transmitted over long distances to another location^16,17,18,19. This long-range communication is often related to allostery and may affect critical distal sites for protein function.

At the atomic level, the perturbation can be propagated from one protein site to a distal one by a cascade of collisional clashes between residue side chains, which undergo changes in their rotameric states during protein dynamics^{19, 20}. Local rearrangements of the intramolecular contacts during protein dynamics are thus the basis of this long-range communication¹⁹.

A convenient formalism to unravel the complexity behind long-range structural communication in proteins is the application of network theory to protein structure, i.e. the so-called Protein Structure Networks (PSNs). In a PSN, the protein residues become the nodes of the network connected by edges which can, for example, be described as the contact strength between each pair of residues^{20,21,22,23,24,25,26,27,28,29,30}. Networks indeed are proper tools to link the local to global perturbations occurring during protein dynamics since they are by definition mediators of communication from local to global scales¹⁹.

Nowadays, PSN-based strategies are widely used in structural biology, and a plethora of different methodologies has been proposed^{25,26,27,28, 31,32,33,34,35,36,37}. PSN approaches are often integrated with the dynamic description of proteins provided by all-atom molecular dynamics (MD) simulations or other sampling methods^{21, 31, 38,39,40,41,42,43,44,45}.

Despite their broad applicability, few efforts have been devoted so far to the benchmarking of PSN and PSN-MD methods, to define best practices in the field and to ultimately provide the community with clear rules to determine PSN optimal parameters. The definition of arbitrary cutoffs is one of the major weaknesses of contact-based networks applied to protein structure and dynamics^{46, 47}. As previously shown, many options are available to select suitable distance cutoffs for the prediction of residue contacts in protein structures⁴⁷. Alternative solutions exist, i.e. using different principles for edge and weight definition such as energies or correlated motions. Nevertheless, a contact-based approach is still valuable especially if we consider the major advances that techniques such as atomistic biomolecular simulations have achieved in the last decade^{48, 49}. Indeed, MD simulations have now reached high accuracy in describing conformational changes even at the side-chain levels and occurring on different time scales, as attested by the agreement with experimental observables^{4, 50,51,52,53}.

In many PSN-MD applications, it is convenient to use the centers of mass of residue side chains as PSN nodes, the distance between the centers of mass for edge definition and their occurrence as weight^{20, 31, 41, 54, 55}. It becomes, thus, critical to evaluate the minimal distance cutoff between the centers of mass of two residues to include an edge in the PSN. Moreover, when the PSN is derived from a structural ensemble collected by MD simulations and not from experimental structures, it is mandatory to evaluate the impact of the physical model (i.e. force field) on the PSN parameters.

We selected a dataset of proteins with different architectures and sizes and assessed the distribution of the two fundamental properties of a PSN, i.e. the hubs and the connected components. We also evaluated the influence of the force field selection on the PSN parameters, and we propose an optimal distance cutoff for PSNs based on distances between the centers of mass of protein residues. The cutoff identified here is robust regardless of the protein size, the fold, and the MD force field employed. Our study builds strong foundations toward the harmonization and standardization of PSN strategies, and it provides a framework that can be applied more generally to the choice of parameters for other PSN-based approaches.

Results and Discussion

Selected protein structures for PSN-MD analyses

We selected four different three-dimensional (3D) structures of monomeric proteins of various sizes and folds (Fig. 1) and four different force fields (Table S1). In particular, we chose state-of-the-art physical models from each of the most used force-field families for MD simulations of proteins, i.e. CHARMM (CHARMM22*⁵⁶ and CHARMM36⁵⁷), AMBER (Amber99SB*-ILDN^{58, 59}) and GROMOS (GROMOS54a7⁶⁰). We carried out the MD simulations in explicit solvent for one μs so that they reflect the MD sampling that is employed for PSN-MD studies^{40, 54}. For each MD ensemble, PSNs based on distances between the side-chain centers of mass were calculated as detailed in the Materials and Methods.

A distance cutoff of 5 Å allows a robust description of PSN properties independently of the protein and the MD force field employed

The choice of the distance cutoff is essential for the PSN definition. Indeed, the distance cutoff determines whether a contact between two side chains is included as a link of the network, ultimately affecting the network topology. When the distance is calculated between the centers of mass of the residue side chains, the choice of the cutoff becomes even more critical. Indeed, we cannot arbitrarily assume that the distances commonly used in structural biology to define an interaction between two amino acids, such as 4 or 4.5 Å, are valid. The issue becomes even more pressing when a PSN is derived from an MD ensemble, since each force field relies on different atomic masses.

The two most important properties of a PSN, which ultimately dictate how distant regions of the PSN are linked, are the so-called hubs and connected components (also known as clusters of nodes) (Fig. 2).

Hubs are nodes that have a high degree of connectivity in a network. The highest degree of residue hubs is limited by steric constraints and it could vary from three to ten in PSN²⁷. Protein structures are known to be made up of a significant number of strongly and weakly interacting residue hubs that stabilize the tertiary structure of the protein and provide resilience against random mutations^{19, 27}.

A robust PSN should feature a certain number of hub residues that have a node degree of at least three (i.e. connected with three or more other nodes by an edge in the PSN), and it should be composed of multiple connected components which are not too fragmented. Cluster fragmentation is particularly critical in the PSN definition. We and others have shown that central parameters that influence the size of the connected components are the p_crit^{31, 42} or I_crit^{40, 61, 62}, depending on the method used for PSN construction. Indeed, edges that have extremely low weights would increase the noise and connect all the clusters into a single one. Conversely, if only high weights are retained, only sparsely populated and highly fragmented clusters will be observed, with a minimal number of communication paths between distal regions.
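As a minimal sketch of the hub criterion described above (a node degree of at least three), the snippet below counts the edges of each node in a toy adjacency matrix. The function names `node_degrees` and `find_hubs` and the five-node example graph are illustrative assumptions; they are not part of PyInteraph.

```python
import numpy as np

def node_degrees(adjacency):
    """Number of edges per node; any non-zero entry counts as an edge."""
    adjacency = np.asarray(adjacency)
    return (adjacency > 0).sum(axis=1)

def find_hubs(adjacency, min_degree=3):
    """Map node index -> degree for nodes with degree >= min_degree."""
    degrees = node_degrees(adjacency)
    return {int(i): int(d) for i, d in enumerate(degrees) if d >= min_degree}

# Hypothetical toy network: node 0 contacts nodes 1, 2 and 3; node 3 contacts node 4.
toy = np.zeros((5, 5), dtype=int)
for i, j in [(0, 1), (0, 2), (0, 3), (3, 4)]:
    toy[i, j] = toy[j, i] = 1  # undirected edge

print(find_hubs(toy))  # {0: 3}
```

Only node 0 reaches the degree threshold here; node 3, with two edges, does not qualify as a hub.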

In a PSN approach based on side-chain-side-chain contacts, the distance cutoffs used can affect the network in a similar way. Indeed, if a distance that is too short and restrictive is chosen, the network will appear as very fragmented with small separated clusters and few or virtually no hubs. If the distance is too long, each residue of the network will be connected, resulting in a single cluster that embraces the entire network. It is thus critical to find an optimal distance cutoff.
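The effect of the cutoff on fragmentation can be illustrated with a small sketch. It assumes a single static set of side-chain centers of mass (four hypothetical points on a line, in Å) and treats a pair as connected when its distance is at most the cutoff; both the data and the helper names `connected_components` and `scan_cutoffs` are invented for illustration.

```python
import numpy as np

def connected_components(adjacency):
    """Return the connected components (lists of node indices), largest first."""
    adjacency = np.asarray(adjacency) > 0
    n = adjacency.shape[0]
    seen = np.zeros(n, dtype=bool)
    components = []
    for start in range(n):
        if seen[start]:
            continue
        stack, members = [start], []
        seen[start] = True
        while stack:  # depth-first traversal from the start node
            node = stack.pop()
            members.append(node)
            for neighbor in np.flatnonzero(adjacency[node]):
                if not seen[neighbor]:
                    seen[neighbor] = True
                    stack.append(int(neighbor))
        components.append(sorted(members))
    return sorted(components, key=len, reverse=True)

def scan_cutoffs(coms, cutoffs):
    """Component sizes obtained at each distance cutoff (same units as coms)."""
    coms = np.asarray(coms, dtype=float)
    dist = np.linalg.norm(coms[:, None, :] - coms[None, :, :], axis=-1)
    summary = {}
    for cutoff in cutoffs:
        adjacency = dist <= cutoff
        np.fill_diagonal(adjacency, False)  # no self-edges
        summary[cutoff] = [len(c) for c in connected_components(adjacency)]
    return summary

# Hypothetical centers of mass with neighbor spacings of 4.5, 4.5 and 5.5 Å.
toy_coms = [[0.0, 0.0, 0.0], [4.5, 0.0, 0.0], [9.0, 0.0, 0.0], [14.5, 0.0, 0.0]]
print(scan_cutoffs(toy_coms, [4.0, 5.0, 6.0]))
# {4.0: [1, 1, 1, 1], 5.0: [3, 1], 6.0: [4]}
```

In this toy case the shortest cutoff leaves every node isolated, while the longest merges everything into one cluster, mirroring the two extremes described above.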

Moreover, since PSN-MD approaches such as the one employed here generally rely on extracting an average, static PSN from an MD trajectory, it becomes fundamental to assess the convergence of hubs and connected components over the simulation time.

We thus evaluated: (i) the convergence of hubs and connected components in PSNs derived from MD simulations using a Jackknife approach (see Materials and Methods) and (ii) the distribution of hubs and connected components at different distance cutoffs (Figs 3 and 4, Fig. S1). In addition, to harmonize the PSN protocol and make the analyses reproducible, we implemented a Python-based pipeline (PyInKnife.py) that automates the steps described above and is available free of charge (see Materials and Methods for details).

First, we evaluated whether hubs and connected components are stable properties in the MD ensembles collected here (Figs 3 and 4, Fig. S1). With regard to the distance cutoff, we identified common trends in the distribution of hubs and connected components, independently of the protein under investigation and the force field employed in the simulations. Indeed, in all cases, distance cutoffs lower than 5 Å resulted in a minimal number of hubs (fewer than four hub residues), with connection degrees smaller than three (Fig. 3). Conversely, distance cutoffs higher than 5 Å yielded only one large cluster accounting for most of the protein residues (Fig. 4). This indicates that 5 Å is the most appropriate cutoff for a PSN-MD in which the contacts are calculated as distances between the centers of mass of residue side chains.

Localization of hubs and connected components on the 3D structure is conserved using the 5 Å distance cutoff

The 5 Å distance cutoff yields similar general features for the PSN of the same protein described by different force fields (Figs 3 and 4). Although this result is encouraging, we need to take into consideration that PSNs are employed to achieve residue-level details in structural biology. PSNs are used to identify the localization of the hub residues, the specific residues that belong to the same cluster, or even the paths of communication between distal residues and their intermediate nodes. These are all important PSN properties that can, for example, be altered by interactions with biological partners^{6, 40, 63} or mutations^{21, 40, 42, 51, 64}. It is thus not enough to observe that the PSN description is robust regarding the overall distribution of hubs and connected components. Indeed, PSNs calculated for the same protein with the 5 Å distance cutoff, but using different MD force fields, might differ in the localization of hub residues in the 3D structure or in the individual residues that belong to the same cluster, without any change in the total number of hubs and connected components. The same consideration holds for the localization of hubs and connected components when the entire MD trajectory is compared to the resampled MD trajectories collected from the Jackknife approach.

We thus compared the hubs and connected components at the residue level as derived from the PSN analyses of the entire MD trajectories or of the resampled MD trajectories obtained with the Jackknife procedure (see Materials and Methods). The analyses showed a reasonable convergence of hubs and connected components also at the residue level, with only minor discrepancies between the PSN calculated from the entire MD trajectory and a few of the resampled trajectories (Figs S2 and S3).
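One simple way to express such a residue-level comparison is sketched below: for every hub found in the full trajectory, it reports the fraction of resampled trajectories in which the same residue is again a hub. This recovery score, the function name `hub_recovery` and the residue labels are illustrative assumptions, not the metric or data used in the study.

```python
def hub_recovery(full_hubs, resampled_hubs):
    """Fraction of resampled hub sets that contain each hub of the full trajectory."""
    n = len(resampled_hubs)
    return {res: sum(res in hubs for hubs in resampled_hubs) / n
            for res in sorted(full_hubs)}

# Hypothetical residue labels; real hub lists would come from the PSN analysis.
full = {"F8", "L24", "W121"}
resampled = [
    {"F8", "L24", "W121"},
    {"F8", "L24"},
    {"F8", "W121", "I57"},
    {"F8", "L24", "W121"},
]
print(hub_recovery(full, resampled))  # {'F8': 1.0, 'L24': 0.75, 'W121': 0.75}
```

Residues that appear as hubs only in a resampled set (such as the hypothetical I57) are ignored by this score and would need a separate check.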

Moreover, we analyzed the localization of the hubs and their degree in the MD simulations of CypA carried out with different force fields (Figs 5 and 6A). We noticed that the hubs are similarly distributed on the 3D structure across the different force fields, apart from minor changes in their node degree. Similar results were obtained for Trx using the CHARMM22* and GROMOS54a7 force fields.

In parallel, we also mapped the five most populated connected components onto the CypA sequence and 3D structure (Figs 6B and 7). The composition and distribution of the clusters are different only in the CHARMM36 simulations. This apparent difference is only due to a splitting of connected component number 1 into three smaller clusters, as well as to a different localization of the fifth cluster (i.e. the smallest one). Only subtle differences have been observed for Amber99SB*-ILDN and CHARMM22*, suggesting a robust description of the connected components with these two force fields, as also found in a recent PSN study of a dimer⁵⁴.

Conclusions

In the protein world, a perturbation occurring at a certain site of the protein structure can be transmitted over long distances to another site. These structural rearrangements can be propagated by a cascade of changes in the conformational states of the residue side chains. Local changes occurring in the residue-residue contacts during the protein dynamics are thus the basis of this long-range communication. Network theory is a suitable formalism for analyzing protein structures and identifying the paths of residues that can transmit the structural changes over long distances. In this context, a plethora of different approaches to define a PSN has been developed, often integrated with molecular dynamics simulations to account for the protein dynamics.

Despite the broad application of these methods, the community is missing clear rules and a solid framework to define the PSN parameters. It thus becomes critical to evaluate the minimal distance cutoff that can be used to include an edge in the PSN and that provides stable network properties, as well as the influence of the physical model used to describe the protein in the simulations.

Indeed, there are no consolidated and uniform protocols in the PSN-MD field, especially when the edges are defined according to the distance between the centers of mass of protein side chains. Moreover, most of the PSN approaches have been optimized using datasets of static experimental structures from the Protein Data Bank. A careful evaluation of the PSN parameters in an MD ensemble of structures has rarely been carried out. PSN parameters that are optimal for the network analyses of experimental crystallographic structures are not necessarily suitable for the analysis of an MD ensemble, as recently pointed out⁴⁰. Most of the publications in which a PSN was calculated using the PyInteraph suite of tools, for example, employ very different distance cutoffs.

We thus selected a dataset of proteins to use as model systems to assess important PSN properties as a function of different distance cutoffs and physical models. In particular, we focused on two fundamental properties of the PSN, i.e. the hubs and the connected components. We identified an optimal value for the distance cutoff (5 Å) that is robust to changes in the MD force field and applicable to proteins with different sizes or folds. Our study provides a general framework to select PSN parameters and to improve reproducibility of the results thanks to a free-of-charge Python-based pipeline, PyInKnife. We here built the foundations toward the harmonization and standardization of the PSN-MD approach.

Materials and Methods

Molecular dynamics simulations

We performed explicit solvent MD simulations using the GROMACS software version 4.6⁶⁶ with different force fields and solvent models. A summary of the starting structures, protein size, force fields and solvent models used in this study is reported in Table S1. The MD simulation of the Dri ARID domain has been published before⁴⁰ and was employed here for the analyses. 500-ns simulations of CypA have been published before⁵¹, and we extended them here to achieve one μs of sampling. We collected the remaining simulations for the first time in this study at 300 K and 1 bar in the NVT ensemble with 150 mM of NaCl. We employed periodic boundary conditions and we set a distance equal to or greater than 1.8 Å between the protein atoms and the box edges of a dodecahedral box of water molecules. Preparation steps were carried out according to a protocol recently applied to other proteins⁶⁷. We applied a 2-fs time step and the LINCS algorithm⁶⁸, as well as the Particle-Mesh Ewald (PME) summation scheme⁶⁹ to treat long-range electrostatic interactions. Van der Waals and short-range Coulomb interactions were truncated at 9 Å and conformations were stored every 10 ps. We carried out production MD simulations for one μs.

We calculated the minimal distance between each protein and its image to rule out artifacts due to periodic boundary conditions and artificial contacts between the protein and the corresponding image.

PSN definition

We used the PyInteraph suite of tools³¹ to construct a PSN-MD based on side-chain contacts using all the residues except for glycines. The contacts are defined as distances between the centers of mass of side chains on the basis of the atomic mass files provided by PyInteraph. Different distance cutoffs in the range of 4–6 Å have been assessed in this study (see below) to include a certain contact as an edge of the network. Moreover, to derive a weighted network, the persistence of the contact in each MD ensemble was measured and a p_crit of 20% was employed to filter out meaningless interactions and to maintain the network structure, in agreement with previous applications of the same method^{31, 42, 70}. We also used the xPyder plugin⁶⁵ for PyMOL to map the PSN connected components onto the 3D structure.
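The construction described above can be sketched in a few lines of NumPy. This is not the PyInteraph implementation: the function names, the use of `<=` for the distance test and `>=` for the p_crit filter, and the toy trajectory are assumptions made for illustration only.

```python
import numpy as np

def center_of_mass(coords, masses):
    """Mass-weighted mean position of one side chain (coords: n_atoms x 3)."""
    coords = np.asarray(coords, dtype=float)
    masses = np.asarray(masses, dtype=float)
    return (coords * masses[:, None]).sum(axis=0) / masses.sum()

def contact_persistence(com_traj, cutoff=5.0):
    """Fraction of frames in which each residue pair lies within the cutoff.

    com_traj has shape (n_frames, n_residues, 3), in the same units as cutoff.
    """
    com_traj = np.asarray(com_traj, dtype=float)
    diff = com_traj[:, :, None, :] - com_traj[:, None, :, :]
    dist = np.linalg.norm(diff, axis=-1)   # (n_frames, n_res, n_res)
    persistence = (dist <= cutoff).mean(axis=0)
    np.fill_diagonal(persistence, 0.0)     # no self-edges
    return persistence

def filter_by_pcrit(persistence, p_crit=0.20):
    """Keep edge weights that reach p_crit (assumed inclusive); zero the rest."""
    return np.where(persistence >= p_crit, persistence, 0.0)

# Hypothetical 10-frame trajectory of three side-chain centers of mass (Å).
traj = np.zeros((10, 3, 3))
traj[:8, 1, 0] = 4.0     # residue 1 is 4 Å from residue 0 in frames 0-7
traj[8:, 1, 0] = 10.0    # and 10 Å away in frames 8-9
traj[:9, 2, 0] = -20.0   # residue 2 is far away in frames 0-8
traj[9, 2, 0] = -3.0     # and within the cutoff of residue 0 only in frame 9

weights = filter_by_pcrit(contact_persistence(traj, cutoff=5.0))
print(weights)
```

In this toy case the 0-1 contact persists in 80% of the frames and is kept as a weighted edge, whereas the 0-2 contact (10% persistence) falls below the 20% threshold and is removed.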

The PyInKnife pipeline

We developed a Python-based pipeline called PyInKnife (available free of charge at https://github.com/ELELAB/PyInKnife) in order to: (i) automate the pre-processing of the trajectories for PSN analyses, (ii) split the trajectories into shorter trajectory files that retain 90% of the frames (see below), (iii) run the different steps of PyInteraph on each trajectory subset using different distance cutoffs, including the creation of the PSN and the calculation of hubs and connected components and their distribution, and (iv) generate a final report with publication-ready plots and figures. The pipeline is illustrated in Fig. 8.

PyInKnife requires the pre-processing of the MD trajectory to remove artefacts due to the periodic boundary conditions and to extract a reference structure along with the topology required for the PSN calculations. The pre-processing is carried out by three different GROMACS tools (www.Gromacs.org): make_ndx, trjconv and editconf. These tools allow us to generate the index file, convert and manipulate the trajectories and structures, respectively.

PyInKnife can also be used on trajectories obtained with other simulation packages, such as Amber, CHARMM and NAMD, after conversion of the MD trajectory to the GROMACS format (.xtc or .trr file). This can be achieved with several tools such as WORDOM⁷¹, the MDAnalysis package⁷² and the Catdcd plugin (http://www.ks.uiuc.edu/Development/MDTools/catdcd/). The user can employ the GROMACS tool editconf to convert the PDB file of the starting structure, or one frame extracted from the trajectory, into the file format required by PyInteraph (GROMACS .gro file).

PyInKnife automates the analyses of contact-based PSNs, hydrophobic interactions, and hydrogen bond networks implemented in PyInteraph. The user can specify from the command line the PyInteraph atomic mass databases, the distance cutoff values to be tested and other PSN parameters.

After the PSN for each MD trajectory is obtained, it is possible to calculate with PyInKnife the hubs and connected components for each class of interactions by using the graph_analysis tool of the PyInteraph suite.

PyInKnife also implements a pipeline to evaluate the convergence of the two most important PSN properties, i.e. hubs and connected components, in the MD trajectory. We used the Jackknife resampling method⁷³ to calculate the deviation from resampled trajectories in which 10% of the simulation frames have been discarded at regular intervals. The resampled trajectories are calculated using the GROMACS tool trjcat. The procedure is illustrated in Fig. 9.
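The resampling step can be sketched as follows. The snippet assumes one plausible reading of the procedure, namely that the trajectory is divided into ten consecutive blocks and that each resampled trajectory omits exactly one block; it only produces frame indices and does not call trjcat. The function name `jackknife_frame_subsets` is hypothetical.

```python
import numpy as np

def jackknife_frame_subsets(n_frames, n_blocks=10):
    """Frame indices of each resampled trajectory, leaving out one block at a time."""
    blocks = np.array_split(np.arange(n_frames), n_blocks)
    return [np.concatenate(blocks[:i] + blocks[i + 1:]) for i in range(n_blocks)]

subsets = jackknife_frame_subsets(n_frames=100, n_blocks=10)
print(len(subsets), len(subsets[0]))  # 10 90
print(subsets[0][:3])                 # [10 11 12]
```

Each subset retains 90% of the frames, so a property such as the number of hubs can be recomputed ten times and compared with the value from the full trajectory.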

PyInKnife also includes R-based scripts to plot the results and produce publication-ready figures. To use the plotting R scripts, the R packages ggplot, ggplot2 and lattice are required.

The Jackknife standard error is calculated as

$$SE{(\hat{\theta })}_{jack}={\left\{\frac{n-1}{n}\sum _{i=1}^{n}{\left({\hat{\theta }}_{(i)}-{\hat{\theta }}_{(.)}\right)}^{2}\right\}}^{1/2}$$

where n is the number of resampled trajectories (10 by default), ${\hat{\theta }}_{(i)}$ is the estimator calculated on the ith resampled trajectory and ${\hat{\theta }}_{(.)}$ is the empirical average of the estimator over the resampled trajectories

$${\hat{\theta }}_{(.)}=\frac{1}{n}\sum _{i=1}^{n}{\hat{\theta }}_{(i)}$$
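As a worked instance of the two formulas above, the following sketch computes the Jackknife standard error from a list of per-resample estimates. The function name and the hub counts are hypothetical and serve only to illustrate the arithmetic.

```python
import numpy as np

def jackknife_standard_error(estimates):
    """Jackknife standard error of an estimator from its n resampled values."""
    theta = np.asarray(estimates, dtype=float)
    n = theta.size
    theta_dot = theta.mean()  # empirical average over the resampled trajectories
    return float(np.sqrt((n - 1) / n * np.sum((theta - theta_dot) ** 2)))

# Hypothetical number of hubs found in each of 10 resampled trajectories.
hub_counts = [12, 12, 13, 12, 11, 12, 12, 13, 12, 11]
print(round(jackknife_standard_error(hub_counts), 3))  # 1.897
```

Here the mean is 12, the squared deviations sum to 4, and the standard error is the square root of 0.9 × 4 = 3.6.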

References

Karplus, M. & Kuriyan, J. Molecular dynamics and protein function. Proc. Natl. Acad. Sci. USA 102, 6679–85, doi:10.1073/pnas.0408930102 (2005).
Klepeis, J. L., Lindorff-Larsen, K., Dror, R. O. & Shaw, D. E. Long-timescale molecular dynamics simulations of protein structure and function. Curr. Opin. Struct. Biol. 19, 120–7, doi:10.1016/j.sbi.2009.03.004 (2009).
Grant, B. J., Gorfe, A. A. & McCammon, J. A. Large conformational changes in proteins: Signaling and other functions. Curr. Opin. Struct. Biol. 20, 142–147, doi:10.1016/j.sbi.2009.12.004 (2010).
Lindorff-Larsen, K. et al. Systematic validation of protein force fields against experimental data. PLoS One 7, e32131, doi:10.1371/journal.pone.0032131 (2012).
Kovermann, M., Rogne, P. & Wolf-Watz, M. Protein dynamics and function from solution state NMR spectroscopy. Q. Rev. Biophys. 49, e6, doi:10.1017/S0033583516000019 (2016).
Lambrughi, M. et al. DNA-binding protects p53 from interactions with cofactors involved in transcription-independent functions. Nucleic Acids Res. 44, 9096–9109, doi:10.1093/nar/gkw770 (2016).
Boehr, D. D., Nussinov, R. & Wright, P. E. The role of dynamic conformational ensembles in biomolecular recognition. Nat. Chem. Biol. 5, 789–96, doi:10.1038/nchembio.232 (2009).
Baldwin, A. J. & Kay, L. E. NMR spectroscopy brings invisible protein states into focus. Nat. Chem. Biol. 5, 808–814, doi:10.1038/nchembio.238 (2009).
Masterson, L. R. et al. Dynamics connect substrate recognition to catalysis in protein kinase A. Nat. Chem. Biol. 6, 821–8, doi:10.1038/nchembio.452 (2010).
Sumbul, F., Acuner-Ozbabacan, S. E. & Haliloglu, T. Allosteric Dynamic Control of Binding. Biophys. J. 109, 1190–1201, doi:10.1016/j.bpj.2015.08.011 (2015).
Papaleo, E. et al. An Acidic Loop and Cognate Phosphorylation Sites Define a Molecular Switch That Modulates Ubiquitin Charging Activity in Cdc34-Like Enzymes. PLoS Comput. Biol. 7 (2011).
Papaleo, E. et al. Loop 7 of E2 Enzymes: An Ancestral Conserved Functional Motif Involved in the E2-Mediated Steps of the Ubiquitination Cascade. PLoS One 7 (2012).
Campbell, E. et al. Changes in protein dynamics optimize the active site during evolution of new enzyme function. Nat. Chem. Biol. 12, 944–950, doi:10.1038/nchembio.2175 (2016).
Ma, B. & Nussinov, R. Enzyme dynamics point to stepwise conformational selection in catalysis. Curr. Opin. Chem. Biol. 14, 652–9, doi:10.1016/j.cbpa.2010.08.012 (2010).
Demir, Ö. et al. Ensemble-based computational approach discriminates functional activity of p53 cancer and rescue mutants. PLoS Comput. Biol. 7 (2011).
Guo, J. & Zhou, H. X. Protein Allostery and Conformational Dynamics. Chem. Rev. 116, 6503–6515, doi:10.1021/acs.chemrev.5b00590 (2016).
Ribeiro, A. A. S. T. & Ortiz, V. A Chemical Perspective on Allostery. Chem. Rev. 116, 6488–6502, doi:10.1021/acs.chemrev.5b00543 (2016).
Papaleo, E. et al. The Role of Protein Loops and Linkers in Conformational Dynamics and Allostery. Chem. Rev. 116, 6391–6423, doi:10.1021/acs.chemrev.5b00623 (2016).
Vuillon, L. & Lesieur, C. From local to global changes in proteins: a network view. Curr. Opin. Struct. Biol. 31, 1–8, doi:10.1016/j.sbi.2015.02.015 (2015).
Papaleo, E. Integrating atomistic molecular dynamics simulations, experiments, and network analysis to study protein dynamics: strength in unity. Front. Mol. Biosci. 2, 28, doi:10.3389/fmolb.2015.00028 (2015).
Angelova, K. et al. Conserved amino acids participate in the structure networks deputed to intramolecular communication in the lutropin receptor. Cell. Mol. Life Sci. 68, 1227–39, doi:10.1007/s00018-010-0519-z (2011).
Di Paola, L., De Ruvo, M., Paci, P., Santoni, D. & Giuliani, A. Protein Contact Networks: An Emerging Paradigm in Chemistry. Chem. Rev. 113, 1598–1613, doi:10.1021/cr3002356 (2013).
Di Paola, L. & Giuliani, A. Protein contact network topology: a natural language for allostery. Curr. Opin. Struct. Biol. 31, 43–8, doi:10.1016/j.sbi.2015.03.001 (2015).
Cheng, S., Fu, H. & Cui, D.-X. Characteristics Analyses and Comparisons of the Protein Structure Networks Constructed by Different Methods. Interdiscip. Sci Comput Life Sci 8, 65–74, doi:10.1007/s12539-015-0106-y (2016).
O’Rourke, K. F., Gorman, S. D. & Boehr, D. D. Biophysical and computational methods to analyze amino acid interaction networks in proteins. Comput. Struct. Biotechnol. J. 14, 245–251, doi:10.1016/j.csbj.2016.06.002 (2016).
Feher, V. A., Durrant, J. D., Van Wart, A. T. & Amaro, R. E. Computational approaches to mapping allosteric pathways. Curr. Opin. Struct. Biol. 25, 98–103, doi:10.1016/j.sbi.2014.02.004 (2014).
Bhattacharyya, M., Ghosh, S. & Vishveshwara, S. Protein Structure and Function: Looking through the Network of Side-Chain Interactions. Curr. Protein Pept. Sci. 17, 4–25, doi:10.2174/1389203716666150923105727 (2016).
van den Bedem, H., Bhabha, G., Yang, K., Wright, P. E. & Fraser, J. S. Automated identification of functional dynamic contact networks from X-ray crystallography. Nat. Methods 10, 896–902, doi:10.1038/nmeth.2592 (2013).
Csermely, P., Nussinov, R. & Szilágyi, A. From allosteric drugs to allo-network drugs: state of the art and trends of design, synthesis and computational methods. Curr. Top. Med. Chem. 13, 2–4, doi:10.2174/1568026611313010002 (2013).
Csermely, P., Korcsmáros, T., Kiss, H. J. M., London, G. & Nussinov, R. Structure and dynamics of molecular networks: a novel paradigm of drug discovery: a comprehensive review. Pharmacol. Ther. 138, 333–408, doi:10.1016/j.pharmthera.2013.01.016 (2013).
Tiberti, M. et al. PyInteraph: A Framework for the Analysis of Interaction Networks in Structural Ensembles of Proteins. J Chem Inf Model 54, 1537–1551, doi:10.1021/ci400639r (2014).
Van Wart, A. T., Durrant, J., Votapka, L. & Amaro, R. E. Weighted implementation of suboptimal paths (WISP): An optimized algorithm and tool for dynamical network analysis. J. Chem. Theory Comput. 10, 511–517, doi:10.1021/ct4008603 (2014).
Chakrabarty, B. & Parekh, N. NAPS: Network analysis of protein structures. Nucleic Acids Res. 44, W375–W382, doi:10.1093/nar/gkw383 (2016).
Seeber, M., Felline, A., Raimondi, F., Mariani, S. & Fanelli, F. WebPSN: A web server for high-throughput investigation of structural communication in biomacromolecules. Bioinformatics 31, 779–781, doi:10.1093/bioinformatics/btu718 (2015).
Stolzenberg, S., Michino, M., Levine, M. V., Weinstein, H. & Shi, L. Computational approaches to detect allosteric pathways in transmembrane molecular machines. Biochim. Biophys. Acta - Biomembr. 1858, 1652–1662, doi:10.1016/j.bbamem.2016.01.010 (2016).
Nepomnyachiy, S., Ben-Tal, N. & Kolodny, R. CyToStruct: Augmenting the network visualization of Cytoscape with the power of molecular viewers. Structure 23, 941–948, doi:10.1016/j.str.2015.02.013 (2015).
Niknam, N., Khakzad, H., Arab, S. S. & Naderi-Manesh, H. PDB2Graph: A toolbox for identifying critical amino acids map in proteins based on graph theory. Comput. Biol. Med. 72, 151–159, doi:10.1016/j.compbiomed.2016.03.012 (2016).
Ghosh, A. & Vishveshwara, S. A study of communication pathways in methionyl-tRNA synthetase by molecular dynamics simulations and structure network analysis. Proc. Natl. Acad. Sci. USA 104, 15711–6, doi:10.1073/pnas.0704459104 (2007).
Karami, Y., Laine, E. & Carbone, A. Dissecting protein architecture with communication blocks and communicating segment pairs. BMC Bioinformatics 17, 13, doi:10.1186/s12859-015-0855-y (2016).
Invernizzi, G., Tiberti, M., Lambrughi, M., Lindorff-Larsen, K. & Papaleo, E. Communication Routes in ARID Domains between Distal Residues in Helix 5 and the DNA-Binding Loops. PLoS Comput. Biol. 10, e1003744, doi:10.1371/journal.pcbi.1003744 (2014).
Marino, V. & Dell’Orco, D. Allosteric communication pathways routed by Ca2+/Mg2+ exchange in GCAP1 selectively switch target regulation modes. Sci. Rep. 6, 34277, doi:10.1038/srep34277 (2016).
Papaleo, E., Renzetti, G. & Tiberti, M. Mechanisms of intramolecular communication in a hyperthermophilic acylaminoacyl peptidase: a molecular dynamics investigation. PLoS One 7, e35686, doi:10.1371/journal.pone.0035686 (2012).
Skjærven, L., Yao, X.-Q., Scarabelli, G. & Grant, B. J. Integrating protein structural dynamics and evolutionary analysis with Bio3D. BMC Bioinformatics 15, 399, doi:10.1186/s12859-014-0399-6 (2014).
Ribeiro, A. A. S. T. & Ortiz, V. MDN: A Web Portal for Network Analysis of Molecular Dynamics Simulations. Biophys. J. 109, 1110–1116, doi:10.1016/j.bpj.2015.06.013 (2015).
Ribeiro, A. A. S. T. & Ortiz, V. Energy propagation and network energetic coupling in proteins. J. Phys. Chem. A 119, 1835–1846, doi:10.1021/jp509906m (2015).
Ribeiro, A. A. S. T. & Ortiz, V. Determination of signaling pathways in proteins through network theory: Importance of the topology. J. Chem. Theory Comput. 10, 1762–1769, doi:10.1021/ct400977r (2014).
Da Silveira, C. H. et al. Protein cutoff scanning: A comparative analysis of cutoff dependent and cutoff free methods for prospecting contacts in proteins. Proteins Struct. Funct. Bioinforma. 74, 727–743, doi:10.1002/prot.v74:3 (2009).
Hertig, S., Latorraca, N. R. & Dror, R. O. Revealing Atomic-Level Mechanisms of Protein Allostery with Molecular Dynamics Simulations. PLoS Comput. Biol. 12, 1–16, doi:10.1371/journal.pcbi.1004746 (2016).
Dror, R. O., Dirks, R. M., Grossman, J. P., Xu, H. & Shaw, D. E. Biomolecular simulation: a computational microscope for molecular biology. Annu. Rev. Biophys. 41, 429–52, doi:10.1146/annurev-biophys-042910-155245 (2012).
Martín-García, F., Papaleo, E., Gomez-Puertas, P., Boomsma, W. & Lindorff-Larsen, K. Comparing Molecular Dynamics Force Fields in the Essential Subspace. PLoS One 10, e0121114, doi:10.1371/journal.pone.0121114 (2015).
Papaleo, E., Sutto, L., Gervasio, F. L. & Lindorff-Larsen, K. Conformational Changes and Free Energies in a Proline Isomerase. J. Chem. Theory Comput. 10, 4169–4174, doi:10.1021/ct500536r (2014).
Wang, Y., Papaleo, E. & Lindorff-Larsen, K. Mapping transiently formed and sparsely populated conformations on a complex energy landscape. Elife 5, e17505, doi:10.7554/eLife.17505 (2016).
Lindorff-Larsen, K., Maragakis, P., Piana, S. & Shaw, D. E. Picosecond to Millisecond Structural Dynamics in Human Ubiquitin. J. Phys. Chem. B acs.jpcb.6b02024, doi:10.1021/acs.jpcb.6b02024 (2016).
Nygaard, M. et al. The mutational landscape of the oncogenic MZF1 SCAN domain in cancer. Front. Mol. Biosci. doi:10.3389/fmolb.2016.00078 (2016).
Marino, V., Scholten, A., Koch, K. W. & Dell’Orco, D. Two retinal dystrophy-associated missense mutations in GUCA1A with distinct molecular properties result in a similar aberrant regulation of the retinal guanylate cyclase. Hum. Mol. Genet. 24, 6653–6666, doi:10.1093/hmg/ddv370 (2015).
Piana, S., Lindorff-Larsen, K. & Shaw, D. E. How robust are protein folding simulations with respect to force field parameterization? Biophys. J. 100, L47–9, doi:10.1016/j.bpj.2011.03.051 (2011).
Best, R. B. et al. Optimization of the Additive CHARMM All-Atom Protein Force Field Targeting Improved Sampling of the Backbone ϕ, ψ and Side-Chain χ1 and χ2 Dihedral Angles. J. Chem. Theory Comput. 8, 3257–3273, doi:10.1021/ct300400x (2012).
Best, R. B. & Hummer, G. Optimized molecular dynamics force fields applied to the helix-coil transition of polypeptides. J. Phys. Chem. B 113, 9004–15, doi:10.1021/jp901540t (2009).
Lindorff-Larsen, K. et al. Improved side-chain torsion potentials for the Amber ff99SB protein force field. Proteins 78, 1950–8, doi:10.1002/prot.22711 (2010).
Schmidt, C., Beilsten-Edmands, V. & Robinson, C. V. Insights into Eukaryotic Translation Initiation from Mass Spectrometry of Macromolecular Protein Assemblies. J Mol. Biol 1–13, doi:10.1016/j.jmb.2015.10.011 (2015).
Brinda, K. V. & Vishveshwara, S. A network representation of protein structures: implications for protein stability. Biophys. J. 89, 4159–70, doi:10.1529/biophysj.105.064485 (2005).
Kannan, N. & Vishveshwara, S. Identification of side-chain clusters in protein structures by a graph spectral method. J. Mol. Biol. 292, 441–64, doi:10.1006/jmbi.1999.3058 (1999).
Stetz, G. & Verkhivker, G. M. Probing Allosteric Inhibition Mechanisms of the Hsp70 Chaperone Proteins Using Molecular Dynamics Simulations and Analysis of the Residue Interaction Networks. J. Chem. Inf. Model. 56, 1490–1517, doi:10.1021/acs.jcim.5b00755 (2016).
Mariani, S., Dell’Orco, D., Felline, A., Raimondi, F. & Fanelli, F. Network and Atomistic Simulations Unveil the Structural Determinants of Mutations Linked to Retinal Diseases. PLoS Comput. Biol. 9 (2013).
Pasi, M., Tiberti, M., Arrigoni, A. & Papaleo, E. xPyder: A PyMOL Plugin To Analyze Coupled Residues and Their Networks in Protein Structures. J. Chem. Inf. Model. 279, 1–6, doi:10.1021/ci300213c (2012).
Hess, B., Kutzner, C., van der Spoel, D. & Lindahl, E. GROMACS 4: Algorithms for Highly Efficient, Load-Balanced, and Scalable Molecular Simulation. J. Chem. Theory Comput. 4, 435–447, doi:10.1021/ct700301q (2008).
Tiberti, M., Invernizzi, G. & Papaleo, E. (Dis)similarity Index To Compare Correlated Motions in Molecular Simulations. J. Chem. Theory Comput. 11, 4404–14, doi:10.1021/acs.jctc.5b00512 (2015).
Hess, B., Bekker, H., Berendsen, H. & Fraaije, J. LINCS: A linear constraint solver for molecular simulations. J Comput Chem 18, 1463–1472 (1997).
Essmann, U. et al. A smooth particle mesh Ewald method. J. Chem. Phys. 103, 8577–8593, doi:10.1063/1.470117 (1995).
Papaleo, E., Pasi, M., Tiberti, M. & De Gioia, L. Molecular dynamics of mesophilic-like mutants of a cold-adapted enzyme: insights into distal effects induced by the mutations. PLoS One 6, e24214, doi:10.1371/journal.pone.0024214 (2011).
Seeber, M. et al. Wordom: a user-friendly program for the analysis of molecular structures, trajectories, and free energy surfaces. J. Comput. Chem. 32, 1183–94, doi:10.1002/jcc.21688 (2011).
Michaud-Agrawal, N., Denning, E. J., Woolf, T. B. & Beckstein, O. MDAnalysis: A Toolkit for the Analysis of Molecular Dynamics Simulations. J Comput Chem 32, 2319–2327, doi:10.1002/jcc.21787 (2011).
Miller, R. G. The jackknife-a review. Biometrika 61, 1–15 (1974).

Acknowledgements

The authors would like to thank Matteo Tiberti and Wouter Boomsma for fruitful discussion and suggestions.

Author information

Authors and Affiliations

Computational Biology Laboratory, Danish Cancer Society Research Center, Strandboulevarden 49, 2100, Copenhagen, Denmark
Juan Salamanca Viloria, Maria Francesca Allega, Matteo Lambrughi & Elena Papaleo

Contributions

E.P. conceived and designed the research; J.S.V. and M.F.A. carried out the experiments; E.P., J.S.V., M.L. and M.F.A. discussed the data; J.S.V. and M.F.A. prepared the figures and tables; E.P. wrote the manuscript with input from all the coauthors.

Corresponding author

Correspondence to Elena Papaleo.

Ethics declarations

Competing Interests

The authors declare that they have no competing interests.

Additional information

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Supplementary Information

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Salamanca Viloria, J., Allega, M.F., Lambrughi, M. et al. An optimal distance cutoff for contact-based Protein Structure Networks using side-chain centers of mass. Sci Rep 7, 2838 (2017). https://doi.org/10.1038/s41598-017-01498-6

Received: 23 January 2017
Accepted: 28 March 2017
Published: 06 June 2017
DOI: https://doi.org/10.1038/s41598-017-01498-6
