Communication pathways bridge local and global conformations in an IgG4 antibody

The affinity of an antibody for its antigen is primarily determined by the specific sequence and structural arrangement of the complementarity-determining regions (CDRs). Recent evidence, however, points toward a nontrivial relation between the CDRs and distal sites: variations in binding strength have been observed upon mutating residues separated from the paratope by several nanometers, suggesting the existence of a communication network within antibodies whose extension and relevance might be greater than expected so far. In this work, we test this hypothesis by means of molecular dynamics (MD) simulations of the IgG4 monoclonal antibody pembrolizumab, an approved drug that targets the programmed cell death protein 1 (PD-1). The molecule is simulated in both the apo and holo states, totalling 4 μs of MD trajectory. The analysis of these simulations shows that the bound antibody explores a restricted range of conformations with respect to the apo one, and that the global conformation of the molecule correlates with that of the CDRs. These results support the hypothesis that pembrolizumab features a multi-scale hierarchy of intertwined global and local conformational changes. The analysis pipeline developed in this work is general, and it can help shed further light on the mechanistic aspects of antibody function.


S1 Methods
Principal component analysis
In order to tell whether two structural domains A and B move in the same direction in a cluster, we computed a per-mode quantity for each of the first three principal components. In its definition, $\mathbf{v}_i$ and $\mathbf{u}_j$ are the vectors of the k-th mode relative to residues i and j, and N and M are the numbers of residues in domains A and B, respectively. The results for each mode are then combined, weighted by the corresponding eigenvalues $\lambda_k$. Each entry $Q_{AB}$ in the resulting matrix refers to a pair of structural domains. Positive values correspond to parallel movements, negative values to motions in antiparallel directions.
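A minimal sketch of how such a domain co-directionality score could be computed. The functional form used here is an assumption, not necessarily the expression used in this work: the per-mode score is taken as the average dot product between the mode vectors of all residue pairs (i in A, j in B), and the modes are combined as an eigenvalue-weighted average normalised by the sum of the eigenvalues. Function names and the toy vectors are hypothetical.

```python
from typing import Sequence

Vector = Sequence[float]


def mode_alignment(v_a: Sequence[Vector], u_b: Sequence[Vector]) -> float:
    """Average dot product over all residue pairs (i in A, j in B) for one mode."""
    total = 0.0
    for v in v_a:
        for u in u_b:
            total += sum(x * y for x, y in zip(v, u))
    return total / (len(v_a) * len(u_b))


def weighted_alignment(
    modes_a: Sequence[Sequence[Vector]],
    modes_b: Sequence[Sequence[Vector]],
    eigenvalues: Sequence[float],
) -> float:
    """Combine per-mode scores, weighting each mode by its eigenvalue.

    Assumption: the weights are normalised by the sum of the eigenvalues.
    """
    weighted = sum(
        lam * mode_alignment(v_a, u_b)
        for lam, v_a, u_b in zip(eigenvalues, modes_a, modes_b)
    )
    return weighted / sum(eigenvalues)


# Toy example: two modes, domain A has two residues, domain B has one.
modes_a = [
    [(1.0, 0.0, 0.0), (1.0, 0.0, 0.0)],   # mode 1
    [(0.0, 1.0, 0.0), (0.0, 1.0, 0.0)],   # mode 2
]
modes_b = [
    [(1.0, 0.0, 0.0)],                    # mode 1: parallel to A
    [(0.0, -1.0, 0.0)],                   # mode 2: antiparallel to A
]
q_ab = weighted_alignment(modes_a, modes_b, eigenvalues=[3.0, 1.0])
```

In this toy case the dominant mode is parallel and the weaker one antiparallel, so the combined score is positive but smaller than 1.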
MM/PBSA calculations
In the MM/PBSA approach [1], the binding free energy $\Delta G_{bind}$ between the antibody (Ab) and the antigen (Ag), namely

$$\Delta G_{bind} = G_{Ab\text{-}Ag} - \left(G_{Ab} + G_{Ag}\right) \qquad (3)$$

is written as a sum of different contributions:

$$\Delta G_{bind} = \Delta E_{MM} + \Delta G_{sol} - T\Delta S \qquad (4)$$

Here, $\Delta E_{MM}$, $\Delta G_{sol}$ and $\Delta S$ are the changes in the gas-phase molecular mechanics energy, solvation free energy, and conformational entropy upon binding, respectively. More specifically,

$$\Delta E_{MM} = \Delta E_{int} + \Delta E_{ele} + \Delta E_{vdW} \qquad (5)$$

$$\Delta G_{sol} = \Delta G_{PB} + \Delta G_{SA} \qquad (6)$$

$\Delta E_{MM}$ includes the changes in the internal energy $\Delta E_{int}$ (due to bonded interactions), the electrostatic energy $\Delta E_{ele}$, and the van der Waals energy $\Delta E_{vdW}$. $\Delta G_{sol}$ is the sum of the electrostatic solvation energy $\Delta G_{PB}$ and the nonpolar term $\Delta G_{SA}$ between the solute and the continuum solvent. $\Delta G_{PB}$ is calculated using the Poisson-Boltzmann model, while the nonpolar energy is estimated on the basis of the solvent-accessible surface area of the proteins.
In our calculations, we neglected the changes in conformational entropy, thus focusing on the enthalpic contributions to the binding energy. These were estimated with the g_mmpbsa tool [2]. A convergence analysis was performed to estimate the appropriate number of frames (Figure S1), which are extracted from the trajectory at regular intervals. The average values and the corresponding errors are obtained with a bootstrap analysis. With ∼1500 frames, the result is fully converged with an acceptable uncertainty.

Figure S1: Convergence of the binding energy computed with the MM/PBSA method as a function of the number of frames employed for the calculations.
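The convergence check described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually scripted for this work: the per-frame binding energies are assumed to be already available as a plain list (here replaced by synthetic Gaussian numbers), frames are subsampled at regular intervals, and the error is taken as the standard deviation of the bootstrap means. The function name `bootstrap_mean` and all numerical values are hypothetical.

```python
import random
import statistics


def bootstrap_mean(values, n_resamples=1000, seed=0):
    """Return the sample mean and its bootstrap standard error."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and store the mean of each resample.
    resampled_means = [
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    ]
    return statistics.fmean(values), statistics.stdev(resampled_means)


# Synthetic per-frame energies (arbitrary units); placeholder data only.
rng = random.Random(1)
energies = [rng.gauss(-100.0, 20.0) for _ in range(1600)]

# Frames taken at regular intervals: every 16th frame gives 100 frames.
mean_few, err_few = bootstrap_mean(energies[::16])
mean_all, err_all = bootstrap_mean(energies)
```

Repeating the estimate for increasing numbers of frames and plotting the mean with its error gives a convergence curve of the kind shown in Figure S1.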
Network analysis
As explained in detail in the main text, the protein network is divided into communities in order to identify functional subdomains of the antibody. As a measure of the quality of the community structure, the modularity parameter Q was calculated (Table S1). Q represents the difference in probability of intra- and inter-community connections for a given community repartition, and is defined as:

$$Q = \sum_i \left( e_{ii} - a_i^2 \right)$$

where $e_{ij}$ is the fraction of edges that link nodes in community i to nodes in community j, and $a_i = \sum_j e_{ij}$ is the fraction of edges from all communities that connect to nodes belonging to community i. The range of modularity values is between 0 and 1. Values close to 1 identify high-quality community structures, in which intra-community connections are favoured over inter-community ones. All clusters from the apo and holo simulations show similar values of modularity, which is very high in all cases (Table S1). These values lie above the range usually found for protein networks (0.4-0.7), which can be explained by the natural partition of antibodies into structural domains.
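As an illustration of the modularity defined above, the following sketch evaluates Q on a small undirected, unweighted toy graph. It assumes the usual symmetric convention in which an edge between two different communities contributes half of its weight to each of them, so that $a_i$ equals the share of edge ends attached to community i. The function name and the toy graph are hypothetical.

```python
def modularity(edges, community):
    """Modularity Q = sum_i (e_ii - a_i**2) of a given community assignment.

    edges: list of (node, node) pairs of an undirected graph.
    community: dict mapping each node to its community label.
    """
    m = len(edges)
    labels = set(community.values())
    e_intra = {label: 0.0 for label in labels}   # e_ii
    a = {label: 0.0 for label in labels}         # a_i
    for u, v in edges:
        cu, cv = community[u], community[v]
        # Each of the two edge ends adds 1/(2m) to its community's share.
        a[cu] += 0.5 / m
        a[cv] += 0.5 / m
        if cu == cv:
            e_intra[cu] += 1.0 / m
    return sum(e_intra[label] - a[label] ** 2 for label in labels)


# Two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_communities = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
one_community = {node: "A" for node in range(6)}

q_two = modularity(edges, two_communities)
q_one = modularity(edges, one_community)
```

Splitting the graph along the bridge gives a positive Q, whereas placing all nodes in a single community gives Q = 0, since no partition is being rewarded.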
Moreover, communities were used as a means to evaluate the suitability of the cutoffs defining the edges. In order to evaluate the effect of the choice of the cutoffs on the resulting communities, the Community Repartition Difference (CRD) is calculated. The CRD between two network repartitions $c_1$ and $c_2$ is defined as:

$$\mathrm{CRD}(c_1, c_2) = 1 - \frac{\sum_{i,j} z(n_i, n_j, c_1)\, z(n_i, n_j, c_2)}{\max\left[\sum_{i,j} z(n_i, n_j, c_1),\ \sum_{i,j} z(n_i, n_j, c_2)\right]}$$

where $z(n_i, n_j, c_k)$ is defined as 1 if nodes $n_i$ and $n_j$ belong to the same community in a given network partition $c_k$, and 0 otherwise. A value of CRD equal to 0 indicates identical repartitions, while a value equal to 1 corresponds to the case of totally different communities. We computed CRD values for several combinations of distance/frame cutoffs, with respect to the reference case of 4.5 Å distance cutoff and 75% frame cutoff (Figure S2). These two values, which are in the range commonly used for protein network analyses [3], are the ones chosen for this study. In all cases, the value of CRD is below 0.2, indicating that a change of the parameters within the range investigated does not lead to substantial changes in the resulting community repartitions.

Figure S2: Community Repartition Difference (CRD) computed for different choices of the parameters defining the edges of the network. The "distance cutoff" (ranging here from 3.5 Å to 5.0 Å) defines the maximum distance for which two atoms are considered in contact, while the "frame cutoff" (ranging here from 65% to 85%) defines the percentage of frames in which the contact is formed. The values reported in the plot refer to the CRD between each of the possible combinations of the two cutoffs and the reference case with 4.5 Å distance cutoff and 75% frame cutoff.
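A small sketch of a CRD computation, written under an explicit assumption about its form: the CRD is taken as one minus the number of node pairs placed in the same community by both partitions, divided by the larger of the two per-partition counts of co-assigned pairs. Only distinct node pairs are counted, which is also an assumption. The function name and the toy partitions are hypothetical.

```python
from itertools import combinations


def crd(c1, c2):
    """Community Repartition Difference between two partitions.

    c1, c2: dicts mapping the same set of nodes to community labels.
    """
    same_1 = same_2 = same_both = 0
    for i, j in combinations(sorted(c1), 2):
        z1 = c1[i] == c1[j]   # z(n_i, n_j, c1)
        z2 = c2[i] == c2[j]   # z(n_i, n_j, c2)
        same_1 += z1
        same_2 += z2
        same_both += z1 and z2
    largest = max(same_1, same_2)
    # If no pair is co-assigned in either partition, both consist of
    # singletons only and are therefore identical.
    return 0.0 if largest == 0 else 1.0 - same_both / largest


reference = {0: "A", 1: "A", 2: "A", 3: "B"}
shifted = {0: "A", 1: "A", 2: "B", 3: "B"}
pairs_1 = {"a": 0, "b": 0, "c": 1, "d": 1}
pairs_2 = {"a": 0, "b": 1, "c": 0, "d": 1}

crd_identical = crd(reference, reference)
crd_partial = crd(reference, shifted)
crd_disjoint = crd(pairs_1, pairs_2)
```

The three toy cases span the range of the measure: identical partitions, a partition in which one node has changed community, and two partitions that share no co-assigned pair.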
Images
Images of the proteins were produced using VMD [4] and Protein Imager [5], and the graphs were made with Python libraries.

Figure S3: Schematic representation of the antibody pembrolizumab in complex with the antigen PD-1, with the indication of the name of each structural domain. Chains C and D correspond to chains F and G in the original PDB file, respectively [6]. The numbering on the right corresponds to the indices used in the correlation matrices.

Figure S4: RMSD of the Cα atoms calculated for the full antibody in the apo state, for each of the 500 ns-long runs. The large, sudden increase in RMSD in the first few ns of simulation corresponds to the relaxation of the starting structure to a more equilibrated state; from there, the large fluctuations observed correspond to the different conformational changes sampled during the simulations.

Figure S5: RMSD of the Cα atoms calculated for the full antibody in the holo state, for each of the 500 ns-long runs. The large, sudden increase in RMSD in the first few ns of simulation corresponds to the relaxation of the starting structure to a more equilibrated state; from there, the large fluctuations observed correspond to the different conformational changes sampled during the simulations.

Figure S6: RMSD of the Cα atoms calculated for each structural domain of the antibody, in the four 500 ns-long trajectories of the apo state.

Figure S7: RMSD of the Cα atoms calculated for each structural domain of the antibody, in the four 500 ns-long trajectories of the holo state.

Figure S8: Timeline of the clustering assignment, in the apo (top) and holo (bottom) systems. Each system is simulated for a total of 2 µs, in four independent replicas of 500 ns; colors are used to distinguish different trajectories.

Figure S9: RMSD distribution of the antibody structure, for each cluster with respect to the representative conformation. Left: apo case. Right: holo case. The threshold used to distinguish among clusters is 1.2 nm. In the UPGMA hierarchical clustering employed, this threshold does not refer to the pairwise distance between observations; instead, it represents the average distance between elements of each cluster [7]. This explains the presence of a few RMSD values larger than 1.2 nm within the same cluster.

Figure S10: Overall occurrence of the conformational clusters in the apo and holo states.

Figure S11: Root-mean-square fluctuations (RMSF) of the Cα atoms, for each cluster of the antibody in the apo state. To assess the convergence of the results, the frames belonging to the same cluster were shuffled by randomly changing their order; the RMSF was then computed on the full, shuffled cluster population and on its two halves. Differences are typically within the line width.

Figure S12: Root-mean-square fluctuations (RMSF) of the Cα atoms, for each cluster of the antibody in the holo state. To assess the convergence of the results, the frames belonging to the same cluster were shuffled by randomly changing their order; the RMSF was then computed on the full, shuffled cluster population and on its two halves. Differences are typically within the line width.

[Figure panels: "Structural alignment with 4C54 Fc"; "The rotated CH2 domain accommodates Fab1"; "Overlap Fc-Fab1". Panel labels: Fc, Chain D, Chain B.]

Figure S15: Average RMSD between frames belonging to each pair of clusters. The RMSD has been computed on all the Cα atoms of the antibody after structural alignment. Cells within the red border compare clusters of the apo system, while cells within the white border refer to the holo system.

S2.1 Principal components analysis
Cluster $0_A$
Fab1 and Fc move apart from Fab2, while Fab1 and Fc remain in close contact during the movement, giving rise to a large contact surface. Fc rotates in such a way that the CH2 domain of chain B moves away from the CL domain of Fab2.

Cluster $1_A$
We observed a high mobility of the variable region of Fab2, in particular of VH. The dynamics is similar to that of cluster $2_A$, but in this case the CH3 domains tend to move in the same direction as the variable domains of Fab2.

Rotation of Fab2 takes place until the CL is in contact with the Fc (indeed, this is the only one among the compact clusters in which atomistic non-bonded interactions are formed between residues belonging to these two domains). This happens by means of parallel motions of the CH3 domains and the variable domains in Fab2. At the same time, the distance between the lower hinge segments is reduced, inducing a rotation of Fab1 in the direction opposite to that of Fab2. This movement of Fab1 is parallel to that of the remaining domains in chain D.
In the holo case we observe that, even in the most open conformations (cluster $3_H$), the hinge region still plays an important role in information transfer, in particular when compared to the apo case. Again, prolines appear as key residues. In cluster $0_H$, there are no significant interactions between the hinge residues and the nearby domains. In all clusters, none of the residues of H2 is involved in high-centrality contacts. Several of the paths with the highest betweenness include instead H1 residues. In cluster $2_H$, pathways with high betweenness include:

The values refer to the hinge surface alone, computed without taking into account hindrance from nearby domains. The two segments appear highly asymmetrical; in particular, the smaller SASA for hinge 1 is indicative of a more compact, less accessible conformation.

Figure S44: Difference between the total Cα RMSF of the two hinge chains, in the clusters of the apo and holo states. In the holo case, hinge 1 appears overall less flexible than hinge 2, while in the apo systems this is true only for 2 of the 5 conformational clusters.

Figure S45: Comparison between the PAD$_\omega$ parameter in the apo and holo antibody. PAD$_\omega$ is a measure of backbone torsional plasticity, obtained from the spread of the φ and ψ dihedral angles. The blue-shaded areas correspond to the residues involved in the binding with PD-1; as expected, the bound antigen leads to a decreased flexibility in the paratope of Fab2. The yellow-shaded areas correspond to the cysteine residues forming inter-chain disulfide bonds in the hinge.

Figure S46: Plots of the RMSD of the binding site residues (Cα) in the conformational clusters of the apo (a) and holo (b) systems, with respect to the crystallographic structure of the Fab/PD-1 complex. The deviations are calculated after fitting the trajectories on the variable region (Fab2) of the antibody. In the holo case, the clusters show a more uniform behaviour than in the apo system.