Computational modeling and experimental validation of the EPI-X4/CXCR4 complex allows rational design of small peptide antagonists

EPI-X4, a 16-mer fragment of albumin, is a specific endogenous antagonist and inverse agonist of the CXC-motif-chemokine receptor 4 (CXCR4) and thus a key regulator of CXCR4 function. Accordingly, activity-optimized synthetic derivatives of EPI-X4 are promising leads for the therapy of CXCR4-linked disorders such as cancer or inflammatory diseases. We investigated the binding of EPI-X4 to CXCR4, which has so far remained unclear, by means of biomolecular simulations combined with experimental mutagenesis and activity studies. We found that EPI-X4 interacts with CXCR4 through its N-terminal residues and identified its key interaction motifs, which explain how it antagonizes the receptor. Using this model, we developed shortened EPI-X4 derivatives (7-mers) with optimized receptor-antagonizing properties as new leads for the development of CXCR4 inhibitors. Our work reveals the molecular details and mechanism by which the first endogenous peptide antagonist of CXCR4 interacts with its receptor and provides a foundation for the rational design of improved EPI-X4 derivatives.

Amino acid substitutions were introduced into the sequence of CXCR4 by site-directed mutagenesis, cloned into an IRES-GFP expression vector and transfected into 293T cells. Afterwards, cells were incubated with serially diluted EPI-X4, WSC02, or JM#21 in the presence of a constant concentration of a CXCR4-specific antibody (clone 12G5). After 2 hours, bound antibody was analyzed by flow cytometry. Shown are data derived from at least 3 individual experiments ± SEM (see also Figure 6).

Figure S10. C-terminally truncated EPI-X4 competes with a CXCR4-specific antibody. EPI-X4 analogues were designed that are serially truncated at the C-terminus. a) Peptides were serially diluted and added to SupT1 cells together with a constant concentration of a CXCR4 antibody that binds close to the binding pocket of the receptor. After 2 hours, unbound antibody was removed and antibody binding was determined by flow cytometry. b) IC50 values were determined by non-linear regression. Shown are data derived from a single assay.

[...] for 24 hours. Afterwards, each mixture was centrifuged at 20,000 x g for 5 min. The supernatant was then analyzed for peptide concentration using a BCA assay. For comparison, the peptide was freshly dissolved shortly before the assay. Shown is one representative experiment.

Figure S18. EPI-X4 blue to red: N-terminus to C-terminus with 50 mM NaOAc. The first two amino acids of the N-terminus of EPI-X4 are free and flexible, while the C-terminus is engaged in hydrogen bonds between the Thr5 and Gln10 sidechains as well as a hydrogen bond between the backbone carbonyl of Thr5 and the amide hydrogen of Val11, which is also supported by existing NOE signals in the corresponding NMR spectrum. Additionally, the calculated structure states suggest that the sidechain of Ser12 and the carboxylic terminus of Leu16 can be engaged in a hydrogen bond with each other.
In those states where the N-terminus comes close to the C-terminus, the sidechain guanidino group of Arg3 can form a hydrogen bond with the carboxylic end of the peptide chain.
However, for the contacts between Ser12 and Leu16, as well as between Arg3 and Leu16, no NOE signals were detected. Thus, although those contacts can in theory be established, they are not sufficiently populated in the conformational ensemble to be detected via NMR.

Residues of EPI-X4 are highlighted in red italics. The interaction energies were calculated every 0.5 ns from the simulation trajectories under vacuum conditions (i.e., the contributions to the energy by water, membrane and ions were neglected) at the force field level. The errors were estimated by bootstrap analysis using 500 steps.
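The bootstrap error estimate mentioned above can be illustrated with a minimal sketch. It assumes that "500 steps" means 500 resamples drawn with replacement and that the reported error is the standard deviation of the resampled means; both are assumptions, and the energy values and the function name `bootstrap_sem` are hypothetical, not taken from the simulations.

```python
import random
import statistics

def bootstrap_sem(values, n_resamples=500, seed=0):
    """Return (mean, bootstrap standard error of the mean).

    Assumption: each bootstrap step resamples the frames with
    replacement and records the mean of the resample.
    """
    rng = random.Random(seed)
    n = len(values)
    means = []
    for _ in range(n_resamples):
        resample = [values[rng.randrange(n)] for _ in range(n)]
        means.append(statistics.fmean(resample))
    return statistics.fmean(values), statistics.stdev(means)

# Hypothetical per-frame interaction energies (one value per 0.5 ns frame)
energies = [-41.2, -39.8, -43.5, -40.1, -42.7, -38.9, -44.0, -41.6]
mean, err = bootstrap_sem(energies)
print(f"{mean:.2f} +/- {err:.2f}")
```

Frames sampled every 0.5 ns may be correlated in time, in which case a plain bootstrap of this kind tends to underestimate the error; a block bootstrap would be one alternative.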

Analysis of the NMR spectra
Although the linewidths of the NMR spectra were broadened in an acidic environment with respect to the neutral medium, no changes in the chemical shift region representing the amide backbone protons were found (Figures S15 and S16). However, the loss of cross-peaks in the NOESY (Nuclear Overhauser Effect Spectroscopy) spectrum indicates a loss of dipolar spatial couplings between protons of EPI-X4 (Figure S17). Hence, EPI-X4 gains flexibility and mobility when it is exposed to ions (Figure S18).

Coarse-Grained (CG) pulling simulations
Our results indicated that when the force constant is very small (kf = 0.1 kJ/mol/Å² and kf = 0. [...]

[...] MHz AVANCE III Bruker system equipped with a 5 mm quadruple-resonance QXI 1H/13C/15N/31P probe with a z-gradient. Experiments were carried out at 298 K. Nuclear Overhauser Effect Spectroscopy (NOESY) spectra, acquiring 2D homonuclear correlation via dipolar coupling with water suppression using the WATERGATE W5 pulse sequence with gradients 1,2 , were recorded with mixing times of 100, 200 and 300 ms, using 2 x 16k x 256 data matrices, corresponding to acquisition times of ~480 and 8 ms in the t1 and t2 dimensions, respectively.
Through-bond connectivity was obtained from a Total Correlation Spectroscopy (TOCSY) spectrum recorded with the MLEV-17 mixing scheme 3 with water suppression using the 3-9-19 pulse sequence with gradients 4,5 , using a 13 µs 90° pulse and an 80 ms mixing period.
NMRFAM-Sparky was used for signal assignment and NOE signal volume determination. 6 For NOE signal integration, a Gaussian fit was used that allowed peak motion, adjusted linewidths and included baseline fitting. For the 3D structure calculation of EPI-X4, the software package ARIA (Ambiguous Restraints for Iterative Assignment) was used. 7

Coarse-Grained (CG) pulling simulations
The CXCR4 receptor complexed with the EPI-X4 derivative (EPI-X4D: 408-416) was simulated at the coarse-grained (CG) level in a POPC lipid-water environment. The CHARMM-GUI server was used to generate the initial configurations. 8 We employed the Martini 2.2 CG force field 9 as implemented for the GROMACS program (version 2016.3) 10 . To reduce the system size, the N-terminal loop (corresponding to residues 1-32) and the C-terminal loop (residues 304-319) were removed, and the termini were set as neutral in the CG model. The truncated CXCR4 along with the EPI-X4 derivative (EPI-X4D) was embedded in a POPC bilayer (consisting of 254 lipids) and solvated with an uncharged water model. 9 The total charge of the system was neutralized by the addition of nine chloride ions. In total, the system contained ~10000 CG particles. Constraints were applied to keep the regular secondary structures intact. NPT simulations were performed using a velocity-rescale thermostat at 310 K and a Berendsen barostat at 1 atm. 11,12 A relative dielectric constant of εr = 15 was used to account for the screening effect of the uncharged water. The short-range LJ interactions were cut off at 11 Å, and a shift scheme was used to smooth the potential. The electrostatic interactions were computed with the reaction-field approach 13 , using a cutoff of 11 Å. A time step of 20 fs was used for the integration of positions and velocities.
Three equilibration MD simulations were performed with position restraints on the secondary structures using force constants of 10, 5 and 1 kcal/mol/Å², respectively. The production MD was performed with position restraints operating only on the transmembrane helices of CXCR4 with a force constant of 0.1 kcal/mol/Å². This was necessary to restrict the excessive displacement of the helices.
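Because GROMACS works internally in kJ/mol and nm, the restraint force constants quoted above in kcal/mol/Å² need converting before they can be entered in a topology. The following is a minimal sketch of that arithmetic, assuming the thermochemical calorie (1 kcal = 4.184 kJ); the function name is hypothetical and the snippet is not taken from the authors' setup files.

```python
KJ_PER_KCAL = 4.184   # thermochemical calorie
A2_PER_NM2 = 100.0    # 1 nm^2 = 100 Å^2

def kcal_per_A2_to_kJ_per_nm2(k):
    """Convert a force constant from kcal/mol/Å^2 to kJ/mol/nm^2."""
    return k * KJ_PER_KCAL * A2_PER_NM2

# Restraint strengths quoted in the text: equilibration (10, 5, 1)
# and production (0.1), all in kcal/mol/Å^2
converted = {k: kcal_per_A2_to_kJ_per_nm2(k) for k in (10, 5, 1, 0.1)}
for k, v in converted.items():
    print(f"{k} kcal/mol/Å^2 = {v:.2f} kJ/mol/nm^2")
```

Under this conversion the three equilibration stages correspond to roughly 4184, 2092 and 418.4 kJ/mol/nm², and the production restraint to about 41.84 kJ/mol/nm².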
To simulate the self-assembly of EPI-X4D in the CXCR4 pocket, the peptide was pulled from the solution phase to the binding pockets by constant-force MD simulations at the CG level.
For this purpose, we defined a reference point located between the major and minor binding pockets (obtained as the center of geometry of four binding pocket residues: D187, E288, D262 and D97). The coordinates of this point served as the absolute reference for the pulling simulations. The peptides were kept in the solution phase at a distance of ~30 Å from the reference point. A 1 μs MD simulation was performed applying position restraints on the central residue of the peptide (kf = 10 kcal/mol/Å²) as well as on the TM helices (kf = 0.1 kcal/mol/Å²). From this simulation, 20 snapshots (at intervals of 50 ns) were taken for the pulling simulations.
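The reference point and the snapshot schedule described above can be sketched as follows. The coordinates are hypothetical placeholders, not the actual CXCR4 bead positions, and the assumption that the first snapshot is taken at 50 ns (so the last falls at 1000 ns) is one possible reading of the text.

```python
def center_of_geometry(points):
    """Unweighted mean position of a set of (x, y, z) points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

# Hypothetical coordinates (Å) for the four binding pocket residues
pocket = {
    "D187": (10.0, 12.0, 20.0),
    "E288": (14.0, 8.0, 22.0),
    "D262": (6.0, 10.0, 18.0),
    "D97": (10.0, 14.0, 24.0),
}
reference = center_of_geometry(list(pocket.values()))

# 20 snapshots from the 1 µs restrained run, one every 50 ns
# (assumption: first snapshot at 50 ns, last at 1000 ns)
snapshot_times_ns = [50 * i for i in range(1, 21)]

print(reference)
print(snapshot_times_ns)
```

A center of geometry ignores bead masses, in contrast to a center of mass.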
Different force constants ranging from 0.025 to 2.5 kcal/mol/Å² were tested for the constant-force simulations. We applied biasing forces to the peptide at three different points (i.e., residues 1, 5 and 9), directed towards the binding pocket. The pulling forces were equal in magnitude (same force constant) on all three sites and operated simultaneously. This ensures that the pulling is not biased towards a particular site on the peptide. The aim of these simulations was to determine whether there is a preferred pattern or binding mode. Therefore, we used very small force constants to allow ample conformational sampling, which reduces the bias associated with the initial configurations. Using twenty different initial structures, we performed twenty simulations, each for 1 µs, under three force-constant setups: 0.1, 0.2 and 0.5 kJ/mol/Å².
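The pulling geometry can be illustrated with a small sketch: a force of equal magnitude is applied at each of the three sites and directed towards the reference point. The coordinates, the magnitude and the function name `pull_force` are hypothetical. The run matrix also assumes that each of the twenty starting structures was simulated under each of the three setups, which is only one reading of the text.

```python
import itertools
import math

def pull_force(site, reference, magnitude):
    """Force of fixed magnitude acting on `site`, pointing at `reference`."""
    d = [r - s for r, s in zip(reference, site)]
    norm = math.sqrt(sum(c * c for c in d))
    return tuple(magnitude * c / norm for c in d)

# Hypothetical positions (Å): reference point at the origin,
# pulled residues 1, 5 and 9 about 30 Å away
reference = (0.0, 0.0, 0.0)
sites = {1: (30.0, 0.0, 0.0), 5: (0.0, 30.0, 0.0), 9: (0.0, 0.0, 30.0)}

# Same magnitude on all three sites, so no site is favoured
forces = {res: pull_force(xyz, reference, magnitude=0.1)
          for res, xyz in sites.items()}

# Assumed run matrix: 20 starting snapshots x 3 force setups
setups = (0.1, 0.2, 0.5)
runs = list(itertools.product(range(1, 21), setups))

print(forces)
print(len(runs))
```

In an actual constant-force run the direction would be recomputed as the peptide moves; the sketch evaluates it only for one configuration.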