Introduction

Protein α-helices are often involved in interactions with DNA, RNA or other proteins1. These complexes regulate many important biological processes, but are widely considered difficult targets for drug development. In this context, there has been a strong interest in developing synthetic small molecules that mimic the topology of α-helices, as this would facilitate the drug discovery process while potentially overcoming the pharmacokinetic limitations often encountered when using peptides as drugs2,3,4. Hamilton et al. pioneered this field by reporting that tris-substituted 3,2′,2″-terphenyl molecules reproduced the angular orientation of side chains i, i + 4 and i + 7 of an α-helix, and were capable of blocking protein-protein interactions5,6. However, Hamilton’s design was restricted to terphenyls substituted on one side of the molecule, mimicking just one face of an α-helix and limiting the possible therapeutic applications to superficial protein-protein interactions. We recently reported that terphenyl molecules with bilateral 3,5,2′,6′,2″,6″ substitutions adopted a staggered conformation that matched the projection of side chains i, i + 1, i + 4, i + 5, i + 7 and i + 8, thereby mimicking all three faces of an α-helix and opening up the possibility of mimicking interactions in which the helix is deeply embedded in its receptor (Fig. 1)7. One such interaction is formed between the RNA-binding α-helix of the HIV-1 protein Rev and the viral RNA. The Rev protein adopts a helix-turn-helix conformation8,9,10, and the N-terminal segment of the second helix contains a positively charged arginine-rich motif, T34RQARRNRRRRWRERQR50 (hereafter identified as Rev34–50), that drives RNA binding and also functions as a nuclear localization signal (NLS). The Rev34–50 helix forms a high-affinity interaction with an internal loop located within subdomain IIB of the Rev Recognition Element (RRE) of the viral RNA (Fig. 1C)11,12. This interaction is essential for virus viability, as it triggers the cooperative incorporation of additional Rev molecules into the complex through interactions between Rev34–50 helices and further sites on the RRE as well as protein-protein contacts13, allowing nuclear export of unspliced or singly-spliced viral RNA molecules in the late phase of the virus cycle14. Nevertheless, despite its importance in the viral replication cycle, the RRE-Rev ribonucleoprotein complex remains an unexploited target for HIV-1 chemotherapy. While a number of small-molecule compounds with substantial anti-HIV activity have been reported to block Rev function, most of them do not directly inhibit the formation of the RRE-Rev complex, or were found to bind to Rev partners in the host cell such as Crm1 or the cap-binding complex15,16,17,18,19.

Figure 1

(A) Comparison of bilateral terphenyls with side chains in relative 1,3 (left) and 1,4 (right) positions. In both cases, the aminoalkyl side chains imitate the Arg residues of Rev. (B) Schematic overlay of a 1,4-bilaterally substituted terphenyl and a protein α-helix showing the mimicked residues (cyan). (C) Three-dimensional view of the complex formed between subdomain IIB of the RRE (grey) and the Rev34–50 helix of the HIV-1 protein Rev (yellow)12. The extrahelical loop residues A19 and U23 are coloured light green. The image was generated with MOE 2019.0102 (www.chemcomp.com). (D) Secondary structure of RNA hairpins IIBh, containing the high-affinity Rev34–50 binding site, and TARh, used as a specificity control. For fluorescence intensity assays a fluorescein probe was linked to the extra-helical loop residues U23 of IIBh and U8 of TARh (indicated with asterisks).

In this work, we explore the use of bilateral terphenyl molecules containing substitutions in relative 1,4 positions across the p-terphenyl scaffold. Relative to the previously reported 1,3-terphenyl compounds7, these molecules match a different set of α-helix side chains and offer the major advantage of a simplified synthetic route due to increased symmetry (Fig. 1A). The most active compounds successfully inhibited the formation of both the RRE IIB-Rev34–50 and full-length RRE-Rev complexes, and blocked HIV-1 replication with little cellular toxicity. Furthermore, a detailed analysis of RNA recognition properties and cellular effects revealed that these molecules likely act through a multi-target mechanism involving inhibition of RNA transcription and Rev function.

Results

Synthesis of 1,4-terphenyl compounds

In terms of synthesis, the major advantage of this series of terphenyl compounds over the previously described 1,3 series was the complete symmetry of the side chains. Consequently, the synthesis was drastically simplified in two ways: (i) the two terminal phenyl rings were structurally equivalent, so the synthesis of an entire separate phenyl ring was avoided; and (ii) just one Suzuki-Miyaura cross-coupling step was required to construct the desired terphenyl scaffold, rather than the two successive coupling steps needed for the previous 1,3 series7. Hence, the synthesis of the final compounds required the prior preparation of just two key synthons. The terminal synthons were aryl bromides bearing nitrile groups on the two side chains as masked amines, ready to be revealed in later steps, whereas the central synthon presented two alkyl side chains and two boronic esters. A simple double palladium-catalyzed Suzuki-Miyaura cross-coupling of two equivalents of terminal synthon with one equivalent of central synthon constructed the desired terphenyl scaffold, and a subsequent borane-mediated reduction of the nitrile groups gave the amines required to mimic the arginine residues in Rev34–50 (Fig. S1).

Following this methodology, a library of terphenyl compounds bearing bilateral 1,4-side groups was generated. Within this library, we explored varying the length of the aminoalkyl side chains (terphenyl compounds 1 vs 2), as well as different substitutions in the pole positions and different alkyl substitutions on the central phenyl ring (terphenyls 1a-d). Furthermore, the effect of lowering the positive charge of the compounds was also investigated with terphenyls 3 and 4, which contained just three aminoalkyl side chains in varying positions (Fig. 2).

Figure 2

Chemical structure of 1,4-substituted p-terphenyl compounds analysed in this study.

Inhibition of the RRE-Rev interaction

Molecular modelling calculations indicated that 1,4-terphenyl molecules could approximately match side chains i, i + 2, i + 4, i + 6, i + 7 and i + 9 of an α-helix (Fig. 1B). When applied to the Rev34–50 α-helix, the side chains of our terphenyl compounds were found to coincide with Rev residues reported to be essential for the interaction with subdomain IIB (Fig. S2).

We first evaluated whether 1,4-terphenyl molecules were capable of inhibiting the high-affinity interaction between Rev34–50 and RRE subdomain IIB by using a displacement experiment based on fluorescence anisotropy7,20. Terphenyls 1a and 1d inhibited the IIBh-Rev34–50 contact with IC50 values of 14 and 47 μM respectively. The remaining terphenyl compounds had weaker or non-detectable activity at the assay concentrations (Figs. 3A and S3 and Table 1).

Figure 3

Inhibition of subdomain IIB-Rev34–50 and full-length RRE–Rev complex formation by 1,4-terphenyl compounds. (A) Curves of IIBh-Rev34–50 inhibition by terphenyls 1a and 1d, as measured by fluorescence anisotropy. (B) Inhibition of the RRE-Rev ribonucleoprotein complex by terphenyl compounds, analysed by EMSA experiments. The gel lanes contained 78 nM RRE, 1.32 μM Rev (marked with +) and the specified concentrations of compound 1a, 1d, 3c and 1b. The image shows on the right a schematic representation of the RRE-Rev ribonucleoprotein, with the location of the main Rev binding site in subdomain IIB indicated and the Rev monomers represented with ribbons and blue ovals, together with plots of high-order RRE-Rev band intensity as a function of terphenyl 1a and 1d concentration; the error bars represent the standard deviation of three independent experiments.

Table 1 50% inhibitory concentrations of 1,4-terphenyl molecules for RRE subdomain IIB-Rev34–50 and high-order RRE-Rev complex formation.

After the high-affinity interaction between RRE subdomain IIB and the Rev34–50 helix of the first Rev monomer is established, the RRE-Rev ribonucleoprotein is formed by the incorporation of additional Rev units binding to further sites in the RRE (Fig. 3B, right)9,10,13. Using an electrophoretic mobility shift assay (EMSA) involving full-length RRE and Rev, we also evaluated whether 1,4-terphenyl compounds were capable of interfering with the formation of the ribonucleoprotein. The results indicated that compounds 1a and 1d inhibited the RRE-Rev interaction (Fig. 3B). The effect was particularly prominent for high-order complexes (containing a greater number of Rev monomers), but the inhibition of low-order complexes by 1d was also detected at concentrations consistent with the IC50 value measured in the IIB-Rev34–50 displacement experiment. Compounds 1b and 3c exerted a weaker effect on the RRE-Rev complex, in agreement with the results obtained in the displacement experiments monitored by fluorescence anisotropy (Fig. 3B and Table 1).

RRE subdomain IIB RNA recognition

We next determined whether the terphenyl molecules blocked the interaction between Rev34–50 and subdomain IIB by binding to the RNA and, if so, whether they recognized subdomain IIB in a manner similar to the Rev34–50 α-helix. We first evaluated subdomain IIB association by measuring changes in the fluorescence intensity of a IIBh RNA hairpin construct containing a fluorescein probe attached to unpaired loop IIB residue U23 (Fig. 1D)21,22. The binding curves obtained at low ionic strength indicated that all molecules associated to RRE subdomain IIB RNA, and were best fit with a two-site model (Figs. 4A and S4). However, there were significant differences among the compounds.

Figure 4

RRE subdomain IIB recognition by 1,4-terphenyl compounds. (A) Results of fluorescence intensity experiments. Left: comparison between first-site (Kd1) and second-site (Kd2) equilibrium dissociation constants for the interaction between IIBh and 1,4-terphenyl molecules. Right: comparison of the IIBh binding curve of terphenyl 1a (black) with the TARh association curve (magenta), and with IIBh binding curves obtained in the presence of a 10-fold molar excess of unlabelled competitor RNA (tRNALys; red) or unlabelled competitor double-helical DNA (LTRd; blue). Solution conditions: 10 mM sodium phosphate pH 6.6 and 0.1 mM EDTA. (B) Titration of IIBh with terphenyl 1a monitored by NMR spectroscopy. The H5-H6 region of the TOCSY spectrum of unbound IIBh (blue) is superposed on the spectra of complexes with increasing RNA:1a molar ratios, color-coded as indicated in the graph. A map of the 1a binding site in the IIBh hairpin is shown on the right. Nucleotides whose aromatic protons undergo chemical shift variations upon the addition of two equivalents of 1a are highlighted in orange and red (Δδ ≥0.04 and 0.08 ppm, respectively). Nucleotides with overlapped aromatic resonances are black-coloured, and residues whose aromatic signals were not affected by ligand binding are coloured grey. Solution conditions: 10 mM sodium phosphate pH 6.0 and 0.1 mM EDTA. (C) Model of a 1:1 complex between RRE loop IIB and 1a (depicted with yellow carbon atoms), obtained from unrestrained docking calculations with the 4PMI PDB structure12. The image was generated with MOE 2019.0102 (www.chemcomp.com) and shows superimposed the converged docking poses of 1a.

The two terphenyl molecules that inhibited the RRE subdomain IIB-Rev34–50 and full-length RRE-Rev interactions, 1a and 1d, presented a lower second-site equilibrium dissociation constant (Kd2) than the other molecules, in addition to a low Kd1. In contrast, compounds 1b, 1c, 2a and 2b had similarly low Kd1 but higher Kd2 values. On the other hand, terphenyls 3a-c and 4, bearing just three 2-aminoethyl groups, had both higher Kd1 and Kd2 constants (Fig. 4A and Tables 2 and S1). Experiments carried out at higher ionic strength supported these conclusions: the IIBh Kd values of terphenyls 1a and 1d were considerably lower than those of compounds 3c and 1b (Fig. S5 and Table S2), both of which displayed weaker RRE-Rev inhibitory activity.
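
The role of Kd1 and Kd2 in the binding curves can be illustrated with a minimal sketch of a generic two-site isotherm. The text does not give the fitting equation, so this form is an assumption: it treats the two sites as independent and the free ligand concentration as approximately equal to the total. The function name, the amplitudes and the Kd values are hypothetical and chosen only to contrast a tight with a weak second site.

```python
def two_site_signal(ligand, kd1, kd2, f0=1.0, df1=0.5, df2=0.5):
    """Fluorescence signal for two independent binding sites.

    Assumes free ligand ~ total ligand (RNA probe far below Kd).
    df1 and df2 are the signal changes at saturation of each site;
    they may be negative if binding quenches the probe.
    """
    site1 = ligand / (kd1 + ligand)  # fractional occupancy of site 1
    site2 = ligand / (kd2 + ligand)  # fractional occupancy of site 2
    return f0 + df1 * site1 + df2 * site2


# Hypothetical constants (in uM), not values from the paper.
concentrations = (0.1, 1.0, 10.0)
tight_second_site = [two_site_signal(c, kd1=0.1, kd2=1.0) for c in concentrations]
weak_second_site = [two_site_signal(c, kd1=0.1, kd2=50.0) for c in concentrations]

for c, tight, weak in zip(concentrations, tight_second_site, weak_second_site):
    print(f"{c:5.1f} uM  tight Kd2: {tight:.3f}  weak Kd2: {weak:.3f}")
```

With the same Kd1, the curve with the lower Kd2 reaches the second-site signal change at much lower ligand concentrations, which is the distinction drawn above between 1a and 1d and the remaining four-arm compounds.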

Table 2 IIBh interaction parameters for selected 1,4-terphenyl molecules, measured by fluorescence intensity experiments.

To evaluate the specificity of subdomain IIB recognition, we measured binding to a control TARh hairpin containing the HIV-1 Tat-binding UCU bulge23 (Fig. 1D). The Kd(TARh)/Kd(IIBh) specificity ratios of compounds 1a and 1d ranged between 3 and 25, depending on ionic strength (Figs. 4A, S4 and S5 and Tables 2 and S2). We further assessed specificity by duplicating the IIBh-23fl association experiments in the presence of a 10-fold molar excess of tRNALys or a 26-base pair LTRd DNA duplex21. The binding curves of the compounds were affected by the presence of competitive tRNALys and to a lesser extent by LTRd (Figs. 4A, S4 and S5 and Tables 2 and S2).

We next used NMR spectroscopy to identify the binding site of terphenyls 1a, 1d and 3c within hairpin IIBh. All three compounds induced significant chemical shift changes in internal loop IIB and adjacent nucleotides only, and these variations were observed at low RNA:terphenyl molar ratios (1:1 and 1:2; Figs. 4B and S6). This indicated that the compounds interacted with the intended Rev34–50 binding site in subdomain IIB, and that the interactions were loop IIB-specific within the IIBh hairpin. Under conditions of fast exchange between bound and unbound states, terphenyls 1a and 1d induced larger chemical shift perturbations than 3c at the same molar ratios (Figs. 4B and S6). Assuming similar structures for the RNA-terphenyl complexes, this suggested that 1a and 1d had greater affinity for loop IIB relative to 3c, in line with the results obtained with fluorescence experiments. We also detected intermolecular NOEs between H1’ of the extrahelical A19 loop nucleotide and the aminoethyl protons of compound 1a, implying that the ligand associated to the loop from the major groove side, as observed for Rev34–50 (Fig. 1C)11,12. In fact, unrestrained docking calculations supported that one molecule of 1a bound diagonally across the major groove of loop IIB, occupying the binding site of the N-terminal segment of the Rev34–50 helix (Fig. 4C). One pair of bilateral aminoethyl groups contacted phosphate groups located in opposite strands, whereas the other pair of aminoethyl groups bound to the pocket formed by the S-turn residues G21 and G22, and the extrahelical A19 nucleotide, where several phosphate groups are in close proximity to each other.

Antiretroviral activity and cellular toxicity

When the antiviral activities of 1,4-terphenyl compounds were evaluated with a cellular HIV-1 infection assay, the groups attached to the terphenyl scaffold were found to have a substantial impact on the activities, as observed in the previous in vitro experiments. Compounds 1a and 3c had significant activity in the infection experiment, with EC50 values of 10.6 and 14.1 μM respectively, followed by 3a and 1d (EC50 = 35.5 and 57.9 μM respectively), and finally 4 (Table 3 and Figs. 5A and S7). The remaining terphenyls were inactive at the assay concentrations (up to 100 μM). Notably, none of the terphenyl compounds were toxic at concentrations below 100 μM (Table 3 and Figs. 5A and S7). These experiments clearly show the antiviral effect of these compounds on the HIV-1 cycle.

Table 3 Results of cellular assays for 1,4-terphenyl compounds. Inhibitory activity (EC50) in experiments based on infection with HIV virus and transfection with HIV- and LTR-dependent vectors, and cellular toxicity (CC50).

Figure 5

Antiviral activity and cellular toxicity of selected 1,4-terphenyl molecules 1a, 1d and 3c. (A) Antiviral activity in HIV-1 infection assays and cellular toxicity as function of compound concentration. (B) Inhibition of post-integration steps in HIV-1 transfection assays. In all cases, the results are expressed as percentage of relative luminescence units (RLUs), where 100% is the level of viral replication obtained in the presence of the vehicle used to dissolve the compounds.

Antiretroviral mechanism

The processes involved in the antiretroviral action of terphenyls 1a, 1d and 3c were studied with additional cellular assays. A possible effect on reverse transcription was assessed by measuring the levels of early and late HIV-1 reverse transcripts in the absence and presence of two terphenyl concentrations. None of the molecules interfered with the levels of late reverse DNA, although terphenyls 1a and 3c diminished the levels of early viral DNA sequences by 25%, suggesting that early reverse transcription copying over the LTR region could be affected by the presence of the compounds (Fig. S8). Overall, we only found a partial action of terphenyls on reverse transcription.

In an experiment based on transfecting a full-length competent HIV-1 vector, the EC50 values for compounds 1a, 1d and 3c ranged between 6.0 and 15.4 μM, close to those obtained in the infection assay (Fig. 5B and Table 3). This result indicated that the molecules mainly acted on transcriptional or post-transcriptional steps of the virus cycle.

We then specifically tested whether terphenyls 1a, 1d and 3c had an effect on the RRE-Rev system in cell culture by quantifying the levels of unspliced, single-spliced and multiple-spliced viral transcripts using RT-qPCR experiments21,22. Given that splicing takes place in the nucleus, a blockage of the RRE-Rev system should reduce the levels of unspliced or single-spliced transcripts and increase the proportion of multiple-spliced species. Although we did not find statistically significant results, the clearest patterns consistent with RRE-Rev inhibition were detected for compound 1d at 24 and 48 hours post-infection and 100 μM concentration (Figs. 6A and S9).

Figure 6

Effect of terphenyls 1a, 1d and 3c on HIV-1 RNA splicing and LTR-dependent gene expression. (A) Relative quantities (RQ) of unspliced, single-spliced and multiple-spliced HIV-1 RNA transcripts in cells treated with two concentrations of 1a, 1d or 3c. The image shows transcript quantities measured at 48 hours after infection, using transcript levels obtained from untreated cells as a reference (RQ = 1). (B) Inhibition of HIV-1 LTR-dependent gene expression as a function of terphenyl concentration. The results are expressed as a percentage of relative luminescence units (RLUs), where 100% is the luminescence level obtained in the presence of the vehicle used to dissolve the compounds.

We also determined the effect of 1a, 1d and 3c on viral transcription using an experiment based on transfecting a plasmid encoding a luciferase gene whose expression depends on the long terminal repeat (LTR) promoter of HIV-124. All three compounds inhibited LTR-dependent expression with approximately similar EC50 values, which were slightly lower relative to those measured in the HIV-1 transfection experiments (Fig. 6B and Table 3). To further assess the mechanism of antiviral action of the compounds, we checked whether they also interfered with the regulatory sequences of HTLV-1, a closely related retrovirus. Our results showed inhibition of luciferase expression driven by the LTR promoter of HTLV-1 with higher EC50 values (Table 3).

The IIBh association curves of terphenyls 1a and 1d were moderately displaced in the presence of a 10-fold molar excess of competitor LTRd DNA duplex (Figs. 4A, S4 and S5 and Tables 2 and S2). This duplex comprised binding sequences of transcription factors NF-κB and Sp-121, both of which have been shown to be essential for HIV-1 LTR promoter activity and virus replication25,26,27,28. To evaluate whether the observed effects of terphenyls 1a, 1d and 3c on LTR-dependent expression were exerted through LTR association, we again used EMSA experiments to compare the association to a 58-base pair LTRc sequence, corresponding to the core region of the HIV-1 LTR promoter and comprising several NF-κB and Sp-1 sites. The results indicated that terphenyls 1a and 1d, but not 3c or the inactive control 1b, associated to LTRc at low μM concentrations (Fig. S10A). Isothermal titration calorimetry (ITC) experiments were subsequently applied to compare the binding of terphenyl 1a to RRE subdomain IIB and a related DNA duplex. We obtained equilibrium dissociation constants of 0.5 and 6.3 μM, respectively, indicating that the affinity of the compound for RRE subdomain IIB was approximately one order of magnitude higher relative to the DNA duplex (Fig. S10B).
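
The "one order of magnitude" comparison follows directly from the two ITC constants, and can be expressed as a free energy difference with the standard relation ΔΔG = RT ln(Kd,DNA/Kd,RNA). The short sketch below does this arithmetic. The temperature of 298.15 K is an assumption, since the ITC temperature is not stated in this section, and the function names are illustrative.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def affinity_ratio(kd_weak, kd_tight):
    """Ratio of two dissociation constants given in the same units."""
    return kd_weak / kd_tight


def delta_delta_g(kd_weak, kd_tight, temperature_k=298.15):
    """Binding free energy difference in kcal/mol, RT * ln(Kd_weak / Kd_tight).

    temperature_k is an assumed value, not one reported in the text.
    """
    return R_KCAL * temperature_k * math.log(kd_weak / kd_tight)


# Kd values (uM) reported for the DNA duplex and RRE subdomain IIB.
ratio = affinity_ratio(6.3, 0.5)
ddg = delta_delta_g(6.3, 0.5)
print(f"Kd ratio: {ratio:.1f}, ddG: {ddg:.2f} kcal/mol")
```

Under this temperature assumption the 12.6-fold Kd ratio corresponds to a difference of roughly 1.5 kcal/mol in binding free energy in favour of the RNA.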

Discussion

Here we describe the properties of terphenyl compounds with 1,4-bilateral substitutions designed to mimic all faces of the RNA-binding α-helix of Rev. By mimicking the Rev34–50 helix, the compounds were intended to occupy not only the high-affinity site in subdomain IIB of the RRE, but also the remaining RRE sites recognized by the Rev monomers, thereby blocking the formation of the RRE-Rev ribonucleoprotein complex, which remains an unexploited target in HIV-1 chemotherapy. Given the arginine-rich Rev34–50 α-helix, the terphenyl compounds contained several positively charged aminoalkyl side chains. The presence of this positive charge increased the aqueous solubility of the p-terphenyl scaffold, but compromised binding specificity and also influenced the antiviral mechanism of the compounds, as discussed below. No cellular toxicity, however, was detected for any of the compounds. Furthermore, a comparison between the active concentrations obtained in experiments involving purified nucleic acid and protein species and those carried out in cells (Tables 1, 2, 3 and S2) suggested optimal cellular and nuclear penetration properties.

Fluorescence anisotropy and electrophoresis experiments indicated that terphenyls 1a and 1d, containing four bilateral 2-aminoethyl side chains and two bilateral methyl groups on the central aromatic ring, inhibited formation of the RRE-Rev complex with low-μM IC50 values (Table 1 and Fig. 3). NMR spectroscopy analyses showed that these molecules occupied the Rev34–50 site in loop IIB (Figs. 4B and S6), and indicated that RNA recognition took place from the major groove, as reported for the helix (Figs. 1C and 4C)11,12.

Altogether, these results indicated that both 1a and 1d recognized RRE subdomain IIB and inhibited the RRE-Rev contact by mimicking Rev34–50. This finding is relevant because the majority of synthetic α-helix mimics match residues located on just one helical face2,3,4, and of the few multi-facial mimics described so far, none target a protein–RNA interaction29,30.

The curves of subdomain IIB-terphenyl association obtained with fluorescence intensity experiments at low ionic strength were best fit with a two-site model, and compounds 1a and 1d had low Kd1 and Kd2 values (Fig. 4A and Tables 2 and S1). A requirement of tight binding to both sites may explain why the remaining terphenyl compounds did not block the IIB-Rev34–50 or RRE-Rev interactions as efficiently (Table 1 and Figs. 3 and S3). Molecules with small structural changes relative to 1a and 1d (such as 1b, 1c or 2a-2b) exhibited low Kd1 but higher Kd2 values, indicating that these changes affected terphenyl association to the second site. This could also explain why seemingly small structural differences make such an impact on the activity of these compounds; not only does the terphenyl have to bind well initially to the RRE subdomain IIB (Kd1), but a second molecule must also bind to the RNA-ligand complex to exert significant inhibitory activity. This means that ligand-ligand interactions could be just as important as ligand-receptor interactions in this case, and high activity should be achieved with a balance of the two. On the other hand, compounds 3a-3c and 4, bearing three instead of four 2-aminoethyls, had both higher Kd1 and Kd2 constants (Fig. 4A and Table S1), indicating that four aminoalkyl chains are favoured for tighter binding to subdomain IIB. These RNA affinity differences between compounds were confirmed with fluorescence experiments carried out under higher ionic strength, closer to the ionic conditions present in a cell environment (Table S2 and Fig. S5).

The specificity ratios for subdomain IIB recognition relative to a control RNA molecule containing the TAR bulge ranged between 3 and 25 (Figs. 4A, S4 and S5 and Tables 2 and S2), similar to those obtained for previous 1,3-terphenyl molecules and the Rev34–50 helix itself7, and NMR spectroscopy titrations indicated specific binding to internal loop IIB (Figs. 4B and S6). However, the effect of competitive tRNA on the IIB association curves (Figs. 4A, S4 and S5 and Tables 2 and S2) indicated that due to the presence of positive charge, the specificity of these compounds was lower than that of RRE-Rev inhibitors identified by screening21,22 and other small-molecule RNA binders reported in the literature31,32.

When the antiviral effect of these compounds was assessed, a clear impact on HIV replication was found, with inhibitory concentrations in the micromolar range. Terphenyls 1a and 1d, but not 1b-c or 2a-b, inhibited HIV-1 replication with low-μM EC50 values (Figs. 5A and S7 and Table 3), suggesting that efficient RNA binding was important for the antiretroviral activity of terphenyl molecules containing four bilateral 2-aminoethyl groups. Transfection experiments involving a full-length provirus suggested that these molecules mainly acted on transcriptional and/or post-transcriptional processes of the viral cycle (Fig. 5B and Table 3). In agreement with this result, qPCR experiments measuring the effect of compounds 1a and 1d on the levels of reverse HIV-1 transcripts did not support a strong action on reverse transcription (Fig. S8). RT-qPCR experiments revealed a tendency of terphenyl 1d to increase the levels of multiple-spliced HIV-1 transcripts relative to single-spliced and unspliced species, an effect consistent with cellular inhibition of Rev function (Figs. 6A and S9). In addition, 1a and 1d inhibited both HIV-1 and HTLV-1 LTR promoter-dependent gene expression (Fig. 6B and Table 3). The higher EC50 obtained for HTLV-1 relative to HIV-1 inhibition indicated that the compounds acted with some specificity on the HIV-1 LTR system. On the other hand, the inhibition of the HIV-1 LTR was likely based on transcriptional blockage, since increasing concentrations of 1a and 1d translated into decreased levels of the viral transcripts quantified in RT-qPCR experiments at 72 hours post-infection (Fig. S9). EMSA experiments analysing compound association to the core region of the HIV-1 LTR promoter revealed a perturbation of the DNA band at relatively low concentrations of terphenyls 1a and 1d, which was not detected for the inactive compound 1b (Fig. S10A). This result supported a mechanism of transcriptional blockage based on LTR DNA association, which according to calorimetry experiments takes place with an affinity approximately 10 times lower than that for RRE subdomain IIB association (Fig. S10B).

The terphenyl library included a subset of molecules containing three instead of four aminoethyl groups and, among these, compounds 3a and 3c had significant antiretroviral activity (Table 3). Compounds 3a and 3c had reduced RNA affinity and RRE-Rev inhibition capacity relative to 1a and 1d (Figs. 3 and S3–S6 and Tables 1, 2 and S2), and terphenyl 3c associated only very weakly to LTR DNA (Fig. S10A). This compound blocked LTR-dependent gene expression (Table 3 and Fig. 6B). However, RT-qPCR experiments indicated that 3c had a weaker effect on HIV-1 transcription relative to 1a and 1d (Figs. 6A and S9), suggesting a different mechanism of action relative to terphenyl molecules containing four aminoalkyl side chains.

In conclusion, the results compiled in this manuscript indicate that four-arm terphenyls are multi-target agents acting on different steps of the HIV cycle, including transcription and Rev-dependent transport. Additional work will be needed to improve RRE binding affinity and specificity and to increase the antiretroviral activity of the terphenyl scaffold, for example by modifying the functional groups contained in the lateral chains, or by inserting suitable asymmetric substituents in one or more of the benzene rings. Nevertheless, an encouraging property of all of these compounds is their reduced cellular toxicity relative to other antiretroviral RRE-Rev inhibitors21,22. A further potential advantage of these mimics may lie in the fact that the Rev34–50 motif is highly conserved among HIV isolates and serves not only as an RNA association element, but also as a non-canonical NLS and as a hotspot for interaction with cellular proteins33,34,35. This means that improved Rev34–50 mimics could possibly modulate other pathways in addition to RNA transport or splicing, increasing their chances of altering the viral cycle and making the emergence of target-related resistance more difficult. Multi-target compounds36 such as these terphenyl mimics are receiving increased attention in the drug discovery field and may be particularly effective for treating infectious and/or multi-factorial diseases such as AIDS, as they could reduce the likelihood of drug resistance and help simplify current therapies.

Methods

Molecular modeling

The conformational space of the 1,4-terphenyl molecule 1a was sampled using the MMFF94s force field37 and the Stochastic Search option of the MOE software package (CCG Inc.). The minimum-energy conformation was superposed on Rev34–50 α-helices obtained from PDB structures 1ETG11, 4PMI12 and 3LPH9, to verify which side chains of the α-helix were matched by the bilateral substituents of the terphenyl molecule. Three-dimensional models of a 1:1 complex of loop IIB with 1a were built by docking the ligand into 1ETG and 1ETF11 as well as 4PMI12 RRE subdomain IIB structures using Gold 5.238. For 4PMI, the missing atoms of the A19 base were added using standard geometries. In all cases, the binding site was defined with a large 20 Å radius around nucleotide C20, in agreement with the NMR chemical shift perturbations induced by the ligand. The calculations were unrestrained, employed the GoldScore fitness function38 and generated 20 solutions for each ligand with maximum search efficiency. For the 4PMI RNA structure, the docking run resulted in a converged set of eleven (55%) 1a solutions that had pair-wise root mean square deviations lower than 1.64 Å and included all better-scored poses. We obtained similar results when docking 1a into 1ETG or 1ETF structures.
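
As an illustration of what a converged set of docking poses means, the sketch below groups poses whose pair-wise RMSD stays under a cutoff, starting from the best-scored pose. It is a simplified, hypothetical procedure and not the clustering performed by Gold: the function names and the toy coordinates are invented for the example, poses are assumed to be sorted from best to worst score, and no correction is made for the symmetry of 1a, which a real analysis would need.

```python
import math


def rmsd(pose_a, pose_b):
    """RMSD between two poses given as equal-length lists of (x, y, z).

    No superposition is applied, because docked poses already share
    the coordinate frame of the receptor.
    """
    squared = sum(
        (a - b) ** 2
        for atom_a, atom_b in zip(pose_a, pose_b)
        for a, b in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(pose_a))


def converged_set(poses, cutoff):
    """Indices of poses whose RMSD to every accepted pose is below cutoff.

    Greedy pass over poses assumed sorted from best to worst score.
    """
    members = [0]
    for i in range(1, len(poses)):
        if all(rmsd(poses[i], poses[j]) < cutoff for j in members):
            members.append(i)
    return members


def shifted(pose, dy):
    """Toy helper: copy of a pose translated along y by dy."""
    return [(x, y + dy, z) for x, y, z in pose]


# Toy three-atom poses (in Angstrom), not real docking output.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
toy_poses = [reference, shifted(reference, 0.5), shifted(reference, 5.0), shifted(reference, 1.0)]

members = converged_set(toy_poses, cutoff=1.64)
fraction = len(members) / len(toy_poses)
print(members, f"{100 * fraction:.0f}% of poses")
```

In the docking run described above, the equivalent fraction was eleven of 20 solutions (55%) at the 1.64 Å threshold.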

General methods for the synthesis of 1,4-substituted terphenyl compounds

Reactions were carried out under an inert atmosphere of N2 or Ar using standard Schlenk techniques or sealed tubes, unless otherwise indicated. Solvents were purified prior to use: THF was distilled from sodium and benzophenone and dichloromethane from CaCl2. Reagents were used as supplied by the commercial sources without further purification. The reactions were monitored by TLC using 0.25 mm precoated silica-gel plates. Visualization of the TLC plates was carried out with UV light and/or aqueous ceric ammonium molybdate solution or potassium permanganate stain. Flash column chromatography was performed with the indicated solvents on silica gel 60 (particle size: 0.040–0.063 mm). 1H, 13C and 19F NMR spectra were recorded with a Bruker 300 MHz spectrometer. A QTOF mass analyser system was used for HRMS measurements. Stock solutions were prepared by dissolving each compound in H2O at a concentration of 5 mM. The concentration of all terphenyl stocks was verified by NMR spectroscopy using the ERETIC utility of Topspin 3.5 (Bruker Biospin).

RNA, DNA, peptide and protein samples

The composition and preparation of the following species have been described in detail in previous reports7,21: 28-nt subdomain IIB RNA oligonucleotides IIBh and IIBh-23fl (Fig. 1D), 234-nt RNA sequence RRE (Fig. 3B), 26-nt and 16-nt self-complementary DNA oligonucleotides LTRd and DNAd, full-length protein Rev, unlabelled Rev34–50 peptide revp, and fluorescein isothiocyanate (FITC)-labelled Rev34–50 peptide frevp. Additionally, a TARh-8fl RNA molecule containing a FITC probe linked to extra-helical loop nucleotide U8 (Fig. 1D) was purchased HPLC-purified from Horizon Discovery and desalted; a 58-base pair DNA duplex corresponding to the core region of the HIV-1 LTR promoter (LTRc) was obtained by PCR amplification from an HIV-1 LTR-luc plasmid24 utilizing GGGACTTTCCGCTGGGGAC (forward) and GGCGGGACTGGGGAGTGGC (reverse) primers; and Escherichia coli tRNALys was transcribed in vitro from a BstNI-digested pUC19 plasmid and purified by gel electrophoresis. RRE, Rev and LTRc were used in electrophoretic mobility shift assays (EMSA), and DNAd was employed as a specificity control in these experiments. Unlabelled IIBh was utilized in nuclear magnetic resonance (NMR) spectroscopy and fluorescence anisotropy experiments. IIBh-23fl and TARh-8fl were employed in fluorescence intensity experiments, and tRNALys and LTRd were used as RNA and DNA specificity controls in the fluorescence intensity tests.

Fluorescence anisotropy

These experiments were conducted in a Victor X5 (PerkinElmer) plate reader as described before7,21, using 10 nM frevp and 60 nM IIBh. Each experiment had one positive (a mixture of IIBh and frevp, equivalent to 0% inhibition) and two negative (isolated frevp as well as a mixture of IIBh, frevp and neomycin B) controls. Since the fluorescence of several 1,4-terphenyl compounds was found to interfere with this assay at high concentrations, a baseline correction was performed: anisotropy data of all isolated molecules were generated and subtracted from the signal obtained in the presence of IIBh/frevp at the same concentration values. IC50 values were then calculated with GraphPad Prism using the following sigmoidal inhibitory model:

$$A=\frac{100}{1+{10}^{\log C-\log I{C}_{50}}}$$

where A is normalized anisotropy and C is total concentration of compound. We only quantified with this equation the activity of those compounds that, according to the experiment controls, induced the expected reduction in anisotropy after baseline correction; all other molecules were considered inactive. Each fluorescence anisotropy experiment was repeated at least two times.
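As an illustration of how the sigmoidal inhibitory model relates normalized anisotropy to IC50, the following minimal sketch evaluates the equation and recovers log IC50 from synthetic data by least squares. It does not reproduce the GraphPad Prism fitting procedure: the golden-section search, the function names, the search bounds and the data points (in μM) are assumptions made for this example.

```python
import math

def normalized_anisotropy(conc, ic50):
    """A = 100 / (1 + 10^(log C - log IC50)); conc and ic50 share the same units."""
    return 100.0 / (1.0 + 10.0 ** (math.log10(conc) - math.log10(ic50)))

def fit_log_ic50(concs, signals, lo=-3.0, hi=3.0, tol=1e-8):
    """Golden-section search for the log10(IC50) that minimizes the squared error.

    Assumes the error has a single minimum between lo and hi (log10 units).
    """
    def sse(log_ic50):
        ic50 = 10.0 ** log_ic50
        return sum((a - normalized_anisotropy(c, ic50)) ** 2
                   for c, a in zip(concs, signals))

    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if sse(c) < sse(d):
            b = d
        else:
            a = c
    return (a + b) / 2.0

# Synthetic, noise-free titration generated with a hypothetical IC50 of 5 μM.
concs = [0.1, 1.0, 10.0, 100.0]
signals = [normalized_anisotropy(c, 5.0) for c in concs]

log_ic50 = fit_log_ic50(concs, signals)
print(f"fitted IC50 = {10 ** log_ic50:.2f} μM")
```

By construction, the model gives A = 50 when C equals IC50, which is a convenient sanity check on any fitted curve.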

Electrophoretic mobility shift assays (EMSA)

The experiments monitoring RRE-Rev inhibition utilized 78 nM full-length RRE and 1.32 μM full-length Rev dissolved in 10 mM HEPES pH 7.5, 300 mM KCl, 1 mM MgCl2 and 0.5 mM EDTA binding buffer, and increasing concentrations of each compound up to 50 μM21,39. The reactions were incubated at room temperature for 20 minutes and loaded onto 8% polyacrylamide gels with TB running buffer. Gels were run at 4 °C for 1–4 hours at 150 V, and the bands were stained with SYBR gold and quantified with Quantity One 4.1 analysis software. Experiments monitoring binding to the 58-base pair HIV-1 LTRc core segment utilized 20 nM LTRc duplex dissolved in the same binding buffer, and increasing concentrations of each compound. The reactions were similarly incubated and loaded onto 20% polyacrylamide gels with TB running buffer. The gels were run at 4 °C for 3.5 hours at 150 V, and the bands were stained and quantified as described above. The specificity of LTRc association was evaluated by duplicating the experiments in the presence of a 100-fold molar excess of DNAd duplex (28-fold base-pair molar excess). In all cases, we monitored the disappearance of the band corresponding to high-order RRE-Rev complexes or free LTRc, and 50% response RC50 values were determined with Prism by fitting the data to a sigmoidal inhibitory model:

$$I-I_{min}=\frac{I_{max}\,C}{1+C/RC_{50}}$$

where I is the intensity of the band corresponding to LTRc or high-order RRE-Rev species at compound concentration C, Imax the best-fit value for maximum intensity, and Imin the minimum intensity obtained at the highest concentration of inhibitor. All EMSA experiments were repeated three times for each compound.
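The base-pair molar excess quoted for the DNAd competitor follows from the duplex lengths. The short calculation below assumes that the 16-nt self-complementary DNAd oligonucleotide forms a fully paired 16-base pair duplex and compares it with the 58-base pair LTRc duplex; the function name is introduced here for illustration.

```python
def base_pair_excess(molar_excess, competitor_bp, target_bp):
    """Convert a duplex molar excess into a base-pair molar excess."""
    return molar_excess * competitor_bp / target_bp

# 100-fold duplex excess of DNAd (assumed 16 bp) over LTRc (58 bp).
excess = base_pair_excess(100, 16, 58)
print(f"{excess:.1f}-fold base-pair excess")  # 27.6, i.e. about 28-fold
```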

Fluorescence intensity

These experiments measured association to IIBh-23fl or TARh-8fl RNA molecules labelled with fluorescein at extrahelical loop nucleotides U23 and U8, respectively (Fig. 1D), and were carried out under two different ionic conditions in a Victor X5 plate reader, using excitation and emission wavelengths of 485 and 520 nm, respectively. We also attempted to measure association to an alternative IIBh hairpin containing 2-aminopurine instead of adenine at unpaired loop IIB residue A1921, but all terphenyls fluoresced at the excitation wavelength of this fluorophore. IIBh-23fl or TARh-8fl (at 100 nM concentration) was equilibrated for 5 minutes after each ligand addition in a buffer containing either 10 mM sodium phosphate pH 6.6 and 0.1 mM EDTA or 10 mM HEPES pH 7.5, 200 mM KCl and 2 mM MgCl2. In addition to the TARh specificity control, the RNA and DNA specificity of the IIBh interactions was assessed by duplicating the experiments in the presence of a 10-fold molar excess (1 μM) of either tRNALys or DNA duplex LTRd. The equilibrium dissociation constants Kd were determined by fitting the fluorescence intensity curves with DYNAFIT40. We used one-site, two independent-sites and two interacting-sites binding models for all curves, and the best model was automatically selected by model discrimination analysis40, except where indicated. The final graphs were plotted with Prism. All fluorescence intensity experiments were performed at least two times for each compound and condition.
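For orientation, the simplest of the three binding models (one site) can be sketched as follows. This is a generic mass-action treatment, not the DYNAFIT implementation: it assumes that the fluorescence changes linearly with the fraction of labelled RNA bound, and the function names, the intensity values and the Kd used in the example are hypothetical. Concentrations are in μM, with the RNA fixed at 0.1 μM (100 nM) as in the experiments.

```python
import math

def complex_conc(rna_total, ligand_total, kd):
    """Concentration of the 1:1 complex from the exact quadratic solution."""
    s = rna_total + ligand_total + kd
    return (s - math.sqrt(s * s - 4.0 * rna_total * ligand_total)) / 2.0

def fluorescence(ligand_total, kd, f_free, f_bound, rna_total=0.1):
    """Observed intensity as a weighted sum of free and bound RNA signals."""
    fraction_bound = complex_conc(rna_total, ligand_total, kd) / rna_total
    return f_free + (f_bound - f_free) * fraction_bound

# Hypothetical titration: Kd = 1 μM, intensity dropping from 100 to 60 on binding.
for ligand in (0.0, 0.5, 1.0, 5.0, 20.0):
    print(f"{ligand:5.1f} μM ligand -> F = {fluorescence(ligand, 1.0, 100.0, 60.0):.1f}")
```

The quadratic form accounts for ligand depletion, which matters when Kd approaches the RNA concentration; the two-site models add further species to the same mass balance.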

NMR spectroscopy

NMR spectra were acquired on Bruker Avance III 500 MHz or cryoprobe-equipped Bruker Avance 600 MHz spectrometers, and analysed using Topspin 1.3 (Bruker Biospin) and Sparky 3.11041. The IIBh RNA samples were previously microdialyzed in an aqueous solution containing 10 mM sodium phosphate (pH 6.0) and 0.1 mM EDTA. The interaction of 30–50 μM (5–7 ODs) IIBh samples with terphenyl compounds was monitored at 27 °C using one- and two-dimensional (TOCSY) experiments at increasing ligand:RNA molar ratios: 1:1, 2:1, and 4:1. The complex of IIBh with 1a was also analysed at 2:1 and 4:1 1a:RNA ratios with NOESY experiments employing a recycle delay of 2 seconds and 600 or 800 ms mixing time.

Isothermal titration calorimetry

These experiments were performed at 25 °C in MicroCal PEAQ-ITC or Nano-ITC microcalorimeters, and the data were subsequently analysed with MicroCal or Nanoanalyze software, respectively. All species were dissolved in aqueous solutions containing 10 mM sodium phosphate (pH 7.4 or 8.2) and 0.1 mM EDTA. For the IIBh:1a interaction, the pH was 7.4, and 10 or 20 μM solutions of IIBh in the sample cell were titrated with 19 injections of 350 or 500 μM solutions of 1a. We previously reported that titrations of IIBh with 1,3-terphenyl compounds exhibited a higher affinity transition followed by a complex, lower affinity step7. The IIBh:1a titration experiments focused on the higher affinity step, associated with loop IIB binding as revealed by NMR spectroscopy. For the DNAd:1a interaction, the pH was 8.2, and 800 or 900 μM solutions of 1a were titrated into 20 μM solutions of DNAd duplex in the sample cell with 19 injections of 2 µL. In both cases, the titration experiments were repeated three times, and the resulting association curves were fitted using a model with a single set of binding sites42.

Plasmids, viruses and cells for ex vivo assays

Vectors pNL4.3-Luc and pNL4.3-Ren were generated by cloning the luciferase and renilla genes, respectively, in the nef site of HIV-1 proviral clone pNL4.343. These constructs generate replication-competent viruses as previously shown44. Plasmids pLTR-luc24 and pLTR(HTLV)-luc45 carried a luciferase gene under the control of the HIV-1 or HTLV-1 LTR promoters, respectively. MT-246 and 293 T cells were cultured as described previously21.

Evaluation of anti-HIV-1 activity and cellular toxicity

The methodology used to perform and analyse these experiments has been reported elsewhere7,21,22. Briefly, infectious supernatants were obtained by transfecting 293 T cells with plasmid pNL4.3-Ren, MT-2 cells were infected with these supernatants in the presence of the compounds, and anti-HIV activity was quantified 48 h post-infection by determining luciferase activity in cell lysates compared to a non-treated control (100%). Cellular viability was evaluated in mock-infected cells similarly treated with the same concentrations of compounds using the CellTiter-Glo (Promega) assay. 50% effective (EC50) and cytotoxic (CC50) concentrations were calculated with Prism using log(inhibitor) vs response non-linear regression analyses. The results represent the average of at least three independent experiments.

Quantification of early and late reverse transcription

MT-2 cells were pre-treated with two different concentrations of terphenyl molecules 1a, 1d or 3c, selected on the basis of the observed RRE-Rev IC50 and cellular EC50 values, and infected with NL4.3 wild-type HIV-1 for 5 hours. Total genomic DNA was isolated with a QIAamp DNA blood mini kit (Qiagen) and quantified by spectrophotometry. Early and late viral DNAs were quantified by qPCR as previously described47. qPCR was performed in triplicate in a StepOne Real-Time PCR system using standard cycling conditions. Serial dilutions of genomic DNA from the 8E5 cell line, which contains a single integrated copy of HIV-143, were used to generate the standard curve. The CCR5 gene was used as an endogenous control.
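Quantification against a standard curve built from serial dilutions can be sketched as follows. This is a generic illustration rather than the published analysis: it assumes a linear relationship between Ct and the logarithm of the template copy number, and the function names and all Ct values are hypothetical.

```python
import math

def fit_standard_curve(copies, cts):
    """Least-squares line Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

def copies_from_ct(ct, slope, intercept):
    """Interpolate a copy number from a measured Ct."""
    return 10.0 ** ((ct - intercept) / slope)

def mean(values):
    return sum(values) / len(values)

# Hypothetical dilution series of the standard and its Ct values.
std_copies = [1e2, 1e3, 1e4, 1e5]
std_cts = [30.00, 26.68, 23.36, 20.04]
slope, intercept = fit_standard_curve(std_copies, std_cts)

# Hypothetical triplicate Ct values for a viral amplicon in one sample.
viral_copies = copies_from_ct(mean([24.9, 25.0, 25.1]), slope, intercept)
print(f"slope = {slope:.2f}, viral DNA = {viral_copies:.0f} copies")
```

The same interpolation applied to the endogenous control would allow the viral copy number to be expressed relative to the amount of cellular DNA in each sample.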

Transfection assays

MT-2 cells were transfected with plasmids containing a luciferase reporter gene whose expression was under the control of the full-length proviral HIV-1 (NL4.3-luc), the HIV-1 LTR promoter (pLTR-luc), or the HTLV-1 LTR promoter (pLTR(HTLV)-luc). After transfection, cells were treated with different compound concentrations, and activity was quantified 48 h later by determining luciferase activity in cell lysates as described7,21,22. 50% effective (EC50) concentrations were calculated with Prism using log(inhibitor) vs response non-linear regression analyses, and the results represent the average of at least three independent experiments.

Analysis of HIV-1 RNA splicing

MT-2 cells were infected with NL4.3 virus for 2 hours (10 ng per 10^6 cells) and treated with two different concentrations of compounds 1a, 1d or 3c for 24, 48, 72 or 96 hours. The compound concentrations were chosen on the basis of the observed RRE-Rev IC50 and cellular EC50 values. Total cellular RNA was isolated, treated with DNase I and reverse-transcribed as previously described21,22. Unspliced, single-spliced and multiple-spliced HIV-1 RNA transcripts were quantified by qPCR relative to a control obtained from untreated cells, using the primers described by Mohammadi et al.48, and GAPDH as an endogenous control.
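One common way to express transcript levels relative to an untreated control with an endogenous reference gene is the comparative Ct (ΔΔCt) calculation sketched below. The text does not state that this exact formula was used, so the sketch is an assumption; it further presumes close to 100% amplification efficiency for both amplicons, and the function name and Ct values are hypothetical.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of the target transcript in treated vs control cells (2^-ΔΔCt)."""
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_treated - delta_control)

# Hypothetical Ct values: a viral transcript and the reference gene (e.g. GAPDH).
fold = relative_expression(26.0, 18.0, 24.0, 18.0)
print(f"transcript level relative to untreated cells: {fold:.2f}")  # 0.25
```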