Mapping the native interaction surfaces of PREP1 with PBX1 by cross-linking mass-spectrometry and mutagenesis

Both onco-suppressor PREP1 and the oncogene MEIS1 bind to PBX1. This interaction stabilizes the two proteins and allows their translocation into the nucleus and thus their transcriptional activity. Here, we have combined cross-linking mass-spectrometry and systematic mutagenesis to detail the binding geometry of the PBX1-PREP1 (and PBX1-MEIS1) complexes, under native in vivo conditions. The data confirm the existence of two distinct interaction sites within the PBC domain of PBX1 and unravel differences among the highly similar binding sites of MEIS1 and PREP1. The HR2 domain has a fundamental role in binding the PBC-B domain of PBX1 in both PREP1 and MEIS1. The HR1 domain of MEIS1, however, seem to play a less stringent role in PBX1 interaction with respect to that of PREP1. This difference is also reflected by the different binding affinity of the two proteins to PBX1. Although partial, this analysis provides for the first time some ideas on the tertiary structure of the complexes not available before. Moreover, the extensive mutagenic analysis of PREP1 identifies the role of individual hydrophobic HR1 and HR2 residues, both in vitro and in vivo.


Results
Identification of the intra-and inter-molecular interacting sites in the PBX1-PREP1 and PBX1-MEIS1 complexes. Cross-linking coupled with mass-spectrometry (XLMS) [40][41][42][43][44][45][46] enables visualization of the interacting regions by bridging two proteins with a cross-linker and creating hybrid peptides that are identified by mass-spectrometry following protein dissection by enzymatic proteolysis. The chemical crosslinker covalently connects two functional groups (i.e. lysines) in close proximity and exposed on the protein's outer surface. Since distant or buried residues are not able to react with the cross-linker, XLMS allows to map protein complex interfaces.
In these experiments we used PREP1 and PBX1 constructs lacking the C-terminal region not involved in heterodimerization 3,4 and very prone to degradation in solution 47 . On the contrary, we used the full-length MEIS1 for studying the MEIS1-PBX1 complex. As cross-linker, we used the commercial BS 3 , a homobifunctional cross-linker containing a 11.3 Å spacer, able to link two lysine residues (see "Materials and methods" for details; for the workflow of the experiment see Supplementary Information Figure S1A). Purified PREP1-PBX1 and MEIS1-PBX1 complexes were cross-linked separately with BS 3 (Supplementary Information Figure S1B) prior to in-gel double digestion with Glu-C and trypsin. Cross-linked peptides were analysed by nLC-ESI-MS/MS, identified using the pLink 2 software and manually curated. The results derive from two biological replicates for each protein complex. The same procedure was applied to the individual proteins to identify potential intramolecular cross-linking. All the spectra are deposited at the PeptideAtlas repository (PASS01497).
In the case of PREP1-PBX1, the identified cross-linked peptides are listed in Tables 1, 2, and 3 while the fulllist with the statistical significance is available in Supplementary Information Tables S4-S7. The XLMS analysis revealed 12 unique intra-PBX1, 3 intra-PREP1 and 12 inter-protein cross-links, in addition to 6 loop-linked sites Scientific RepoRtS | (2020) 10:16809 | https://doi.org/10.1038/s41598-020-74032-w www.nature.com/scientificreports/ (2 from PBX1 and 4 from PREP1). With the MEIS1-PBX1 dimer a total of 24 non-redundant cross-linked sites was obtained: 8 inter-, 11 intra-molecular cross-linked sites and 5 loop-linked sites (see Tables 4, 5 and 6 for the  summary of peptides identified, and Supplementary Information Tables S8-S11 for the full list of redundant peptides and their statistical significance).
To visualize the regions of interaction, since no high-resolution structural data on full-length PREP1, MEIS1 or PBX1 are available, we mapped the cross-linked peptides on linearized molecules (Fig. 1), also evidencing the reactive lysines ( Supplementary Information Figures S2A, S2B, and S2C). In Fig. 1, thick lines represent Table 1. Summary of PBX1 and PREP1 intermolecular peptides. The number under brackets are the lysine residues cross-linked, in the first column refers to the residue in the full-length protein, in the second column to the residue in the peptide. The peptides here reported correspond to peptide identified in two biological replicates of the XLMS experiment.

Scientific RepoRtS
| (2020) 10:16809 | https://doi.org/10.1038/s41598-020-74032-w www.nature.com/scientificreports/ cross-links identified in both experimental replicates while thinner ones refer to those detected in single experiments, filtered with the stringent criteria defined in the Methods section. The XLMS analysis identified in PREP1 the following cross-linked lysines: K55(HR1), K151 and K153(HR2) and K268 (homeodomain); those identified in PBX1 were K87(PBC-A) and K153 and K195 in PBC-B. These cross-linked residues connect the HR1 and HR2 domains of PREP1 to PBC-A and PBC-B of PBX1, respectively. In addition, K87 on PBX1 was also found cross-linked with K268 of PREP1, suggesting a special proximity of PBC-A to the homeodomain of PREP1. Likewise, the homeodomain of PBX1 was found cross-linked with the homeodomain [(K242(PBX1)-K268(PREP1)] and with the HR2 domain (K266(PBX1)-K153(PREP1) of PREP1. The C-terminal residues K297(PBX1) and K308(PBX1) were found cross-linked with several PREP1 lysines mainly located in the central part of the protein (K99, K134, K140, K153, K268 and K335), possibly reflecting a flexibility of this region. Altogether the data suggest that the main interacting regions in the PREP1-PBX1 complex are included in the HR1-HR2 domains of PREP1 and in the PBC-A-PBC-B of PBX1.
In MEIS1, the HR2 domain (K161, K178 and K195) was cross-linked with the PBC-A (K65 and K87), the PBC-B (K153), the homeodomain (K242) and with the C-terminal region (K297 and K308) of PBX1. No MEIS1 residue in the HR1 cross-linked with PBX1, suggesting that the HR1 domain of MEIS1 is not or less involved in the heterodimerization. We thereby conclude that in the MEIS1-PBX1 complex, the HR2 of MEIS1 and the PBC-B of PBX1 are the domains mostly involved in the formation of the heterodimer. It must be noticed that K195 in the PBC-B of PBX1 was not found in any inter cross-linked peptide with MEIS1, while it was very reactive in the complex with PREP1, in particular with the PREP1 homeodomain (K268). The lack of reactivity of the homeodomain of MEIS1 might be due to an inherent limit of the technology. Indeed, our XLMS analysis revealed   (7) PBX1 (297)-PBX1(308) NIGKFQEE(4)-ANIYAAK (7) PBX1 (242)-PBX1(308) GGSKSDSEDITR(4)-ANIYAAK (7) PBX1 (195)-PBX1 (308) TRPISPKE(7)-ANIYAAK (7) PBX1 (87) Figure S2B and Table 6), and therefore must be exposed and reactive. On the other hand, the homeodomain region of MEIS1 is very rich in K, R, E, and D (see Supplementary Information Figure S2B) which would lead to an extreme fragmentation of peptides upon digestion with trypsin and Glu-C. For these reasons, any intra or inter-protein interaction occurring C-terminally of the HR2 (K195) of MEIS1 might be invisible by XLMS. Therefore, we have used an alternative approach (TR-FIA analysis with PBX1 deletion mutants, see below) to gain information on the importance of 197-207 stretch of PBX1 in the interaction with MEIS1 (see below).
Assuming that the proteins under study are in a heterodimeric form, intramolecular cross-links identified by XLMS are also useful to examine the entire protein folding and to further validate the intermolecular connections. In our experiments, we find that PBX1 retains the same overall folding by binding to PREP1 and to MEIS1. In fact, in both cases we detect the same intramolecular peptides (see Tables 2 and 5 and Supplementary  Information Tables S5, S7, S9, and S11). Also, a visual inspection of the PBX1 cross-linked peptides position suggests that PBC-A is localized near PBC-B and the homeodomain. To make the results of the experiments clearer to the reader, we have designed the interacting regions in a cartoon form (Supplementary Information Figures S1C and S1D).
Potential PBX1-interaction interfaces of PREP1 and MEIS1. Given the previous literature data on the PREP1-PBX1 and MEIS1-PBX1 interactions 20,35,36,38,39,47 , it is clear that PBX1-PREP1 and PBX1-MEIS1 complexes are held together by hydrophobic interactions through leucines and isoleucines located in their N-terminal moiety in regions called HR1 and HR2. These regions are almost identical in PREP1 and MEIS1 (Supplementary Information Figures S2A and S2B) (20/25 residues in HR1 and 18/22 in HR2). We noticed in the PREP subfamily a third conserved leucine-rich region upstream of the homeodomain that we called HR3. This region, absent in the MEIS subfamily, lacks lysine residues in PREP1 and therefore is not found among the cross-linked hybrid peptides (Table 1). However, it is close to K268 which was found cross-linked to PBC-B ( Fig. 1A and Table 1). Therefore, its involvement in the interaction with PBX1 cannot be excluded by considering only the cross-linking results. www.nature.com/scientificreports/ The secondary structure of PBX1 predicts the PBC-A and PBC-B to be mainly helical; three leucine/isoleucine rich repeats highly conserved among PBX isoforms are present in the PBC-A and one in the PBC-B (Supplementary Information Figure S2C).
Identification of residues in the hydrophobic α-helical interphases of the HR1 and HR2 domains of PREP1 that are essential for dimerization with PBX1.. HR1 and HR2 in the PREP subfamily contain each a predicted helix-turn-helix motif, with hydrophobic stretches rich of leucines and isoleucines ( Fig. 2A). The leucines and isoleucines heptad repeats are 80% conserved in the MEIS subfamily. Secondary structure predictions suggest the occurrence of α-helical conformations within the HR1 and HR2 regions of PREP1 and MEIS1, represented in cartoon form in Figs. 2A and 2B. In PREP1, positions "a" of the first heptad repeat of HR1 are occupied by L63 and L70 residues, and positions "d" by L66 and K73. The whole amphipathic helix counts additional hydrophobic residues (L67, L69, F71 and F64) surrounding positions "a" and "d". We have therefore systematically mutated each of the hydrophobic residues within the heptad repeats substituting them The areas in magenta represent predicted alpha-helical regions within PREP1. The heptad repeats of HR1 and HR2 domains are represented as alpha-helical turns above the linear sequence of the protein, highlighting the hydrophobic residues pointing toward the same face of the helix. The position and sequence of the hydrophobic "HR3" stretch are indicated. Panel B: Mapping of the heptad repeats within HR1 and HR2 of MEIS1. Predicted alpha-helical regions are represented as light-blue areas above the linear sequence. Panel C: Binding of recombinant wild-type and mutants PREP1 to recombinant wild-type PBX1 measured by Time-resolved fluorescence immunoassay (TR-FIA). Single mutations analysed in the assay are indicated; HR1m is the quadruple L63A/L66A/L67A/L70A mutant; HR2m is the quadruple I122A/L125A/ L129A/L132A mutant. HR1, HR2 and "HR3" domains are mapped below the bars for clarity. The Y axis reports the binding to PBX1 of the PREP1 mutants, normalized to wild-type PREP1 taken as 1.0. The results represent the average of four independent assays, + /− the standard deviation. T-TEST was performed as described in the Methods section. Panel D ELISA determination of the affinity between N-terminal GST-fused wild-type PREP1 (black line), GST-fused wild-type MEIS1 (blue line), GST-fused HR1m (L63A/L66A/L67A/L70A, red squares) and HR2m (I122A/L125A/L129A/L132A, green triangles) PREP1 mutants. Excess proteins were added to plates coated with increasing concentrations of wild-type PBX1. The values on the Y axis are the signals obtained with the individual proteins subtracted of the GST used as control. Results represent the average of three independent assays + /− the standard deviation.
In HR2, positions "a" and "d" within the L117-L132 heptad, are occupied by the hydrophobic residues I122 and L129 in position "a" and L125 and L132 in position "d". We thus mutated these and other hydrophobic residues in positions "b" and "c" (L117A, V124A and L130A) into alanine. Among all the mutants, L129A and L132A which both sit in the second turn of the helix, most affected PBX1 binding; in particular, L132A reduced the binding to 40%.
We also mutated four hydrophobic residues within the HR3 domain, but none of them affected significantly the binding of PREP1 to PBX1 (Fig. 2C).
To evaluate the effect of mutating the entire set of residues in HR1 and in HR2 that reside on the same side of the α-helix, we generated the quadruple mutants HR1m and HR2m bearing the L63A/L66A/L67A/L70A and I122A/L125A/L129A/L132A mutations, respectively. As shown in Fig. 2C, HR1m bound PBX1 with the same efficiency as the single mutants, while binding of HR2m was further reduced to 30%. Overall, the data indicate that both HR domains of PREP1 are strongly involved in the recognition of PBX1. However, it appears that residues I122, L125, L129 and L132 of HR2, likely arrange to form a large and continuous hydrophobic patch that covers two helix turns, playing a relevant role for the interaction.
On the basis of these observations, we next measured the affinity for PBX1 of recombinant wild-type PREP1 and mutants HR1m and HR2m. For comparison, also the affinity of wild-type MEIS1 was determined under the same experimental conditions. Affinity was measured by dose-dependent ELISAs, by coating the different proteins on plate wells and probing the interaction with soluble PBX1. As shown in Fig. 2D, soluble PBX1 bound both PREP1 and MEIS1 although with different affinities: the K D of PBX1 for PREP1 was estimated to be 18 ± 4 nM, the K D of MEIS1 was almost fivefold higher (80 ± 17 nM), in agreement with the results of XLMS that indicate two major sites of interaction for PREP1 and only one for MEIS1. Importantly, in line with the mutagenesis data, PREP1 HR1m and HR2m mutants displayed no measurable binding for PBX1 under the experimental conditions used here.
We used CD spectroscopy on the wild-type protein and on the HR1m and HR2m mutants to test whether the mutations had grossly modified the global protein fold. Differences were observed in the CD spectra of HR1m and HR2m compared to the wild-type protein (Supplementary Information Figure S3D); however, the spectra were changed only in terms of intensity and not in terms of intrinsic CD signals, indicating that the secondary structure of the proteins was globally maintained. In fact, the CD spectrum of PREP1 wild-type showed the typical signatures of α-helix-rich structures with two minima centred at ~ 208 and ~ 222 nm, and a maximum at ~ 195 nm, in agreement with data reported in the literature 48 . Similarly, the conformational analysis of HR1m and HR2m quadruple mutants of PREP1 indicated a persistence of the α-helical structures with minima at ~ 208 and ~ 222 nm, and a positive peak at ~ 195 nm. Data thus show that the conformation of the three proteins, was globally preserved also in the presence of four mutations, in agreement with previous studies showing that PREP1 is endowed with a remarkable structural stability 48 and possesses a high tolerance to conformational changes.
In conclusion, the CD data allow to conclude that the deficiency in PBX-interaction of PREP1 mutants is due to the substitution of the hydrophobic residues and not to an overall unfolding effect.
The contribution of the PBC domain of PBX1 in the interaction with PREP1. It has been previously reported that partial deletions within PBC-A of PBX1 affect the binding to PREP or MEIS 2,49 . PBX1 contains four leucine/isoleucine rich-stretches arranged in heptad repeats, three in PBC-A and one in PBC-B (Fig. 3A). We have generated deletion mutants lacking each entire hydrophobic stretch together with a number of single and double substitution mutants, and tested them for binding to PREP1 by TR-FIA. The region I43-D55 contains two heptad repeats, with the "a" and "d" positions of the amphipatic helix occupied by isoleucines (I43, I53) that create a hydrophobic interface with L46, I50. Positions "c", "f " and "g" are occupied by polar residues (D45, Q48, Q49, T52, D55, and Q56), while in position "e" a leucine residue (L47) is present, highly conserved within the PBX family (Supplementary Information Figure S2C). Binding of PBX1Δ43-55 to PREP1 and MEIS1 was decreased to 80% and 60%, of wild-type, respectively ( Fig. 3B and 3C). The second stretch (L77-L90) has a single heptad repeat, with L77 in position "a" and V80 in position "d" or, alternatively, with L81 in position "a" and I84 in position "d". In fact, the whole stretch is rich in hydrophobic residues (L77, F78, V80, L81, I84, V89, L90; see Supplementary Information Figure S2C) and might constitute an apolar interface for the interaction with PREP1/MEIS1. Indeed, deletion Δ77−84 reduced the binding to both PREP1 and MEIS1 to 35%. In the third stretch, the leucine residues in this region (L105, L108, L112) occupied all the "a" or "d" positions of the helix (see Fig. 3A). Δ105-113 deletion mutant was able to decrease the binding to PREP1 to 75% but had no effect on MEIS1. This is consistent with the XLMS data, where we could not observe any PBX1 peptide from the 105-113 stretch cross-linking PREP1 or MEIS1. In PBC-B, a single hydrophobic heptad repeat is present with V201 in position "a" and I197 and I204 in position "d". The PBX1Δ197-204 deletion (Fig. 3B) reduced to about 65% the binding of PREP1 and to 45% that of MEIS1. Figure 3D shows the effect of single substitutions in PBX1. In the 43-53 stretch, substitution of single (and some double) hydrophobic isoleucine residues gave variable results but all mutants bound in the range of the wild-type. Thus, hydrophobic residues within this stretch appear to have a partial, if any, role in the interaction with PREP1, in agreement with the results on the deletions (Fig. 3B). In the second stretch, single mutations of most hydrophobic residues (V80, L81 and I84) suppressed significantly the binding to PREP1 to 40-60%;
In PBC-B, substitution of the single isoleucine residues (I197A, I204A or I211A) reduced the binding to PREP1 only moderately (to about 65%). Substitution of both 204 and 211 residues had a stronger effect (reduction to about 50%).
In conclusion, the interaction of PBX1 with PREP1/MEIS1 occurs not only through PBC-A (43-55 and 77-84 stretches), but also through PBC-B (residues 197-204). However, we can hypothesize that the most important interaction of the PBC-B region 197-207 occurs with the homeodomain of PREP1 and MEIS1 that in MEIS1 is invisible by XLMS due to the intrinsic limit of the technique. The Y axis reports the binding to PREP1 of the PBX1 deletion mutants, normalized for the binding of wild-type PBX1 taken as 1. The results represent the average of three independent assays, + /− the standard deviation. The extent of deletions is indicated below the ordinates bar. Panel C: TR-FIA-measured binding of wild-type and mutant PBX1 to wild-type MEIS1. The Y axis reports the binding to MEIS1 of the PBX1 deletion mutants, normalized for the binding of wild-type PBX1 taken as 1. The results represent the average of three independent assays, + /− standard deviation. Panel D: TR-FIA-measured binding of wild-type and point mutants of PBX1 to wild-type PREP1. Single and multiple mutations analysed are indicated below the bars. PBC-A and PBC-B are mapped below the ordinates bar. The Y axis reports the binding to PREP1 of the PBX1 mutants, normalized for the binding of wild-type PBX1 taken as 1. The results represent the average of four independent assays, + /− the standard deviation. T-TEST was performed as described in the Methods section.

Scientific RepoRtS
| (2020) 10:16809 | https://doi.org/10.1038/s41598-020-74032-w www.nature.com/scientificreports/ Structural data obtained by CD spectroscopy on these proteins again show that mutations do not alter the overall protein fold and that the reduced affinity derives mostly from side chain replacements and not from impairment of the global protein folding (Supplementary Information Figure S3).
The hydrophobic residues of both HR1 and HR2 domains of PREP1 are essential to drive PREP1 to the nucleus. Dimerization with PBX1 allows PREP1 and MEIS1 to reach the nucleus and suppresses PBX1 nuclear export through its nuclear export signal 50 . PREP1 has no nuclear localization signal, and no nuclear localization signal has been reported for MEIS, except the homology with Drosophila Hth 51,52 . We have therefore tested the ability of PREP1 quadruple mutants to reach the nucleus. We infected lung carcinoma A549 cells, which express low levels of PREP1 3 , with a retroviral vector overexpressing GFP-tagged wild-type or PREP1mutants, and created stably expressing lines. We compared wild-type PREP1-GFP, the previously tested PREP1-GFP ∆HR1 + 2 and PREP1-GFP ∆HD, to the PREP1-GFP HR1m and HR2m mutants described above. All PREP1-GFP-tagged constructs were successfully expressed in A549 cells as judged by immunoblotting (Supplementary Information Figure S4A) and flow-cytometry (Supplementary Information Figure S5). Immunoblot of total lysates of A549 cells overexpressing PREP1 ∆HR1 + 2, HR1m and HR2m showed a lower level of PBX1 (Supplementary Information Figure S5B), with respect to cells overexpressing wild-type PREP1 or PREP1∆HD, in line with PBX1 stabilization by association with PREP1 53 . Therefore, we tested the interaction of the PREP1 mutants with PBX1, by measuring the amount of PREP1-bound PBX1 by GFP pull-down and PBX1 immunoblotting (Fig. 4B). As the PBX1 expression levels are uneven in cells expressing the different PREP1 constructs, before pull-down we incubated the cell lysates with 1 µg of purified recombinant human PBX1. As expected on the basis of previous observations 4 , PREP1 wild-type and PREP1 ∆HD efficiently pulled-down recombinant PBX1 (respectively lanes 7 and 12 of Fig. 4B), while PREP1 ∆HR1 + 2 and the two quadruple point mutants HR1m and HR2m did not (lanes 8, 9 and 10 of Fig. 4B).
Next, we used fluorescence microscopy to determine the subcellular localization of the transfected PREP1-GFP, wild-type and mutants (Fig. 4C). As expected, fluorescence localized almost uniquely to the nucleus with wild-type PREP1, whereas with ΔHR1 + HR2, HR1m or HR2m it was largely found in the cytoplasm. Normalized values of the nuclear/cytoplasmic ratios of PREP1-GFP is shown in Fig. 4D.
Therefore, we conclude that residues L63, L66, L67 and L70 in HR1, and I122, L125, L129 and L132 in HR2 are required to drive PREP1 to the nucleus and therefore must be the main contact points between PREP1 and PBX1 as their replacement with alanine has the same or greater functional effects as the removal of the entire domains.
To further confirm this conclusion, we used proximity ligation assays (PLA). This assay allows the in situ detection of target molecules which are in close proximity (< 40 nm) to each other. The method consists in a first step in which single-stranded oligonucleotides are conjugated to specific antibodies to recognize the target proteins; subsequently the signal is amplified by PCR and then DNA is hybridized with complementary oligonucleotides labeled with a fluorophore. Therefore if the target molecules are in sufficient proximity to each other, the DNA signal is amplified and PLA will produce a robust, specific, and visible fluorescence signal 54 . In our PLA set-up, the contiguity of the anti-GFP and anti-PBX1 antibodies becomes visible as red foci. Quantification of the number of nuclear PLA foci showed their almost complete absence in cells transfected with PREP1-GFP mutants HR1m and HR2m (comparable to background PLA foci, obtained with expression of GFP-alone, Fig. 5A) as opposed to their presence in the case of the wild-type PREP1 and of the ΔHD mutant. Specific examples of immunofluorescence of A549 cells transfected with wild-type PREP1 and the various PREP1 mutants are shown in Fig. 5. PREP1∆HD, which only lacks the homeodomain, retains the ability to form nuclear PLA foci with PBX1 indicating an efficient association between the two proteins ( Fig. 5B and G).
Altogether, these data corroborate and strengthen the view that the hydrophobic residues substituted in HR1m and HR2m mutants are required for the binding to PBX1.

Discussion
The results presented in this work confirm the overall type of interaction of PBX1 with PREP1 or MEIS1. Furthermore, the present data provide novel information on the three-dimensional organization of the domains in the heterocomplexes, highlight the relative importance of individual domains and residues involved in the interactions and suggest protein areas that might be targeted to specifically inhibit MEIS1 or PREP1 heterodimerization with PBX1.
Suggestive spatial organization. Chemical cross-linking analysis strengthens the requirement of the hydrophobic regions of PREP1 and MEIS1 in the interaction with PBX1 37,55 . In addition, it provides an interesting, although very superficial, information on how the individual domains must be organized in space in the heterodimeric molecules (see cartoons, Figure S1C and D). For example, HR1 and HR2 of PREP1 must be located in proximity of PBC-A and PBC-B, respectively; in addition, the homeodomains of PREP1 and PBX1 must be close in space to HR2 and PBC-B (Fig. 1). Indeed, XLMS shows that K65 and K87 (PBC-A) and K195 (PBC-B) are intramolecularly cross-linked with the K242 (homeodomain); likewise, in PREP1, HR2 is intramolecularly cross-linked with its own homeodomain through K134 and K268 (Fig. 2). On the contrary, intramolecular crosslinks of MEIS1 homeodomain to its HR1 and HR2 are not evident from our data.

Relative importance of the individual domains.
Our cross-linking data underscore a functional difference between MEIS1 and PREP1 HR1 domains; in the interaction with PBX1, HR1 appears to be less involved in MEIS1 than in PREP1. The lesser involvement of HR1 in MEIS1 may justify the difference in affinity for PBX1 between PREP1 (18 ± 4 nM) and MEIS1 (80 ± 17 nM).

Scientific RepoRtS
| (2020) 10:16809 | https://doi.org/10.1038/s41598-020-74032-w www.nature.com/scientificreports/ The experiments presented allow to identify the important residues involved in the binding of PREP1 to PBX1, as shown by the mutations HR1m and HR2m. We have excluded by CD analysis that the strong effect we observe in quadruple mutant of the HR2 of PREP1 is due to a grossly altered conformational change. In addition, mutational analysis of PREP1 excludes any contribution of the "HR3" domain (stretch 242-252) in the binding to PBX1.
Experimental evidences in the literature demonstrate a contribution only of PBC-A in the PREP1 and MEIS1 interactions. Even though the present work is mostly focused on PREP1, we also derive information demonstrating for the first time a direct involvement of PBC-B in the interaction. Indeed, XLMS underlines a solid connection between PBC-B and the HR2 domains of both PREP1 and MEIS1 through K153 of PBX1 (Figs. 1A  Figure S4D) stained with different antibodies. The top part shows the anti-vinculin staining, the middle part the anti-GFP and the bottom part the anti-PBX1 staining, as indicated. In all cases, lanes 1-6 are the pull-down inputs; lanes 7-12 the pull-down results. All lysates were supplemented with a total of 1 µg of purified recombinant PBX1. Lanes 1 and 7: lysates of PREP1-GFP wild-type expressing cells; lanes 2 and 8, lysates of PREP1-GFP L63A/L66A/ L67A/L70A expressing cells (HR1m); lanes 3 and 9 lysates of PREP1-GFP I122A/L125A/L129A/L132A (HR2m) expressing cells; lanes 4 and 10 lysates of PREP1-GFP ΔHR1 + 2 expressing cells; lanes 5 and 11 lysates of cells expressing GFP alone; lanes 6 and 12 lysates of PREP1-GFP ΔHD expressing cells. Panel C: A549 cells were transfected with a plasmid encoding wild-type, or mutants (HR1m, HR2m, ΔHR1 + HR2, or ΔHD) PREP1-GFP. The figure shows illustrative confocal immunofluorescence microscopy images in which the nucleus is identified by DAPI fluorescence and PREP1 by GFP immunofluorescence. The transfected constructs are indicated on the left. PREP1-GFP is in green, DAPI in blue, phalloidin in red. The merged DAPI, GFP and red channels are shown on the right. PREP1-GFP mutants reveal a cytoplasmic localization, compared to the exclusively nuclear localization of PREP1-GFP wild-type. Panel D: Bar graphs showing the intracellular distribution of wild-type and mutants PREP1-GFP. Plotted is the normalized nuclear/cytoplasmic GFP quantification, using a minimum 32 individual cells for each construct.The number of cells quantified is reported at the bottom of each histogram bar. T-TEST was performed as described in the Methods section.  www.nature.com/scientificreports/ In conclusion, based on the evidences obtained by multiple methods including XLMS (Fig. 1), mutagenesis and binding studies (Figs. 2 and 3), proximity ligation assays (Fig. 5) and immunofluorescence analysis of the subcellular localization of mutant proteins (Fig. 4), we have therefore: 1) identified two regions in PREP1 whose mutation prevents the interaction with PBX1, and through mutagenesis we have underlined which specific residues are involved; 2) outlined for the first time a functional difference between the HR1 domains of MEIS1 and PREP1; 3) highlighted a prevalent role played by PBC-B in the binding to MEIS1. All these findings might be potentially exploited to pharmacologically target and inhibit MEIS1-PBX1 complex.
Possible applications in drug discovery. A more detailed knowledge of the PREP-PBX and MEIS-PBX interaction surfaces opens up the possibility of interesting applications in drug discovery. The formation of PREP1-PBX1 and MEIS1-PBX1 dimers occurs in the cytoplasm and allows their transport to the nucleus where they can bind DNA to exert their physiological role. As PREP1 has no nuclear localization signals 27 , the interaction with PBX1 is essential to reach the nucleus. The same appears to be true for MEIS1. Therefore, disruption of the interaction with PBX1 can block PREP1 and MEIS1 nuclear translocation, and consequently their function. This can be exploited in therapy, for example in the Mixed Lineage Leukemia in which the PBX-MEIS complex appears to have a very important role, together with HOXA9 21,22,56 .
ChIPseq studies in the mouse embryo and cell lines have previously shown that the binding specificity of PBX1 depends on the nature of the interactor, i.e. MEIS1 or PREP1 7 . One of the most important and frequent partners of PBX and MEIS are the HOX proteins, that are essential in determining cell identity but also very important in leukaemia, in particular the HOX paralog group 9. MEIS-PBX and PREP-PBX complexes participate in trimeric DNA-binding complexes with anterior members of the HOX group 37,55,[57][58][59] , because PBX and HOX interact at the level of the homeodomains 60,61 . Since MEIS1 or PREP1 interact with PBX1 at the amino-terminal side 37,55,62-64 , a ternary complex can be formed. Indeed, ChIPseq has shown that a large number of MEIS1 DNAbinding sites coincides with those of HOXC9 while this association is much less for PREP1 7 . Therefore, interfering with the formation of the MEIS-PBX complexes might also affect the activity of the HOX paralog group 9 proteins that are involved in cancer and leukaemia. Furthermore, the role of HOX proteins in increasing the oncogenic effect of MEIS1 is likely to also require PBX as the latter is essential in HOX proteins DNA-binding specificity 65 . Therefore, the disruption or the prevention of MEIS-PBX dimers and of MEIS-HOX-PBX trimers must be considered a therapeutic goal. To date, the only effective inhibitors of HOX-PBX binding are the HXR9 peptide and its derivatives, originally derived from the HOXA9-PBX1 interacting peptide 66 . HXR9 was shown to inhibit the growth of a range of tumours in mouse xenograft models, including non-small cell lung, breast, ovarian, and prostate cancer, mesothelioma, melanoma, and meningioma [66][67][68][69][70][71][72] . HXR9 is supposed to act by preventing the binding of HOX-PBX complexes to the DNA. This type of action requires that a drug must be able not only to enter the cell but also localize itself in the nucleus. Therefore, targeting the formation of MEIS-PBX complex in the cytoplasm might be more efficacious. It must be noticed that the use of the HXR9 in clinics has not been reported. Moreover, targeting the MEIS homeodomain by using small molecule derived from compound collections screenings seems effective 73 , but does not impair dimerization with PBX nor nuclear localization of MEIS.
Finally, one must consider that the stability of MEIS and PBX proteins depends on their ability to dimerize. The overall level of PBX1 is decreased in the Prep1 i/i embryos 14 . PBX2 half-life is increased in cells overexpressing PREP1 because it prevents its proteasomal degradation 53 . Likewise, in mice, PBX3 protects MEIS1 from proteasomal degradation, increases MEIS1 affinity for HOXA9 and induces endogenous Meis1 transcription 20 . Therefore, the prevention of dimerization might have the additional advantage of decreasing also the level of the monomeric proteins.
In conclusion, our work identifies for the first-time an additional potential target for preventing the formation of a MEIS-PBX dimer. As this interaction would occur in the cytoplasm, a drug affecting the heterodimer formation would not only prevent its transcriptional activity, but also leave MEIS and PBX in monomeric form in the cytoplasm, i.e. unable to reach the nucleus, and hence be directed to degradation. Such drugs might be more performing than those acting on the binding to DNA, because the inability to reach the nucleus is likely to induce PBX instability 14 and this would increase even more a possible therapeutic effect.

Materials and methods
Protein alignment and secondary structure prediction. Protein alignments were done using CLUSTAL W 74 , and secondary structure prediction using NetSurfP-2.0 75 .
Single-point mutants and multiple mutants were prepared by PCR using primers listed in Supplementary Information Tables S1 and S2. Deletion mutants were cloned using overlap PCR, as described in 76 . All constructs were verified by sequencing them in both directions.
Cell culture and retroviral infection. A549 cells were grown in DMEM (Lonza) + 10% FBS + 2 mM L-glutamine. For the retroviral transduction, Phoenix-Ampho cell line was used. Briefly, Phoenix-Ampho cells were transiently transfected with the retroviral vectors pBABE-puro described above by standard calcium-phosphate protocol. Cell supernatants containing retroviruses were filtered using 0.45 µm filters and employed to perform two rounds of infection. Two days after infection, cells were selected by addition of puromycin (1,1 µg/mL) to the culture medium. To check for the expression of the PREP1-GFP-constructs, 5 days post-infection A549 cell lines were assayed by western blot analysis of cell lysate using anti-GFP antibody (AbCam #Ab290) (Supplementary Information Figure S4A) and by flow-cytometry analysis (Supplementary Information Figure S5). Flow cytometry instrument used was FACSCalibur (Becton Dickinson), and the data were analyzed using Kaluza Analysis Software (Beckman Coulter). For PBX1 expression in western blot we used anti-PBX1 (Cell Signaling #4342) ( Supplementary Information Figures S4B).

Expression in E. coli and purification of recombinant proteins.
Expression and purification of PREP1, MEIS1 and PBX1 were carried out as described previously 47 . For ELISA assay, GST-free proteins were run on SDS-PAGE (Supplementary Information Figure S6) and then harvested and stored at − 80 °C until use. GST-free PBX1-PREP1 full-length complex was loaded into ion exchange buffer (20 mM Tris pH 7.4, 0.1 M NaCl, 10% glycerol, 0.5 mM EDTA, 0.5 mM EGTA and 1 mM dithiothreitol (DTT)) to a final NaCl concentration of 0.1 M and run on a Resource Q (GE Healthcare) anion exchange column. C-terminal truncated (PBX1 1-308 -PREP1 1-344 ) complex was run on a Resource S (GE Healthcare) cation exchange column. In both cases the proteins were eluted using a 0.1-1.0 M NaCl gradient. Proteins used for cross-linking and proteolysis analysis were further purified by size exclusion chromatography on a Superose 6 10/300 column (GE Healthcare) equilibrated in 20 mM Tris pH 7.4, 0.3 M NaCl, 5% glycerol, 0.5 mM EDTA and 1 mM DTT and eluted at a flow rate of 0.2 ml/min.

Cross-linking mass-spectrometry. Sample preparation and protein digestion. Purified PBX1-PREP1
complex at 1 mg/ml was dialyzed 2 h at room temperature in 50 mM borate-buffered saline. After dialysis, 1 mM Bis(sulfosuccinimidyl) suberate sodium salt (BS 3 ) (Sigma-Aldrich) was added. Cross-linking reaction was performed for 1 h at room temperature with shaking at 1000 rpm as described in 78 . Reaction was stopped by adding 100 mM sodium bicarbonate (final concentration) and incubated at room temperature for 10 min. Cross-linked proteins were separated by SDS-PAGE electrophoresis and stained by coomassie. Gel slices were excised from the gel, reduced with 10 mM DTT, alkylated with 55 mM Iodoacetamide (IAA) and finally digested overnight with Glu-C and then with Trypsin 79 , before peptides were desalted on StageTip C18 80 .
nLC-ESI-MS/MS analysis. Samples were analyzed as described in 81 with few exceptions. In particular, 4 µL of peptide mixture from each sample were analyzed on a nLC-ESI-MS/MS quadrupole Orbitrap QExactive-HF mass spectrometer (Thermo Fisher Scientific). Peptides separation was achieved on a linear gradient from 95% solvent A (2% ACN, 0.1% formic acid) to 60% solvent B (80% acetonitrile, 0.1% formic acid) over 48 min, and from 60 to 100% solvent B in 2 min at flow rate of 0.25 µL/min on a UHPLC Easy-nLC 1000 (Thermo Fischer Scientific) connected to a 23 cm fused-silica emitter of 75 µm inner diameter (New Objective, Inc. Woburn, MA, USA), packed in-house with ReproSil-Pur C18-AQ 1.9 µm beads (Dr. Maisch GmbH, Ammerbuch, Germany). MS data were acquired using a data-dependent top 15 method for HCD fragmentation. Survey full scan MS spectra (300-1800 Th) were acquired in the Orbitrap with 60,000 resolution, AGC target 3 e6 , IT 20 ms. For HCD spectra, resolution was set to 15,000 at m/z 200, AGC target 1 e5 , IT 160 ms; NCE 28% and isolation width 1.4 m/z. Data analysis. For identification, raw data were processed with pLink 2 ver. 2.3 82 searching against the database containing the sequences of PREP1 1-344 -PBX 1-308 , and MEIS1 1-390 -PBX1 1-308 ; Glu-C and trypsin specificity was indicated and up to five missed cleavages were allowed. BS 3 was selected as cross linker, cysteine carbamidomethyl was used as fixed modification and methionine oxidation as variable. Mass deviation for MS/MS peaks was set at 20 ppm. Filter tolerance was set at 10 ppm, while the FDR was set to 0.05, corresponding to an E-value of 0.001. For manually curated analysis only those spectra with at least 4 consecutive b or y ions for each crosslinked peptide were considered as confidently identified 83 . The full-list of cross-linked peptides and relatives E-values are available in supplementary materials.
In manually curated analysis, only those spectra with E-value < 0.001 (corresponding to FDR < 0.05) were considered confidently identified; moreover, we required the majority of the ions of the spectrum assigned and 4 consecutive b or y ions for each cross-linked peptide observed 83  to the NHS-LC-biotin (Thermo-Scientific). The values reported represent the means of four independent experiments each conducted in triplicate. Means and standard deviations were calculated in Excel, which was also used to plot the graphs (we applied STDEV.P function, used in calculating the standard deviation for an entire population). The data are reported as normalized to wild-type protein binding (taken as 1). Type 2 (homoscedastic)T-TEST was performed using one-tail function, to determine if there was a significant difference between each mutant binding datasets against the wild-types binding datasets. We assigned * for values below 0.05 (90% significance), ** for values below 0.025 (95% significance), *** for values below 0.01 (significance 98%), **** for values below 0.005 (99% significance), and ***** for values below 0.001 (> 99% significance).
In situ proximity ligation assay (PLA). For in situ proximity ligation assay (PLA, Duolink, Sigma), cells were labeled according to the manufacturer's instructions, and as described in 54 . Briefly, A549 cells expressing PREP1-GFP wild-type or PREP1-GFP mutants were fixed in PFA for 10 min at room temperature. After incubation with primary antibodies (Anti-PBX1 (Cell Signaling #4342), Anti-GFP (AbCam #Ab290)), appropriate PLA probes (secondary antibodies conjugated with oligonucleotides) were added to the samples. Ligation of the oligonucleotide probes that were in proximity (less than 40 nm) was then performed, after which fluorescently labeled oligonucleotides were added together with a DNA polymerase to generate a signal detectable by a fluorescence microscope. In order to find the number of foci inside and outside nuclei, a custom Fiji 84 plugin was developed. The plugin identifies foci using the ImageJ's Find Maxima 85 algorithm on the red channel image after background removal (using the rolling ball algorithm 86 ). The nuclei are recognized on the DAPI channel image, enhanced with a gaussian blur filter, applying the Huang's threshold method. Then, the plugin counts the foci Scientific RepoRtS | (2020) 10:16809 | https://doi.org/10.1038/s41598-020-74032-w www.nature.com/scientificreports/ inside and outside the nuclei. The values reported represent the count of two independent experiments and for each condition we quantified 125-175 cells.

Confocal microscopy acquisition and images analysis.
In order to evaluate the nuclear localization of GFP-tagged PREP1 a Fiji 84 plugin was developed. The plugin finds the nuclei on the DAPI channel using the Otsu's thresholding algorithm 87 , after background removal 88 and the application of a gaussian filter. Then, for each nucleus found the mean intensity of the GFP signal is extracted after background removal. For each cell, the cytoplasmic mean intensity is extracted from a drawn by hand region. The plugin reports the ratio between the nuclear and the cytoplasmic signal for each cell. To mark the cell perimeter, we stained the cytoskeleton using Alexa Fluor 647 phalloidin (Thermo Fisher Scientific). Type 2 (homoscedastic) T-TEST was performed using one-tail function, to determine if there was a significant difference between each mutant binding datasets against the wild-types binding datasets. We assigned * for values below 0.05 (90% significance), ** for values below 0.025 (95% significance), *** for values below 0.01 (significance 98%), **** for values below 0.005 (99% significance), and ***** for values below 0.001 (> 99% significance).
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creat iveco mmons .org/licen ses/by/4.0/.