Transcription factors (TFs) control cell fates by precisely orchestrating gene expression. However, how individual TFs promote transcriptional diversity remains unclear. Here, we use the Hox TF Ultrabithorax (Ubx) as a model to explore how a single TF specifies multiple cell types. Using proximity-dependent Biotin IDentification in Drosophila, we identify Ubx interactomes in three embryonic tissues. We find that Ubx interacts with largely non-overlapping sets of proteins, few of which have tissue-specific RNA expression. Instead, most interactors are active in many cell types, controlling gene expression from chromatin regulation to the initiation of translation. Genetic interaction assays in vivo confirm that they act in a strictly lineage- and process-specific manner. Thus, functional specificity of Ubx seems to play out at several regulatory levels and to result from the controlled restriction of the interaction potential by the cellular environment. This finding challenges long-standing assumptions, such as differential RNA expression as a determinant of protein complexes.
The development of living organisms is the result of a fine-tuned spatial and temporal expression of genes, which is driven by transcription factors (TFs). Many TFs are expressed in several cell types, and control different transcriptional programs depending on the cell context1,2,3,4. However, how multi-lineage TFs can function in such a specific manner in different environments remains elusive. Most efforts to understand the function and specificity of TFs have so far focused on their interaction with regulatory proteins at cis-regulatory modules, so-called enhancers and promoters5,6,7,8. However, TFs do not only interact with other TFs but also with a variety of proteins including chromatin-associated proteins, histone modifiers, factors of the general transcriptional machinery or mRNA regulatory proteins9,10,11,12,13. Hence, it is thought that TFs promote cell type diversity by assembling protein interaction networks consisting of different types of proteins in a cell-type-specific manner6,14,15. However, as suitable approaches have been unavailable so far, this assumption still awaits confirmation.
One prominent example of broadly expressed TFs is the conserved class of Hox proteins, which are active in many embryonic and adult tissues along the anterior-posterior (A/P) axis of animals16. Although Hox TFs recognize similar DNA sequences in vitro due to a highly conserved DNA-binding domain, the homeodomain (HD)17, they control gene expression programs in a highly context-dependent manner in vivo via the interaction with other proteins2,18,19. In particular, the interaction with the three-amino acids loop extension (TALE) family of HD-containing TFs has been extensively studied, which includes the Drosophila Extradenticle (Exd) and the vertebrate Pbx1-4 proteins20. These proteins cooperatively bind DNA with Hox TFs thereby increasing their regulatory specificity20,21,22,23. Hox-TALE interactions are mostly mediated via a short hexapeptide (HX) motif, which lies upstream of the Hox HD24, and alternatively via the UbdA domain, a protein motif found downstream of the HD in the two Hox TFs Ultrabithorax (Ubx) and Abdominal-A (Abd-A)25,26. Although TALE TFs are important for Hox function, they can only partially explain how Hox TFs can function in a context-specific manner in vivo, in particular as they are expressed in many different cell types themselves27. Thus, Hox proteins are an ideal model to tackle the question of how TFs orchestrate precise transcriptional programs in different cellular contexts.
In order to reveal the regulatory complexes that drive the multi-faceted outputs of TFs, unbiased methods are required to identify stable and transient TF interaction networks in vivo. Proximity-labelling of proteins coupled with mass spectrometry (MS) offers a systematic analysis of spatially restricted proteomes, providing a comprehensive understanding of cellular functions in different contexts28,29,30,31,32. The two most prominent proximity-labelling methods are Ascorbate peroxidase proximity labelling (APEX) and proximity-dependent biotin identification (BioID), which are both based on biotinylation of adjacent proteins followed by affinity-based purification29,32,33. Thus, these two methods allow neighbouring proteins to be captured and identified in the context of a living cell. In contrast to APEX, BioID, whose activity depends on biotin, does not alter cell physiology29,34. In this system, the close-proximity biotinylation is driven by a mutant version of the biotin-ligase BirA originating from Escherichia coli. This mutant version, called BirA* (R118G), converts biotin into the reactive compound 5′-bioAMP but loses its affinity for this compound. BioAMP is then released and biotinylates proteins on lysine residues within a 10 nm range29,34,35. BioID has been applied in multiple systems ranging from cell culture to tumour xenografts in mice29,36,37.
Here, we combine BioID with the GAL4-UAS system38, which permits the expression of the BirA* fusion protein in the cell type of choice and allows the capture of lineage-specific interactomes. We use the Hox TF Ubx as a model, as it specifies distinct developmental programs in different tissue types in a stage-dependent manner2. For our comparative analysis of Ubx interactomes, we focus on the mesodermal, neural and neuroectodermal lineages. Our results demonstrate that targeted BioID is highly efficient in isolating lineage-specific Ubx partners at the subcellular level in vivo, and reveal that Ubx interactomes in the different lineages are largely non-overlapping. Interestingly, we find that Ubx interacts mostly in a lineage-specific manner with ubiquitously expressed proteins involved in general transcriptional regulation, like chromatin remodelling proteins or RNA processing factors, and only with a few lineage-restricted factors. Even more importantly, our genetic interaction analyses reveal that, in vivo, the identified interactions act lineage- and process-specifically. This demonstrates that functional specificity of Ubx is realized at multiple regulatory levels and is not only a consequence of different Ubx-protein combinations recognizing distinct sequence codes written in enhancers and promoters. Thus, TFs seem to act as versatile protein platforms, which function beyond the cis-regulatory level to ensure robust yet flexible gene expression programs critical for the development and maintenance of cell and tissue types.
Design and validation of BioID in a Drosophila cell system
To identify lineage-specific interaction partners of the Hox TF Ubx in vivo, we combined BioID with the GAL4-UAS system38. To this end, we fused the N-terminal part of Ubx (isoform Ia) to UAS-myc-BirA* (mB*UbxWT) (see Methods) (Fig. 1a). In addition, we also generated a fusion of BirA* and Ubx containing a single mutation (N51A) in the DNA-binding domain, the homeodomain (mB*UbxN51A). This mutation prevents the recognition and binding of Ubx to DNA, which we confirmed by electrophoretic mobility shift assay (EMSA) (Supplementary Fig. 1a). We reasoned that a comparison of UbxWT and UbxN51A interactomes would allow the discrimination of interactions important for TF binding to the chromatin from interactions established in the nucleoplasm (Fig. 1b). As a general control, BirA* was fused to GFP and a nuclear localisation sequence (mB*nlsGFP). In order to verify the suitability of BioID for identifying Ubx interaction partners, we tested the system in Drosophila S2R+ cells (see Supplementary Note 1, Supplementary Fig. 1).
Taken together, these results demonstrated that BioID is an efficient and specific method to purify interaction partners of TFs in a Drosophila cell-based system.
Establishment of targeted BioID in Drosophila embryos
Having confirmed the efficiency of BirA*Ubx fusion proteins in biotinylating close-proximity proteins in cells, we next tested the technique in embryos and generated transgenic flies carrying the mB*UbxWT, mB*UbxN51A and mB*nlsGFP fusions. First, we verified the functionality of the proteins in living animals by analysing the well-described homeotic transformation induced by aberrant Hox expression39, using the transformation of segmental denticle belt patterns in first instar larvae as a read-out. In line with previous reports1, ubiquitous expression of wild-type Ubx (mB*UbxWT), but not of mB*UbxN51A, induced a switch of thoracic segment identity towards the identity of abdominal segments (Fig. 1d). These results verified that the mB* fusion proteins are functional in Drosophila.
To resolve cell type-specific Ubx interaction networks, we selected the mesodermal and neural tissues due to the well-described function of Ubx in both lineages2,40. Using the pan-mesodermal driver twist-GAL4 (twi-GAL4) and the pan-neural driver elav-GAL4, we expressed the mB* fusion proteins in stage 10–13 embryos (5–8 h after egg lay, AEL) (Fig. 1b). We selected this time frame as Ubx is normally expressed and active in these tissues during these stages2. To control for any discrepancies in lineage-specific timing, we also mapped the Ubx interactome in the early nervous system (stage 9–11 embryos, 2.5–5 h AEL) using the neuroectodermal driver scabrous-GAL4 (sca-GAL4) (Fig. 1b).
We first evaluated the tissue-specific expression of the mB* fusion proteins and their activity by immunofluorescence. This analysis revealed a robust and specific expression and biotinylation efficiency of the BirA* fusion proteins (Fig. 1c, Supplementary Figs. 2b–d, 3a, b). In contrast, we did not detect any biotinylation in wild-type embryos (Supplementary Fig. 2a). These results demonstrated that the yeast-rich food diet used for the experiments was sufficient for BirA* dependent protein biotinylation in Drosophila embryos, rendering biotin supplementation unnecessary in vivo. Detailed analysis of BirA* fusion protein expression and biotinylation confirmed the specificity of the system, as both BirA* expression and biotinylation were exclusively detected in the lineage and at the time-points controlled by the different drivers (Fig. 1c, Supplementary Figs. 2b–d). Finally, western blot analysis revealed an efficient streptavidin affinity purification of biotinylated proteins using nuclear extracts from twi>mB*UbxWT, twi>mB*UbxN51A and twi>mB*nlsGFP embryos (Fig. 1e).
In sum, these results showed that the targeted BioID method is efficient and highly specific in embryos and thus ideally suited to study spatiotemporal interactomes of Ubx.
Exploring targeted BioID in Drosophila embryos
We subsequently performed mass spectrometry analysis using the streptavidin affinity purified fraction of nuclear extracts from embryos expressing the BirA* fusion proteins (mB*UbxWT, mB*UbxN51A and mB*nlsGFP) under the control of the twi-, elav- and sca-GAL4 drivers. Independent biological replicates were highly similar for both the neural and mesodermal BioID (Pearson correlation, n = 4; twi-BioID r > 0.7; elav-BioID r > 0.85) (Fig. 1f, Supplementary Figs. 3c, e). In contrast, replicates of the neuroectodermal BioID were more variable (r > 0.58, Supplementary Fig. 3d), which may be a consequence of the broad activity of the sca-GAL4 driver in a mixed cell population consisting of ectodermal and neural progenitor cells41. The choice of GAL4 driver also influenced the number of proteins detected by BioID. For example, the total number of proteins quantified was between 142 and 244 for the mesoderm (Fig. 1f, Supplementary Fig. 3e), between 70 and 131 for the neural system and between 242 and 593 for the neuroectoderm (Supplementary Fig. 3c, d). This discrepancy is likely due to the different activities of the elav- and twi-GAL4 drivers2, resulting in a shorter biotinylation period in the elav-BioID sample (Supplementary Fig. 2b, c), while the sca-GAL4 driver targets more cells than the twi- and elav-GAL4 drivers (Supplementary Figs. 2b, d, 3b), allowing more proteins to be biotinylated.
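The replicate comparison reported here can be illustrated with a minimal sketch, assuming each replicate is represented as a list of intensities for the same ordered set of proteins. The function names and the toy values are hypothetical and are not taken from the study; the actual quantification pipeline is described in the Methods and is not reproduced here.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equally long value lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def min_pairwise_r(replicates):
    """Lowest Pearson r over all pairs of replicates.

    `replicates` maps a replicate name to its intensity list; all lists
    must follow the same protein order (an assumption of this sketch).
    """
    names = sorted(replicates)
    return min(
        pearson(replicates[a], replicates[b])
        for a, b in combinations(names, 2)
    )


# Toy values, invented for illustration only.
replicates = {
    "rep1": [1.0, 2.0, 3.0, 4.0],
    "rep2": [1.1, 1.9, 3.2, 3.9],
    "rep3": [0.9, 2.2, 2.8, 4.1],
}
lowest_r = min_pairwise_r(replicates)
```

Reporting the lowest pairwise coefficient corresponds to statements of the form "r > 0.7" for a set of replicates.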
In order to identify features characterizing the different Ubx BioID-interactomes, we performed principal component analysis (PCA) as well as heat map representations on all of the proteins found in UbxWT replicates from the different tissues. We specifically used the proteins of the UbxWT datasets, as they included Ubx interactions normally established in the different tissues. Both approaches grouped replicates of Ubx BioID-interactomes based on the lineage identity (Fig. 2a, Supplementary Fig. 4a), showing that the lineage context dictated the interaction partners of Ubx. We next compared the different datasets using Pearson correlation coefficient analysis. We found that the mesodermal (twi-BioID) and neural (elav-BioID) Ubx BioID-interactomes were the most similar datasets (r = 0.66 for twi-/elav-BioID), while the neuroectodermal (sca-BioID) and mesodermal Ubx BioID-interactomes showed the greatest differences (r = 0.245 for sca-/twi-BioID) (Supplementary Fig. 4b). This result highlighted once more the importance of the lineage context but also showed that Ubx interactions are dependent on the developmental stage.
In sum, targeted BioID allowed us to identify lineage- and stage-specific Ubx interactomes, which we assumed to be at the basis of Ubx’s ability to orchestrate functional diversity during development by triggering distinct and highly defined gene expression programs in a spatial and temporal manner.
Characterization of lineage-specific Ubx BioID-interactomes
We next analysed the proteins that were found in the vicinity of Ubx in the mesodermal, neural and neuroectodermal lineages. To this end, we compared proteins that were significantly enriched in the UbxWT samples by normalising them to the GFP control and selected the ones enriched in 2 out of 4 replicates (see Methods, Supplementary Data 4–33 and Supplementary Table 1). This analysis resulted in the recovery of 60 proteins specific for the mesoderm, 19 for the nervous system and 78 for the neuroectoderm (Fig. 2b). Intriguingly, the vast majority of proteins were unique to a single Ubx BioID-interactome (135/145), while only 10 were found in more than one BioID-interactome. Two of these, Ubx itself and Brahma associated protein 111kD (Bap111, Dalao), a component of the Brahma nucleosome remodelling complex, were identified as Ubx close-proximity partners in all tissues (Fig. 2b, d, Supplementary Data 34). This result raised the question whether these differences in Ubx interactomes are a consequence of the interactors being differentially expressed in the individual cell types. To test this, we analysed the expression of UbxWT close-proximity partners using lineage- and stage-specific transcriptome data2, and found that the majority of Ubx BioID partners were equally expressed in the mesoderm and nervous system (Supplementary Data 35). Only a few BioID hits showed tissue-specific expression, which included two out of 60 proteins in the mesoderm (Tinman, Tin, and Brick a brac 2, Bab2) and two out of 19 proteins in the neural system (TfAP-2 and Grainy-head, Grh). This result demonstrated that although most of the Ubx interactors were broadly expressed, Ubx was able to interact with these proteins in a highly specific manner in the different cellular contexts.
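The selection and comparison steps described in this paragraph can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the per-replicate enrichment calls are taken as given, the threshold is interpreted as "at least 2 replicates", and all function names and protein labels are hypothetical.

```python
from collections import Counter


def reproducible_hits(enriched_per_replicate, min_replicates=2):
    """Proteins called enriched in at least `min_replicates` replicates.

    `enriched_per_replicate` is a list with one collection of protein
    names per replicate. Reading the threshold as "at least" is an
    assumption of this sketch.
    """
    counts = Counter(
        protein
        for replicate in enriched_per_replicate
        for protein in set(replicate)
    )
    return {p for p, c in counts.items() if c >= min_replicates}


def sharing_summary(interactomes):
    """Return (unique, shared, total) protein counts across interactomes.

    `interactomes` maps a lineage label to its set of hits. A protein is
    'unique' if it occurs in exactly one interactome and 'shared' if it
    occurs in two or more.
    """
    membership = Counter(p for hits in interactomes.values() for p in set(hits))
    unique = sum(1 for c in membership.values() if c == 1)
    shared = sum(1 for c in membership.values() if c > 1)
    return unique, shared, len(membership)


# Placeholder protein labels, invented for illustration only.
toy_replicates = [{"protA", "protB"}, {"protA"}, {"protC"}, {"protB", "protC"}]
toy_hits = reproducible_hits(toy_replicates)

toy_interactomes = {
    "mesoderm": {"protA", "protB"},
    "neural": {"protB", "protC"},
    "neuroectoderm": {"protB"},
}
toy_summary = sharing_summary(toy_interactomes)
```

The same bookkeeping reconciles the numbers reported above: the three lists contain 60 + 19 + 78 = 157 entries, which matches 145 distinct proteins if 135 occur once, 8 occur in two lists and 2 occur in all three (135 + 8 × 2 + 2 × 3 = 157).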
As TF-TF pairs are central to achieve gene expression specificity6,42,43,44, we next asked whether TFs were the predominant class of proteins interacting with Ubx in the different tissue lineages. By clustering Ubx interactors based on their molecular function, we found that only a minor fraction encoded TFs (16% for mesodermal BioID-interactome), while the majority represented proteins controlling gene expression at other regulatory layers. Indeed, many of the lineage-specific Ubx interactors are known to control co- or post-transcriptional events like RNA processing and translation (33% for mesodermal BioID-interactome) or processes that prepare the chromatin landscape for transcription, in particular chromatin remodelling events (32% for mesodermal BioID-interactome) (Figs. 2c, 3a). Consistently, STRING-based network reconstruction performed using the mesodermal Ubx close-proximity partners as input uncovered two major inter-connected grids, one related to mRNA regulation and ribonucleoprotein functions and the other one related to chromatin regulation, which included the few TFs identified as Ubx interactors (Fig. 3a, Supplementary Fig. 5).
In order to tackle how these different functions are integrated by Ubx in the nuclear environment, we analysed the compartment in which Ubx preferred to interact with its partners. To this end, we made use of our experimental set-up and identified those proteins found in close proximity to UbxN51A (Fig. 1b), the version of Ubx unable to bind DNA (N51A/GFP). We overlapped the UbxWT and UbxN51A BioID-interactomes and defined three protein populations: proteins interacting with Ubx preferentially on the chromatin (UbxWT enriched), proteins found in close proximity to Ubx in the nucleoplasm but also on the chromatin (overlap UbxWT and UbxN51A) and proteins interacting with Ubx in the nucleoplasm only (UbxN51A enriched) (Fig. 3b, Supplementary Fig. 5, Supplementary Data 36–38). Consistently, GO term analysis revealed that proteins interacting with Ubx on the chromatin were strongly associated with chromatin-related processes, in particular ATP-dependent chromatin remodelling. In contrast, proteins of the nucleoplasm/chromatin fraction preferentially regulated general processes of transcription like transcription start site selection or transcriptional initiation and post-transcriptional events like mRNA 3′-end processing or splicing. Finally, proteins of the nucleoplasm fraction were almost exclusively associated with splicing-related functions. Lineage-specific GO terms were strongly over-represented only among the chromatin population (Fig. 3b, Supplementary Fig. 5).
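The three protein populations defined above amount to a set partition of the UbxWT and UbxN51A hit lists. A minimal sketch, assuming each interactome is available as a set of protein names; the function name, the dictionary keys and the placeholder labels are hypothetical:

```python
def partition_by_dna_binding(wt_hits, n51a_hits):
    """Split hits into the three populations described in the text.

    - 'chromatin': found only with DNA-binding-proficient Ubx (UbxWT enriched)
    - 'nucleoplasm_and_chromatin': found with both UbxWT and UbxN51A
    - 'nucleoplasm': found only with DNA-binding-deficient Ubx (UbxN51A enriched)
    """
    wt = set(wt_hits)
    n51a = set(n51a_hits)
    return {
        "chromatin": wt - n51a,
        "nucleoplasm_and_chromatin": wt & n51a,
        "nucleoplasm": n51a - wt,
    }


# Placeholder labels, invented for illustration only.
populations = partition_by_dna_binding(
    wt_hits={"protA", "protB", "protC"},
    n51a_hits={"protC", "protD"},
)
```

Each protein lands in exactly one population, so GO term enrichment can then be computed separately for each group.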
Together, these results demonstrated that Ubx interacted with different components of protein complexes regulating general aspects of gene expression in a lineage-specific manner. Thus, it seems that Ubx controls gene expression at multiple levels, and that the regulatory events happening at enhancers and promoters represent only one of the many layers conferring specificity to Hox TFs.
Comprehensive validation of Ubx BioID-interactomes
Having identified lineage-specific Ubx close-proximity partners by a proteomics-based approach, we next wanted to elucidate whether these proteins interacted with Ubx in a complex. We focused our analysis on proteins identified in the mesoderm, as we have recently characterized Ubx’s function in this tissue at the chromatin level2. We first performed co-immunoprecipitation (co-IP) of Ubx close-proximity partners in vivo. To this end, we used embryos carrying an endogenously GFP-tagged Ubx gene and studied the interaction of GFP-Ubx with BioID candidates for which antibodies were available. These included the transcriptional co-repressor C-terminal binding protein (CtBP), Combgap (Cg), a Zn finger TF binding to Polycomb response elements, the Zn finger TF Zelda (Zld), a known pioneer factor, and the mesoderm-specific TF Tinman (Tin), a master gene of cardiac development. All four proteins were precipitated in Drosophila embryos by GFP-Ubx, which was also the case for the known Ubx interactor Motif 1 binding protein (M1BP) (Fig. 3c, Supplementary Fig. 6a, Supplementary Data 39). In contrast, we could not detect an interaction between GFP-Ubx and Polycomb (Pc), which was recovered only by the sca-BioID, or Tubulin (Tub), which was not recovered by any BioID experiment (Fig. 3c). To further characterize Ubx interactions in the mesoderm, we studied the expression of Ubx, CtBP, Zld, Tin and Cg using antibody stainings, and of Brahma (Brm), the ATPase subunit of the Brahma chromatin remodelling complex, by means of a GFP fusion line in stage 10–13 embryos. As all these proteins except Tin are expressed in more than one tissue, we specifically labelled the mesoderm using the twist-INTACT transgene. Animals carrying this construct have their mesodermal nuclei biotin-labelled by the co-expressed wild-type BirA. Notably, we observed a co-localization of all five proteins with Ubx in mesodermal cells. In particular, they were co-expressed in cells of the somatic and visceral mesoderm (Figs. 3d–k, Supplementary Fig. 6b). These results demonstrated that the TFs CtBP, Cg, Zld and Tin interacted with Ubx in Drosophila embryos, and showed that BioID is efficient in capturing transient interactions between TF pairs in vivo.
Due to the restricted availability of antibodies, only a few BioID-identified Ubx close-proximity partners could be studied in vivo. To comprehensively validate the BioID-interactomes, we thus performed co-IP experiments in cellulo. To this end, we tested 17 BioID candidates identified in the mesoderm by overexpressing HA- or V5-tagged versions of these proteins together with nlsGFP, GFP-UbxWT or GFP-UbxN51A in Drosophila S2R+ cells. This list included the four interactors, CtBP, Cg, Zld and Tin, which we had already confirmed by in vivo co-IP, as well as the basic-helix-loop-helix TF Cropped (Crp), a factor important for muscle morphogenesis, Brahma (Brm), the ATPase subunit of the Brahma chromatin remodelling complex, Bicaudal (Bic), a protein involved in mRNA and protein localization, and a group of proteins with roles in mRNA processing: Splicing factor 1 (SF1) and Splicing factor 2 (SF2), Srp54, Cwc25, the small ribonucleoprotein particle U1 subunit 70 K (snRNPU1-70K), Small ribonucleoprotein particle protein (Smb), Scaffold attachment factor B (Saf-B), Bx-42, a splicing component that acts in the Notch pathway, SRm160, a protein important for pre-mRNA splicing and 3′ end formation, and Nucampholin (Ncm). Fifteen out of the 17 proteins were pulled down by GFP-UbxWT and/or GFP-UbxN51A in cellulo (Supplementary Fig. 6c–e, Supplementary Data 39), as was the known Ubx cofactor Exd (Supplementary Fig. 6f). Notably, CtBP, Tin, Zld and Cg, which interacted with Ubx preferentially on the chromatin in the BioID analysis, were pulled down more efficiently by UbxWT than by UbxN51A. In contrast, Brm and Bic were immunoprecipitated at equal levels, while the splicing-related factors SF1, Srp54, Cwc25 and SF2 were pulled down more efficiently in co-IPs overexpressing UbxN51A (Supplementary Fig. 6c, Supplementary Data 39).
Having confirmed Ubx interactions in the mesoderm, we also tested two Ubx close-proximity partners identified in the neural tissue, the TFs Grh and TfAP-2 (Supplementary Fig. 6g, h). Both proteins interacted with Ubx in co-IP experiments in cellulo, again stronger with the GFP-UbxWT protein (Supplementary Fig. 6h, Supplementary Data 39).
In sum, these experiments validated many of the close-proximity partners identified by BioID. They also revealed that, in contrast to mRNA-processing factors, TFs and chromatin remodelling proteins preferred to interact with the DNA-binding proficient version of Ubx, independently of cellular context. Finally, these experiments underlined again the importance of the cellular environment, as the high interaction potential of Ubx was restricted to only a few specific interactions in the individual lineages in vivo.
Specificity from interaction with lineage-restricted TFs
One question arising from this study is how Ubx can interact with different sets of functionally related and ubiquitously expressed proteins in diverse lineages. One possible explanation is the interaction of Ubx with lineage-restricted factors, which could adjust the action of Ubx to the cellular environment. We had identified a few Ubx interactors that were lineage-specifically expressed, and selected two TFs, Tin and Grh, which were enriched in the chromatin fraction of the mesodermal and neural Ubx BioID-interactomes to study their role in tissue development.
We first tested whether Tin and Grh bound the same chromatin regions as Ubx in the respective tissues. To this end, we compared genome-wide binding profiles of Tin45, a TF exclusively active in the mesoderm, and Grh46, a TF expressed in ectodermal and neural cells, to Ubx chromatin interactions2. This analysis uncovered 251 regions bound by Ubx and Tin in close vicinity in the mesodermal lineage and 401 regions co-bound by Ubx and Grh in the neural lineage among a large number of distinct binding events for all three TFs (Fig. 4a, Supplementary Table 2). Regions bound by Ubx and Tin in the mesoderm and Ubx and Grh in the nervous system, which occurred preferentially at promoters (Fig. 4b), were almost exclusive (95%, Fig. 4c). Importantly, the enhancer logic of the bound regions seemed to be different as well, as the motifs of Ubx and its known cofactor Exd were highly enriched in both Ubx-Tin and Ubx-Grh regions, while the motif of the pioneer TF Zld, a partner identified in mesodermal- and neural-BioID, was enriched exclusively among the Ubx-Tin bound chromatin sites (Fig. 4d).
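Identifying regions bound by two factors "in close vicinity" comes down to comparing genomic intervals. The sketch below shows one way to do this, assuming peaks are given as (chromosome, start, end) tuples with half-open coordinates. The function name, the `max_gap` parameter and the toy coordinates are hypothetical; the distance criterion actually used in the study is not stated in this section.

```python
def cobound_regions(peaks_a, peaks_b, max_gap=0):
    """Return peaks from `peaks_a` lying within `max_gap` bp of a peak in `peaks_b`.

    Peaks are (chrom, start, end) tuples with half-open coordinates.
    Overlapping or directly adjacent peaks have a gap of 0.
    """
    by_chrom = {}
    for chrom, start, end in peaks_b:
        by_chrom.setdefault(chrom, []).append((start, end))

    hits = []
    for chrom, start, end in peaks_a:
        for b_start, b_end in by_chrom.get(chrom, []):
            # Distance between the two intervals; 0 if they overlap or touch.
            gap = max(b_start - end, start - b_end, 0)
            if gap <= max_gap:
                hits.append((chrom, start, end))
                break  # count each peak in peaks_a at most once
    return hits


# Toy coordinates, invented for illustration only.
factor_a_peaks = [("chr2L", 100, 200), ("chr2L", 1000, 1100), ("chr3R", 50, 150)]
factor_b_peaks = [("chr2L", 250, 300), ("chrX", 60, 120)]

within_50bp = cobound_regions(factor_a_peaks, factor_b_peaks, max_gap=50)
strict_overlap = cobound_regions(factor_a_peaks, factor_b_peaks)
```

The nested loop is quadratic in the number of peaks, which is acceptable for a sketch; sorted intervals or an interval tree would be the usual choice for genome-scale data.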
Lineage-specific differences at the enhancer/promoter levels were also reflected in the genes associated with the Ubx-Tin and Ubx-Grh co-bound regions, as GO terms related to mesoderm development were over-represented among the genes bound by Ubx and Tin, while GO terms of genes bound by Ubx and Grh were associated with several tissue lineages (Fig. 4a). The latter could be due to Ubx’s ability to repress the expression of alternative fate genes thereby realizing lineage development2. Consistently, we found that 60% of the genes targeted by Ubx and Grh in the nervous system were inactive, while the majority of genes (80%) bound by Ubx and Tin in the mesoderm were expressed (Fig. 4e). GO terms specific for the respective lineage were strongly enriched only among the active but not inactive genes bound by Ubx/Tin or Ubx/Grh (Supplementary Table 2c, d), suggesting that Ubx in combination with lineage-restricted TFs induces lineage-specific gene programs. Furthermore, the Ubx/Tin co-bound genes were more specifically related to dorsal heart vessel and cardiac cell fate commitment compared to genes bound independently by Tin and Ubx (Supplementary Table 2a, b). This suggested that the Ubx/Tin pair is involved in defining the cardiac cell fate, thereby conferring specificity to Ubx in mesoderm development. To provide further evidence that Ubx controls the expression of genes targeted in the respective tissues, we made use of our recently published resource that identified transcriptional profiles in the mesoderm when Ubx protein was tissue-specifically degraded2. We found the expression of 74 out of the 367 (20%) genes bound by Ubx and Tin significantly changed in the mesoderm in the absence of Ubx (Source Data file), which included the known Tin target gene bagpipe (bap)47 and Ubx target gene decapentaplegic (dpp)48.
We subsequently explored the functional interplay between Ubx and Tin in more detail using dpp as a model48, as we identified a Tin and Ubx ChIP peak in the well-characterized visceral mesoderm-specific dpp enhancer49,50, dpp674 (Fig. 5a). Notably, dpp RNA expression was lost in the visceral mesoderm in the absence of Ubx, which was also the case in tin homozygous mutants (Fig. 5b, e, f). As the visceral mesoderm is not specified in the absence of tin51, we analysed dpp expression in heterozygous tin and Ubx double mutants. dpp transcript levels were significantly reduced in heterozygous double mutants (Fig. 5d–h), showing that Ubx and Tin functions are required for the regulation of dpp expression. As our analysis showed that the dpp enhancer is bound by Ubx and Tin, we assumed that Ubx and Tin function in a combinatorial manner to activate dpp transcription. To support this hypothesis, we performed functional assays in Drosophila S2R+ cells by transiently expressing Tin, Ubx and the dpp674 enhancer, which controlled luciferase expression48. This analysis revealed that Ubx protein alone efficiently induced reporter gene expression even at low levels, while Tin was able to do so only at high protein concentrations (Fig. 5i). Co-expression of both proteins substantially increased luciferase expression driven by the dpp674 enhancer or by an artificial enhancer consisting of adjacent Ubx and Tin binding sites (Fig. 5i, Supplementary Fig. 7a). This effect was dependent on the homeodomains of Ubx and Tin, as reporter gene activation was not increased by Ubx and Tin protein versions unable to bind DNA (UbxN51A or TinN51A) (Fig. 5i, Supplementary Fig. 7b). In line, EMSA experiments confirmed the interaction of Ubx and Tin with the dpp enhancer, both independently and in a complex (Supplementary Fig. 7e). In sum, these results showed that Ubx and Tin functionally interacted on the dpp enhancer to activate gene expression.
Ubx interacts with its known cofactor Exd via two protein motifs, the hexapeptide (HX) and the UbdA domain, to regulate target genes24,25,26. Thus, we asked whether one of these domains was also required for the Ubx-Tin interaction. GST pull-down experiments using purified full-length Ubx and Tin proteins revealed that Ubx directly interacted with Tin, even more strongly than with Exd (Fig. 5n). To elucidate the requirements for this interaction, we generated truncated versions of Ubx (Fig. 5j). We found that only the full-length Ubx protein was highly efficient in pulling down Tin, whereas the individual domains pulled down Tin only to a lesser extent (Fig. 5k, l), which suggested that a combination of domains is required for robust and functional interaction between Ubx and Tin. In contrast, the interaction of Ubx and Exd depended to a large extent on the HX motif (Fig. 5m, Supplementary Fig. 7d), as previously described52. Notably, the interaction between Ubx and Tin was not influenced by the N51A amino acid exchange in the Ubx homeodomain (Fig. 5n, Supplementary Fig. 7c). These results showed that the interaction of Ubx with Tin, as with Exd, can occur independently of DNA binding. In contrast, the ability to bind DNA was required for functional cooperation of both TFs in vivo (Fig. 5i, Supplementary Fig. 7b) and enhanced the interaction in cells (Supplementary Fig. 6c, Supplementary Data 39).
In sum, these results showed that Ubx cooperates with the mesodermal master regulator Tin to promote lineage development. Furthermore, our results showed that Ubx utilizes different protein domains to interact with other TFs, which we assume to be the basis of Ubx’s ability to assemble cell-type-specific (co-)transcriptional networks that function at various levels of gene expression.
Lineage-specific functional cooperation with diverse partners
In a final step, we sought to provide evidence that the interactions of Ubx with proteins acting at different levels of gene expression were of functional relevance and necessary for lineage development. In addition, we wanted to test whether the specificity of interactomes identified by BioID is also reflected at the functional level. We focused our analysis on interactors identified in the mesoderm, with one exception: Brm, a BioID-identified interactor of Ubx in the mesoderm and nervous system (Fig. 2d). We set up genetic interaction assays between Ubx and tin, brm, Srp54 or snRNPU1-70K by crossing the tin346, brm2, Srp54DG02112 and snRNPU1-70K02107 null alleles into the Ubx1 mutant background. Mesodermal development was studied in single as well as double heterozygous (and homozygous) stage 16 mutant embryos by characterizing the muscle morphology using Tropomyosin 1 (Tm1) (Fig. 6). We found embryos heterozygous for individual mutations to be indistinguishable from w1118 control embryos (Fig. 6a, c–h)2,53, showing that a reduction of the dose of these genes did not affect the development of the mesodermal lineage. In contrast, prominent and distinct phenotypes were detected in the muscle lineage in double heterozygous mutants. While lateral muscles but not the ventral oblique muscles (VO4-VO6) were either lost or malformed in the first two abdominal segments (A1, A2) in Ubx1,tin346, brm2,Ubx1 and snRNPU1-70K02107;Ubx1 heterozygous mutants (Fig. 6a, i–j, l), an extra transversal muscle was found in Srp54DG02112;Ubx1 double heterozygous mutants (Fig. 6k). Moreover, these phenotypes were different from those observed in embryos carrying individual homozygous null alleles. For example, Ubx1;Ubx1 mutants displayed homeotic transformation of the A1 and A2 muscle pattern, including the absence of ventral oblique muscles (VO4-VO6) characteristic for thoracic segments (Fig. 6m)54, a phenotype not found in any of the double heterozygous mutants (Fig. 6i–l).
In line with this, homozygous brm2, snRNPU1-70K02107 as well as Srp54DG02112 mutants had thinned transversal muscles (Fig. 6o–q), which was not the case in heterozygous combinations with the Ubx1 allele (Fig. 6j–l). Importantly, we did not detect a phenotype in the neural lineage for the double heterozygous mutants of Ubx and the mesoderm-specific interactors (Tin, Srp54, snRNPU1-70K), as neither the number of neuroblasts (NBs) within the ventral nerve cord (VNC) nor the innervation of the ventral-lateral muscle 1 (VL1) of abdominal segments, both affected in Ubx mutant embryos, were changed in comparison to control animals (Supplementary Figs. 8–10). In contrast, double heterozygous brm2,Ubx1 mutant embryos were characterized by additional neuroblasts in the A1 segment (Supplementary Figs. 9 and 10), which is consistent with our data on Ubx interacting with Brm in the mesodermal and neural lineages. Finally, we also studied the interaction between Ubx and its BioID-identified neural partner Grh, as the two proteins co-localised in vivo and interacted by co-IP in cells (Supplementary Fig. 6g, h). Using Dpn stainings as read-out40,55,56,57, we found that single Ubx1 and grhIM homozygous mutants displayed supernumerary NBs in the A1 segment (Ubx, +4NBs) and in all abdominal segments (Grh), while single heterozygous mutants did not show a significant change in NB number (Supplementary Fig. 9). In contrast, grhIM;Ubx1 double heterozygous mutant embryos exhibited additional NBs in A1 and A2 segments, revealing a functional cooperation between Ubx and Grh during programmed cell death56,58,59. This interaction is of functional importance only in the neural lineage, as the muscle pattern was unaltered in grhIM;Ubx1 double heterozygous mutant embryos (Supplementary Fig. 9h–j).
In sum, these results demonstrated that the Hox TF Ubx functions not only via the interaction with other TFs at cis-regulatory modules but also uses a whole battery of proteins acting at different levels of gene expression. Importantly, most of the interactors are commonly expressed; nonetheless, the interactions with Ubx and the functional outputs are highly lineage- and factor-specific, enabling Ubx to control different aspects of development in a precise manner in diverse lineages.
Proteins interact with a multitude of partners in a highly specific yet dynamic and context-dependent manner, which is essential for a cell to adopt and maintain its appropriate fate. So far it has been challenging to capture these diverse and transient interactions due to the lack of sufficiently sensitive methods that identify, in an unbiased manner, factors in close proximity in different cellular contexts in the living organism. To fill this gap, we have designed a targeted proximity proteomics approach by combining BioID34 and the GAL4-UAS system38. We selected the Hox TF Ubx and the mesodermal, neural and neuroectodermal lineages as a model to verify the functionality of the system. Using this approach, which requires protein overexpression, we identified Ubx interactomes specific to each lineage. By comparing the Ubx interactomes identified by BioID to proteins known to physically interact with Ubx60, we found only a small overlap (Supplementary Fig. 11). This is in line with recent studies showing that different methods capture variable types of protein–protein interactions, which are all biologically relevant32. Our data support this notion, especially as we have validated a substantial number of Ubx interactions by co-IP. Analysing the proteins identified by other methods in more detail revealed that they were enriched for chromatin interacting proteins, in particular TFs. This bias is, however, not a result of Ubx’s preference to interact primarily with other TFs but intrinsic to the dataset, as it is largely based on Bi-molecular fluorescence complementation (BiFC) screens, which used pre-selected TFs to test their ability to interact with Ubx and other Hox proteins52,61. Thus, targeted BioID is a valuable and powerful method and ideally complements other approaches, as it captures dynamic, weak and specific interactions in vivo in an unbiased manner.
Our study, which analysed independent tissue lineages of comparable developmental stages, revealed that Ubx interacted with a largely non-overlapping set of proteins in the different cellular contexts. In contrast, Ubx interacted equally well with all the proteins identified by BioID in cellulo. These results demonstrated first, that the Hox protein Ubx has an intrinsically high interaction potential, which has been noted before52,61,62. Secondly, this high interaction potential is restricted to a few specific ones in vivo, where the cellular context dictates the type of interactions. Importantly, our genetic interaction studies demonstrated that these context-specific interactions are of functional importance in vivo and indeed active only in specific lineages. One question arising from this behaviour is how interaction specificity, which allows a precise matching of Hox function and activity to the cell type and developmental stage, is achieved. It is known that Ubx protein, like many other TFs, harbours intrinsically disordered domains that are important for selecting interacting partners63,64,65. Thus, the few lineage-restricted Ubx interactors identified in this study, in particular Tin or Grh, could be responsible for Ubx’s differential interaction potential by binding to these intrinsically disordered domains. They could enforce lineage-specific protein conformations that can only be bound by a subset of the many Ubx interactors. In line, we found that the interaction of Ubx with Tin required the full-length Ubx protein, and was not driven by previously characterized structured domains like the homeodomain or HX motif. In addition, it is known that intrinsically disordered domains are the predominant sites of post-translational modifications66. They can have a pronounced effect on the structural and physicochemical properties of a protein, modulating the composition of protein complexes. 
Thus, it is tempting to speculate that the different interactomes assembled by Ubx in the mesodermal and neural lineages are also dependent on specific post-translational modifications, which are cell type- and stage-specifically written on intrinsically disordered domains of Ubx. Consistently, it is now more and more realized that Hox TFs are heavily modified at the post-translational level67,68. In future, it will be crucial to characterize Ubx-Tin and Ubx-Grh complexes on the structural-functional level and to study cell type-specific post-translational modifications of Ubx in vivo to resolve the specificity problem intrinsic to Hox TFs.
Another striking finding of our study is that although Ubx interactions were distinct in the different tissue types, most of the proteins were not lineage-specifically expressed but active in many cell types. Indeed, the majority of Ubx interactors encoded ubiquitously expressed proteins, which are part of complexes controlling general aspects of gene expression. This included regulators of the chromatin landscape with an emphasis on chromatin remodelling components, proteins of the Polycomb complex, and major regulators of mRNA processing and protein translation. It is well-described that mRNA processing is a co-transcriptional process69,70. Moreover, the chromatin environment affects transcription at different levels by modulating enhancer accessibility71 or the speed rate of the RNA-polymerase II through gene bodies72. Similarly, recent studies revealed that chromatin regulators, such as components of the remodelling complex SWI/SNF, interact with snRNP proteins73. Our proteomics and functional data now showed that all these proteins, which act at different control levels of gene expression programs, converge on the Hox TF Ubx (Fig. 7). Thus, Ubx seems to act as a protein platform that integrates in a highly flexible manner multiple regulatory inputs, possibly via its intrinsically disordered domains, to realize the many different yet specific outputs. Consistent with that view, it has been shown recently that Ubx forms dynamic sub-nuclear protein clusters, so-called micro-environments, that promote gene expression in vivo74. In that respect, the currently discussed phase separation model for transcriptional regulation is of particular interest75,76,77. It represents a concentration of regulatory proteins in active nuclear sub-domains driven by weak and dynamic interactions, in defined cellular condensates that we seem to have captured in vivo (Fig. 7). 
In the future, it will be highly relevant to relate the dynamic Ubx transcriptional hubs with the lineage-specific interaction networks identified in this study to elucidate how such multivalent interactions control precise gene expression programs, which realize and maintain specific cell fates.
Fly line and materials
For the BioID, nlsGFP, UbxWT and UbxN51A (site directed mutagenesis) were generated, cloned in pUAST-attB-myc-BirA*-GGSGG- (BioID cloned from #35700, Addgene) and the constructs were integrated stably on the third chromosome using the Bestgene service. The resulting UAS-BioID lines were crossed into the twist-GAL4 (twi), elav-GAL4 and scabrous-GAL4 (sca) backgrounds to generate elav-GAL4;;UAS-mB*nlsGFP, sca- and twi-GAL4;UAS-mB*nlsGFP stable lines. For UAS-mB*UbxWT and UbxN51A, males were crossed with females carrying the GAL4 driver. Plasmids generated for the study, oligonucleotides and fly lines (generated, generously provided or from the Bloomington center) are listed and referenced in Supplementary Table 3 and are available upon request. myc-BioID2-MCS was a gift from Kyle Roux (Addgene plasmid # 74223; http://n2t.net/addgene:74223; RRID:Addgene_74223). pcDNA3.1 mycBioID was a gift from Kyle Roux (Addgene plasmid # 35700; http://n2t.net/addgene:35700; RRID:Addgene_35700).
Cell culture and transfection
S2R+ Drosophila cells (generously provided by the Tobias Dick lab (DKFZ Heidelberg), originating from the Drosophila Genomics Resource Center) were maintained at 25 °C in Schneider medium supplemented with 10% FCS, 10 U/ml penicillin and 10 µg/ml streptomycin. Cells were simultaneously seeded and transfected with Effectene (Qiagen) according to the manufacturer’s protocol. Cells were harvested in Phosphate Buffered Saline (PBS) and pellets were resuspended in RIPA buffer supplemented with protease inhibitor cocktail (Sigma-Aldrich). For interaction assays, 10 × 10^6 cells were seeded in 100 mm dishes. Biotin treatment (Sigma) was applied for 24 h after transfection. Cells were harvested in PBS 48 h after transfection and pellets were resuspended in lysis buffer supplemented with protease inhibitor cocktail (Sigma-Aldrich) and 1 mM of DTT. For luciferase assays, cells were co-transfected with pRT-TK-Renilla or pActin-β-Galactosidase plasmid (Promega) for normalization. Cells were harvested 48 h after transfection and β-galactosidase, Renilla and Firefly activities were measured using the beta-Galactosidase or Dual-luciferase detection kit (Promega). Plasmids are listed in Supplementary Table 3a.
Co-immunoprecipitation from cell and embryo lysates
For co-immunoprecipitation assays, cells were harvested in Phosphate Buffered Saline (PBS) and pellets were resuspended in NP40 buffer (20 mM Tris pH 7.5, 150 mM NaCl, 2 mM EDTA, 1% NP40) and treated with Benzonase (Sigma). GFP-Trap beads (Chromotek) were added to the protein extract, incubated for 2 h and washed five times with NP40 buffer. For in vivo interaction, overnight collections of embryos were dechorionated, fixed with 3.2% formaldehyde and collected in PBS supplemented with 0.1% Tween. Pellets were resuspended in buffer A (10 mM Hepes pH 7.9, 10 mM KCl, 1.5 mM MgCl2, 0.34 M sucrose, 10% glycerol) and dounced 25–30 times with the loose and 5 times with the tight pestle. Lysates were incubated with 0.1% Triton and centrifuged. Nuclear pellets were then resuspended in buffer B (3 mM EDTA pH 8, 0.2 mM EGTA pH 8), sonicated (Picoruptor, Diagenode), and treated with Benzonase. Four to five milligrams of nuclear lysate were diluted in NP40 buffer (20 mM Tris pH 7.5, 150 mM NaCl, 2 mM EDTA, 1% NP40) and incubated overnight with 40 µl of GFP-Trap beads. Beads were then washed five times with NP40 buffer and all samples were resuspended in Laemmli buffer for immunoblotting analysis. All buffers were supplemented with protease inhibitor cocktail (Sigma), 1 mM of DTT and 0.1 mM PMSF. Input fractions represent 1–10% of the immunoprecipitated fraction.
SDS-PAGE and immunoblotting
For western blot analysis, proteins were resolved on 8–15% SDS-PAGE, blotted onto PVDF membrane (Biorad) and probed with specific antibodies after blocking. The antibodies (and their dilutions) used in this study were Ubx (home-made, 1/200), Cg (generously provided by William Brook, 1/500), Histone 3 (1791 Abcam, 1/10,000), GFP (A11122 Life Technologies, 1/3000), myc (SC40 Santa Cruz, 1/500), Streptavidin-HRP (RPN1231 GE-healthcare, 1/500), CtBP (generously provided by David Arnosti, 1/500), Zld (generously provided by Julia Zeitlinger, 1/500), Tin (generously provided by Manfred Frasch, 1/1000), M1BP (generously provided by Andy Saurin, 1/500), Pc (generously provided by Jürg Müller, 1/200), Tubulin (MCA77G Serotec/Biorad, 1/2000), HA (3724 Cell Signaling, 1/3000), V5 (13202 Cell Signaling, 1/3000), GST (2624 Cell Signaling, 1/5000), Flag-M2 (F1804 Sigma, 1/1000), Med19 (generously provided by Muriel Boube, 1/500). Detection was performed by chemiluminescence (ECL, GE-Healthcare) with HRP-coupled secondary antibodies (Promega, 1/5000).
Protein purification and GST pull-down
All His-tagged and GST-tagged proteins were cloned for this study in pET or pGEX-6P plasmids, respectively. His- and GST-tagged proteins were produced in the BL-21 (RIPL) bacterial strain, purified on Ni-NTA agarose beads (Qiagen) or Glutathione-Sepharose beads (GE-Healthcare) and quantified by Coomassie staining. His-tagged proteins were specifically eluted from the beads with imidazole. In vitro interaction assays were performed with equal amounts of GST or GST fusion proteins in affinity buffer (20 mM HEPES, 10 μM ZnCl2, 0.1% Triton, 2 mM EDTA) supplemented with NaCl, 1 mM of DTT, 0.1 mM PMSF and protease inhibitor cocktail (Sigma). Proteins produced in vitro were subjected to interaction assays for 2 h at 4 °C under mild rotation. Bound proteins were washed four times and resuspended in Laemmli buffer for western blot analysis. Input fraction was loaded as indicated.
The commercially produced 5′-Cy5-labelled complementary oligonucleotides (Eurofins) were annealed before the reaction. The sequences used for this study were the following: Ubx sites: Cy5-5′-TTCAGAGCGAATGATTTATGACCGGTCAAG-3′. For dpp-labelled probes, PCR-labelling was used to generate DNA fragments of the 675 bp enhancer with the following primers: F1 (188 bp) Cy5-5′-GGATCCGAAATAGTTAGTGTA-3′ and Cy5-5′-ACCAGGGGTTCTTCTTCGAC-3′, F2 (192 bp) Cy5-5′-CCTGAATCCCGACACAACCC-3′ and Cy5-5′-TAAAACAACGGATCGTGCAT-3′, F3 (150 bp) Cy5-5′-CAATCGCTGTAAATAAATAG-3′ and Cy5-5′-CGGCAAATTGCAGCGCGCAT-3′, F4 (145 bp) Cy5-5′-CCATTCGGCTCAACAGTTAT-3′ and Cy5-5′-GTGGGCCACAAATCAAATTG-3′. The F3-fragment was further used for the study. The binding reaction was performed for 20 min in a volume of 30 μl containing 1x Binding Buffer (20 mM Hepes pH 7.9, 1.4 mM MgCl2, 1 mM ZnSO4, 40 mM KCl, 0.1 mM EDTA, 5% Glycerol), 0.2 μg Poly(dI-dC), 0.1 μg BSA, 10 mM DTT and 0.1% NP40. His-purified proteins were used for each reaction. Antibodies were added as indicated for 10 additional min (13202, Cell Signaling, V5; 2396, Cell Signaling, MBP). Separation was carried out (200 V, 50 min for 30 bp probes; 150 V, 1 h 15 min for >100 bp probes) at 4 °C on a 6% acrylamide gel in 0.5x Tris-borate-EDTA buffer to visualize complex formation by retardation. Cy5-labelled DNA-protein complexes were detected by fluorescence using an INTAS Imager.
Similar to co-immunoprecipitation, dechorionated embryos (staged at 29 °C, according to Fig. 1b) were rinsed with Embryo Collection Buffer (0.7% NaCl; 0.1% Triton) and embryo pellets were frozen (−80 °C). Pellets were resuspended in buffer A, dounced 40 times with the loose and 10 times with the tight pestle and transferred through a Miracloth membrane to a new tube. Lysates were incubated with 0.1% Triton and centrifuged at 1500 × g for 5 min at 4 °C. Nuclear pellets were washed with Buffer A and centrifuged once more. Nuclear pellets were then resuspended in buffer B, sonicated (Picoruptor, Diagenode), treated with Benzonase and centrifuged at maximum speed. For each affinity purification (AP), 3–6 mg of nuclear extract were used and two APs were combined for each sample. Protease-resistant streptavidin beads (patent pending) were equilibrated with two PBS washes and resuspended in RIPA buffer supplemented with 1% SDS (50 mM Tris pH 8, 150 mM NaCl, 0.5% sodium deoxycholate, 1% NP40, 1% SDS). Cleared nuclear extracts were incubated with 60 µl of streptavidin beads in a final volume of 1.5 ml RIPA-SDS for 4–5 h. Beads were then washed twice with SDS-Buffer (10 mM Tris.HCl, 1 mM EDTA, 1% SDS, 200 mM NaCl), twice with RIPA-SDS and twice with acetonitrile buffer (20% acetonitrile in MS-grade water).
Mass spectrometry preparation
Streptavidin beads were resuspended in 14 µl of 50 mM ammonium bicarbonate and proteins were subjected to reduction with 1 µl DTT (100 µM) at 60 °C for 15 min followed by alkylation with 1 µl of iodoacetamide (IAA, 200 mM) for 45 min at room temperature in the dark. Protein digestion was performed on beads with a Trypsin/LysC mix (Promega, V5071) at 37 °C for 14 h. Peptides were de-salted using the SP3 protocol as previously described78,79,80. Peptides were eluted in 0.1% trifluoroacetic acid (TFA), loaded on a trap column (Thermo Acclaim PepMap 100 C18 Nano-Trap, 100 μm × 20 mm) and separated over a 50 cm analytical column (Waters nanoEase BEH, 75 μm × 250 mm, C18, 1.7 μm, 130 Å) using the Thermo Easy nLC 1200 nanospray source (Thermo Fisher Scientific). Solvent A was water with 0.1% formic acid and solvent B was 80% acetonitrile, 0.1% formic acid. During the elution step, the percentage of solvent B increased in a linear fashion from 3 to 8% in 4 min, then increased to 10% in 2 min, to 32% in 68 min, to 50% in 12 min and finally to 100% in a further 1 min and went down to 3% for the last 11 min. Peptides were analyzed on a Tri-Hybrid Orbitrap Fusion mass spectrometer (Thermo Fisher Scientific) operated in positive (+2 kV) data dependent acquisition mode with HCD fragmentation. The MS1 and MS2 scans were acquired in the Orbitrap and ion trap, respectively, with a total cycle time of 3 s. MS1 detection occurred at 120,000 resolution, AGC target 1E6, maximal injection time 50 ms and a scan range of 375–1500 m/z. Peptides with charge states 2–4 were selected for fragmentation with an exclusion duration of 40 s. MS2 occurred with CE 33%, detection in topN mode and scan rate set to Rapid. The AGC target was 1E4 and the maximal injection time was 50 ms. Data were recorded in centroid mode.
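As a reading aid for the elution gradient described above, the following minimal Python sketch turns the stated segments into a table of breakpoints and interpolates the percentage of solvent B at any time point. The function and variable names are hypothetical, and the final segment ("went down to 3% for the last 11 min") is modelled here as a linear return to 3%, which is an assumption; the instrument method may instead step down and hold.

```python
# Gradient segments as (duration in min, target % solvent B), transcribed from the text.
START_PERCENT_B = 3.0
SEGMENTS = [(4, 8.0), (2, 10.0), (68, 32.0), (12, 50.0), (1, 100.0), (11, 3.0)]


def gradient_breakpoints(start_b=START_PERCENT_B, segments=SEGMENTS):
    """Return cumulative (time_min, percent_B) breakpoints, starting at t = 0."""
    t = 0.0
    points = [(t, start_b)]
    for duration, target in segments:
        t += duration
        points.append((t, target))
    return points


def percent_b(t, points):
    """Linearly interpolate % solvent B at time t (min) between breakpoints.

    Assumption: every segment, including the last one, is a linear ramp.
    """
    if t < points[0][0] or t > points[-1][0]:
        raise ValueError("time outside the gradient")
    for (t0, b0), (t1, b1) in zip(points, points[1:]):
        if t0 <= t <= t1:
            return b0 + (b1 - b0) * (t - t0) / (t1 - t0)


if __name__ == "__main__":
    pts = gradient_breakpoints()
    for time_min, b in pts:
        print(f"{time_min:5.0f} min  {b:5.1f} % B")
    print("total run time (min):", pts[-1][0])
```

Summing the segment durations gives a total method length of 98 min under this reading.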
Short egg collections (3–8 h at 29 °C) followed by 12–18 h of additional development were dechorionated, transferred into a glass vial with 4 ml heptane/4 ml methanol and shaken vigorously. Embryos/larvae were then washed four times with methanol and four times with water containing 0.1% Tween. Larvae were subsequently mounted between glass in Hoyer’s medium and incubated for 2–3 days at 60 °C. Photographs were taken with an Axio Imager.M1 (Zeiss), objective ×40, using brightfield. For microscopy, all analyses were performed with Fiji (Fiji is Just ImageJ).
Immunofluorescence, in situ hybridization and imaging
For immunostainings2, embryos were dechorionated, fixed with formaldehyde supplemented with heptane, and the vitelline membrane was removed using methanol. Embryos were washed in PBS-Tween 0.1%, blocking was performed with BSA 1% in PBS-Tween and primary antibodies were incubated overnight. Secondary antibodies coupled to fluorophores (1/200, Jackson) were incubated for 2 h the following day and embryos were mounted in Vectashield-DAPI. The following antibodies were used: Elav (1/50, DSHB), GFP (1/300, Invitrogen, A11122), Myc (1/300, Santa Cruz, SC40), Ubx (1/100, home-made), Cg (1/200, generously provided by William Brook), CtBP (1/1000, generously provided by David Arnosti), Zld (1/500, generous gift from Julia Zeitlinger), Tin (1/1000, generous gift from Manfred Frasch), Grh (1/100, generous gift from Bill McGinnis), Beta-Galactosidase (1/1000, Promega, Z3783), Digoxigenin (1/1000, Roche), Tm1 (1/1000, Abcam, ab50567), Fasciclin2 (1/50, DSHB), Engrailed (En) (1/2.5, DSHB), Deadpan (Dpn, generous gift from Jürgen Knoblich and Ana Rogulja-Ortmann). Streptavidin (1/500, Perkin-Elmer) was revealed with the TSA system (Perkin-Elmer). Images were acquired on the Leica SP8 Microscope using standard ×20 and ×63 objectives. The collected images were analyzed and processed with the Leica program and Fiji.
For dpp transcript quantification, all pictures were processed and analysed with identical parameters. A stack of six z-slices (=9 µm) containing the signal of interest was selected to generate a ‘Maximum Intensity Z-projection’. Background was subtracted from the ‘Maximum Intensity Z-projection’. A relative signal was obtained as the ratio of the mean grey value of the 488 channel to the mean grey value of the DAPI channel in the region of interest (ROI). A relative background was obtained identically using the same ROI outside of the dpp signal. Finally, ‘relative signal over background’ was obtained from the ratio of ‘relative signal’ to ‘relative background’. Altogether, the calculation can be summarized by the following formula: relative signal over background = (mean grey value Alexa488 / mean grey value DAPI)_signal / (mean grey value Alexa488 / mean grey value DAPI)_background. For Tm1 and Fas2 staining, ‘maximum intensity Z-projections’ were created in Fiji from Z-stacks acquired at 1.1 µm intervals, using 15–25 slices and whole-embryo stacks, respectively. Embryos were selected for heterozygous or homozygous genotype according to β-galactosidase expression driven via the balancer chromosome (CyO-wg>LacZ, TM6-Dfd>LacZ). Quantifications were performed by blind observation of muscle patterns for 50 embryos per genotype (including heterozygous and homozygous mutants without distinction). Different categories of phenotypes were defined according to the blind observation performed: strong, medium, mild and normal pattern. Embryos homozygous for balancer chromosomes were not always included as the general shape of these embryos was altered, thus modifying the theoretical percentage of penetrance expected from Mendelian inheritance. Taking this parameter into consideration, the percentage windows of genotype–phenotype of the different fly lines were the following:
1. Ubx1,tin346/TM6,Dfd>LacZ, brm2,Ubx1/TM6-Dfd>lacZ:
Ubx1,tin346 and brm2,Ubx1 homozygous: 25–33% (only strong phenotype).
Ubx1,tin346/TM6,Dfd>LacZ and brm2,Ubx1/TM6,Dfd>LacZ heterozygous: 50–66% (hardly visible (normal) to mild phenotype).
Balancer homozygous (TM6-Dfd>lacZ /TM6,Dfd>LacZ): 1–25% (normal phenotype or too altered).
2. Srp54DG02112/CyO-wg>lacZ;Ubx1/TM6-Dfd>lacZ, snRNPU1-70K02107/CyO-wg>lacZ; Ubx1/TM6-Dfd>lacZ:
Double homozygous Srp54DG02112;Ubx1 and snRNPU1-70K02107;Ubx1: 6.25–11% (strong phenotype).
Srp54DG02112/CyO-wg>lacZ;Ubx1 and snRNPU1-70K02107/CyO-wg>lacZ;Ubx1: 12.5–22% (medium to strong phenotype).
Srp54DG02112/CyO-wg>lacZ;Ubx1/TM6,Dfd>LacZ and snRNPU1-70K02107/CyO-wg>lacZ;Ubx1/TM6,Dfd>LacZ: 12.5–22% (medium to strong phenotype).
Double heterozygous Srp54DG02112;Ubx1/TM6,Dfd>LacZ and snRNPU1-70K02107;Ubx1/TM6,Dfd>LacZ: 25–44% (mild phenotype).
Single heterozygous: 1–37.5% (normal phenotype).
Homozygous for balancers: 1–6.25% (normal phenotype or too altered).
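The percentage windows for the mutant classes listed above are consistent with simple Mendelian expectations for self-crossed balanced stocks: the lower bound counts all progeny, the upper bound excludes embryos homozygous for any balancer. The following Python sketch illustrates this reading; it is an interpretation rather than the authors' procedure, the names `genotype_window`, `"hom"` and `"het"` are hypothetical, and the chromosomes are assumed to segregate independently. It does not attempt to reproduce the windows given for single heterozygous or balancer-homozygous embryos.

```python
from fractions import Fraction

# Expected outcome per balanced chromosome in a cross m/Bal x m/Bal.
P_HOM = Fraction(1, 4)  # m/m
P_HET = Fraction(1, 2)  # m/Bal
P_BAL = Fraction(1, 4)  # Bal/Bal (shape often too altered to be scored)


def genotype_window(states):
    """Return (lower, upper) expected fractions for a mutant genotype class.

    states: tuple with one entry per balanced chromosome, each "hom" or "het".
    lower:  fraction among all progeny.
    upper:  fraction when every embryo carrying a Bal/Bal chromosome is excluded.
    """
    per_state = {"hom": P_HOM, "het": P_HET}
    p_all = Fraction(1)
    for state in states:
        p_all *= per_state[state]
    p_scorable = (1 - P_BAL) ** len(states)
    return p_all, p_all / p_scorable


if __name__ == "__main__":
    examples = {
        "one chromosome, homozygous": ("hom",),
        "one chromosome, heterozygous": ("het",),
        "two chromosomes, double homozygous": ("hom", "hom"),
        "two chromosomes, homozygous + heterozygous": ("hom", "het"),
        "two chromosomes, double heterozygous": ("het", "het"),
    }
    for label, states in examples.items():
        low, high = genotype_window(states)
        print(f"{label}: {float(low) * 100:.2f}% to {float(high) * 100:.1f}%")
```

Under these assumptions the sketch returns 1/4 to 1/3 and 1/2 to 2/3 for a single balanced chromosome, and 1/16 to 1/9, 1/8 to 2/9 and 1/4 to 4/9 for two balanced chromosomes, matching the 25–33%, 50–66%, 6.25–11%, 12.5–22% and 25–44% windows.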
Deadpan staining was used for quantification of neural cells and neuroblasts, and Engrailed for marking the segment boundaries. The numbers of cells per segment were counted using Z-stacks of stage 17 embryos in ventral position.
Fasciclin 2 staining imaged with the ×63 objective was used to analyse the motoneuron phenotype by quantifying the innervation of the first ventral-lateral muscle (VL1) of abdominal segments A2–A7. Phenotypes were classified as follows: normal, misrouted/no connection, and reduced connection, for which innervation reaches the muscle but no connection is observed. Statistical analyses were performed using one-way ANOVA and Chi2 test.
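The ‘relative signal over background’ ratio used for the dpp transcript quantification in this section can be written out as a short Python sketch. It assumes that the pixel grey values of each region of interest have already been exported from the background-subtracted maximum intensity projection (for example from Fiji); the function name and arguments are hypothetical.

```python
from statistics import mean


def relative_signal_over_background(signal_488, signal_dapi,
                                    background_488, background_dapi):
    """Ratio of DAPI-normalised 488 signal inside and outside the dpp domain.

    Each argument is a sequence of pixel grey values from one ROI:
    the signal ROI and an identically sized ROI outside of the dpp signal.
    """
    relative_signal = mean(signal_488) / mean(signal_dapi)
    relative_background = mean(background_488) / mean(background_dapi)
    return relative_signal / relative_background


if __name__ == "__main__":
    # Toy values for illustration only, not measured data.
    ratio = relative_signal_over_background(
        signal_488=[40, 60], signal_dapi=[20, 30],
        background_488=[10, 10], background_dapi=[20, 20],
    )
    print(ratio)
```

Normalising both regions to DAPI before taking the ratio compensates for differences in nuclear density and staining depth between embryos.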
Mass spectrometry analysis
Each experiment included nlsGFP, UbxWT and UbxN51A samples and was performed in four independent biological replicates. Raw mass spectrometry data were analysed using the free MaxQuant software including the Andromeda search engine81,82,83. Peptide identification was performed using the Uniprot database of Drosophila melanogaster (canonical and isoform). Default parameters of MaxQuant were used with the following modifications: digestion by Trypsin/P and LysC, lysine biotinylation as variable modification (as well as methionine oxidation and N-terminal acetylation), cysteine carbamidomethylation as fixed modification, instrument set to Orbitrap (with precursor tolerance 20 ppm, MS tolerance 0.5 Da), match between runs option activated, FDR 1%, label-free quantification (LFQ) and iBAQ calculated (Supplementary Data 1–3). Protein enrichment was calculated using the LFQ Log2 ratio (WT/GFP, N51A/GFP) and normalized on the median value (Supplementary Data 4–33). For each ratio, the distribution (90%) and corresponding standard deviations (SD) were calculated to define the proteins significantly enriched (ratio > confidence interval defined as median ± 2 SD). Values divided by 0 (infinite ratios) were imputed for the calculation of confidence intervals (Supplementary Data 4–27). Each ratio is then referred to as a replicate, associated with a list of significantly enriched proteins (Supplementary Data 28–34). Subsequently, proteins significantly enriched in at least 2 replicates were considered biologically relevant, taking into account biological variability and stochasticity of the MS-process, and were used for further analysis (Supplementary Data 28–35).
Enriched proteins from the different ratios (WT/GFP, N51A/GFP) were then compared to discriminate proteins enriched in the chromatin fraction (WT/GFP, N51A/GFP excluded) from those enriched more generally in the nucleus (WT/GFP + N51A/GFP) and those enriched more freely in the nucleoplasm (N51A/GFP, WT excluded) (Supplementary Data 36–38).
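The enrichment and classification logic described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the pipeline used for the study: all function names are hypothetical, "distribution (90%)" is interpreted as computing the SD on the central 90% of the median-normalised log2 ratios, and proteins with a zero intensity in bait or control are simply skipped instead of being imputed.

```python
import math
from collections import Counter
from statistics import median, stdev


def enriched_in_replicate(lfq_bait, lfq_control, n_sd=2.0, central=0.90):
    """Return proteins whose normalised log2(bait/control) exceeds median + n_sd * SD.

    lfq_bait, lfq_control: dict mapping protein ID to LFQ intensity (one replicate).
    Assumption: zero or missing intensities are skipped rather than imputed.
    """
    ratios = {}
    for protein, bait in lfq_bait.items():
        control = lfq_control.get(protein, 0)
        if bait > 0 and control > 0:
            ratios[protein] = math.log2(bait / control)
    centre = median(ratios.values())
    normalised = {p: r - centre for p, r in ratios.items()}

    # Assumption: SD is estimated on the central 90% of the distribution.
    values = sorted(normalised.values())
    trim = int(len(values) * (1 - central) / 2)
    core = values[trim:len(values) - trim] if trim else values
    cutoff = median(core) + n_sd * stdev(core)
    return {p for p, r in normalised.items() if r > cutoff}


def reproducible_hits(hit_sets, min_replicates=2):
    """Keep proteins enriched in at least min_replicates replicate lists."""
    counts = Counter(p for hits in hit_sets for p in hits)
    return {p for p, n in counts.items() if n >= min_replicates}


def classify(wt_hits, n51a_hits):
    """Split hits by enrichment with UbxWT only, both baits, or UbxN51A only."""
    return {
        "chromatin": wt_hits - n51a_hits,
        "nuclear": wt_hits & n51a_hits,
        "nucleoplasm": n51a_hits - wt_hits,
    }


if __name__ == "__main__":
    # Toy data for illustration only.
    control = {f"P{i}": 100.0 for i in range(21)}
    bait = dict(control, P20=3200.0)
    print(enriched_in_replicate(bait, control))
    print(classify({"A", "B"}, {"B", "C"}))
```

Keeping the three steps as separate functions mirrors the order of the text: per-replicate thresholding, filtering for reproducibility across replicates, then comparison of the WT/GFP and N51A/GFP lists.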
Data analysis and visualisation
For proteome analysis, Perseus free software was used to generate dot-plot (Pearson, Valid pair value) and clustering visualization (heat map and PCA)84, based on LFQ log10 value of protein expressed after Perseus canonical filtering (Reverse, Potential Contaminant, Only identified by site) and replacement of missing values.
Functional networks of Ubx interactome were generated with STRING software85, based on 0.150 interaction score of experimental evidence and database and pathway co-occurrence. Visualization of networks was built with Cytoscape free software86.
GO-term annotation and over-representation analysis of GO terms related to biological processes were performed with the web tool PANTHER using Fisher's test and FDR correction. Comparison of the Ubx and Tin45 and the Ubx and Grh46 genomic profiles was done as described2. The subsequent motif searches on defined regions of 1 kb were performed with the web tool AME of the MEME suite with default parameters and Fisher's test.
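For readers unfamiliar with this type of over-representation statistics, the sketch below shows a one-sided Fisher (hypergeometric) test for a single GO term followed by a Benjamini-Hochberg adjustment across terms. It is a generic illustration written with the Python standard library, not a reimplementation of PANTHER; that the FDR correction corresponds to Benjamini-Hochberg is an assumption, and the function names are hypothetical.

```python
from math import comb


def overrepresentation_p(k, n, K, N):
    """One-sided Fisher/hypergeometric p-value P(X >= k).

    N: genes in the reference list, K: reference genes annotated with the term,
    n: genes in the test list,      k: test-list genes annotated with the term.
    """
    total = comb(N, n)
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p-value to enforce monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


if __name__ == "__main__":
    # Toy numbers for illustration only.
    p = overrepresentation_p(k=8, n=50, K=100, N=10000)
    print(p)
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```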
Statistical analyses were performed using one-way ANOVA (luciferase assay, signal intensity of mRNA dpp expression level, genetic interaction quantification) and Chi2 test for VL1 innervation phenotype to genotype analysis of motoneuron pattern.
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
Raw data of MS analysis, Uniprot and contaminant databases and MaxQuant files that support the findings of this study have been deposited in PRIDE (https://www.ebi.ac.uk/pride/archive) with the accession code PXD0144818.
Freely accessible datasets used in the study are listed below:
ChIP-on-ChIP of Tin: GSE41628.
ChIP-seq of Grh: GSE83305 using 5–6 h ChIP-seq collection.
Tissue-specific transcriptome and upon Ubx depletion: GSE121670.
Tissue-specific ChIP-seq of Ubx: GSE121752.
Castelli-Gair, J., Greig, S., Micklem, G. & Akam, M. Dissecting the temporal requirements for homeotic gene function. Dev. Camb. Engl. 120, 1983–1995 (1994).
Domsch, K. et al. The Hox transcription factor Ubx stabilizes lineage commitment by suppressing cellular plasticity in Drosophila. eLife 8, e42675 (2019).
Hombría, J. C.-G. & Lovegrove, B. Beyond homeosis—HOX function in morphogenesis and organogenesis. Differentiation 71, 461–476 (2003).
Zhou, Q. et al. A mouse tissue transcription factor atlas. Nat. Commun. 8, 15089 (2017).
Deschamps, J. & Duboule, D. Embryonic timing, axial stem cells, chromatin dynamics, and the Hox clock. Genes Dev. 31, 1406–1416 (2017).
Junion, G. et al. A transcription factor collective defines cardiac cell fate and reflects lineage history. Cell 148, 473–486 (2012).
Koenecke, N., Johnston, J., He, Q., Meier, S. & Zeitlinger, J. Drosophila poised enhancers are generated during tissue patterning with the help of repression. Genome Res. 27, 64–74 (2017).
Zentner, G. E., Tesar, P. J. & Scacheri, P. C. Epigenetic signatures distinguish multiple classes of enhancers with distinct cellular functions. Genome Res. 21, 1273–1283 (2011).
Carnesecchi, J. et al. ERRα induces H3K9 demethylation by LSD1 to promote cell invasion. Proc. Natl Acad. Sci. USA 114, 3909–3914 (2017).
Guruharsha, K. G. et al. A protein complex network of Drosophila melanogaster. Cell 147, 690–703 (2011).
Rhee, D. Y. et al. Transcription factor networks in Drosophila melanogaster. Cell Rep. 8, 2031–2043 (2014).
Carnesecchi, J., Pinto, P. B. & Lohmann, I. Hox transcription factors: an overview of multi-step regulators of gene expression. Int. J. Dev. Biol. 62, 723–732 (2018).
Auboeuf, D., Hönig, A., Berget, S. M. & O’Malley, B. W. Coordinate regulation of transcription and splicing by steroid receptor coregulators. Science 298, 416–419 (2002).
Braun, P. & Gingras, A.-C. History of protein-protein interactions: from egg-white to complex networks. Proteomics 12, 1478–1498 (2012).
Sonawane, A. R. et al. Understanding tissue-specific gene regulation. Cell Rep. 21, 1077–1088 (2017).
Pearson, J. C., Lemons, D. & McGinnis, W. Modulating Hox gene functions during animal body patterning. Nat. Rev. Genet. 6, 893–904 (2005).
Passner, J. M., Ryoo, H. D., Shen, L., Mann, R. S. & Aggarwal, A. K. Structure of a DNA-bound Ultrabithorax-Extradenticle homeodomain complex. Nature 397, 714–719 (1999).
Brodu, V., Elstob, P. R. & Gould, A. P. abdominal A specifies one cell type in Drosophila by regulating one principal target gene. Development 129, 2957–2963 (2002).
Sorge, S. et al. The cis-regulatory code of Hox function in Drosophila. EMBO J. 31, 3323–3333 (2012).
Mann, R. S. & Chan, S. K. Extra specificity from extradenticle: the partnership between HOX and PBX/EXD homeodomain proteins. Trends Genet. 12, 258–262 (1996).
Mann, R. S., Lelli, K. M. & Joshi, R. In Current Topics in Developmental Biology vol. 88 63–101 (Elsevier, 2009).
Merabet, S. & Lohmann, I. Toward a new twist in Hox and TALE DNA-binding specificity. Dev. Cell 32, 259–261 (2015).
Merabet, S. & Mann, R. S. To be specific or not: the critical relationship between Hox and TALE proteins. Trends Genet. 32, 334–347 (2016).
Mann, R. S. & Chan, S. K. Extra specificity from extradenticle: the partnership between HOX and PBX/EXD homeodomain proteins. Trends Genet. 12, 258–262 (1996).
Merabet, S. et al. A unique Extradenticle recruitment mode in the Drosophila Hox protein Ultrabithorax. Proc. Natl Acad. Sci. USA 104, 16946–16951 (2007).
Saadaoui, M. et al. Selection of distinct Hox-Extradenticle interaction modes fine-tunes Hox protein activity. Proc. Natl Acad. Sci. USA 108, 2276–2281 (2011).
Aspland, S. E. & White, R. A. Nucleocytoplasmic localisation of extradenticle protein is spatially regulated throughout development in Drosophila. Development 124, 741–747 (1997).
Fabre, B. et al. Analysis of Drosophila melanogaster proteome dynamics during embryonic development by a combination of label-free proteomics approaches. Proteomics 16, 2068–2080 (2016).
Kim, D. I. & Roux, K. J. Filling the void: proximity-based labeling of proteins in living cells. Trends Cell Biol. 26, 804–817 (2016).
Strübbe, G. et al. Polycomb purification by in vivo biotinylation tagging reveals cohesin and Trithorax group proteins as interaction partners. Proc. Natl Acad. Sci. USA 108, 5572–5577 (2011).
Waaijers, S. et al. A tissue-specific protein purification approach in Caenorhabditis elegans identifies novel interaction partners of DLG-1/Discs large. BMC Biol. 14, 66 (2016).
Lambert, J.-P., Tucholska, M., Go, C., Knight, J. D. R. & Gingras, A.-C. Proximity biotinylation and affinity purification are complementary approaches for the interactome mapping of chromatin-associated protein complexes. J. Proteom. 118, 81–94 (2015).
Chen, C.-L. et al. Proteomic mapping in live Drosophila tissues using an engineered ascorbate peroxidase. Proc. Natl Acad. Sci. USA 112, 12093–12098 (2015).
Roux, K. J., Kim, D. I., Raida, M. & Burke, B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. J. Cell Biol. 196, 801–810 (2012).
Kim, D. I. et al. Probing nuclear pore complex architecture with proximity-dependent biotinylation. Proc. Natl Acad. Sci. USA 111, E2453–E2461 (2014).
Branon, T. C. et al. Efficient proximity labeling in living cells and organisms with TurboID. Nat. Biotechnol. 36, 880–887 (2018).
Dingar, D. et al. BioID identifies novel c-MYC interacting partners in cultured cells and xenograft tumors. J. Proteom. 118, 95–111 (2015).
Brand, A. H. & Perrimon, N. Targeted gene expression as a means of altering cell fates and generating dominant phenotypes. Development 118, 401–415 (1993).
Lewis, E. B. A gene complex controlling segmentation in Drosophila. Nature 276, 565–570 (1978).
Rogulja-Ortmann, A., Renner, S. & Technau, G. M. Antagonistic roles for Ultrabithorax and Antennapedia in regulating segment-specific apoptosis of differentiated motoneurons in the Drosophila embryonic central nervous system. Development 135, 3435–3445 (2008).
Graba, Y. et al. Homeotic control in Drosophila; the scabrous gene is an in vivo target of Ultrabithorax proteins. EMBO J. 11, 3375–3384 (1992).
Kazemian, M., Pham, H., Wolfe, S. A., Brodsky, M. H. & Sinha, S. Widespread evidence of cooperative DNA binding by transcription factors in Drosophila development. Nucleic Acids Res. 41, 8237–8252 (2013).
Shokri, L. et al. A comprehensive Drosophila melanogaster transcription factor interactome. Cell Rep. 27, 955–970.e7 (2019).
Yu, X., Lin, J., Zack, D. J. & Qian, J. Computational analysis of tissue-specific combinatorial gene regulation: predicting interaction between transcription factors in human tissues. Nucleic Acids Res. 34, 4925–4936 (2006).
Jin, H. et al. Genome-wide screens for in vivo tinman binding sites identify cardiac enhancers with diverse functional architectures. PLoS Genet. 9, e1003195 (2013).
Nevil, M., Bondra, E. R., Schulz, K. N., Kaplan, T. & Harrison, M. M. Stable binding of the conserved transcription factor grainy head to its target genes throughout Drosophila melanogaster development. Genetics 205, 605–620 (2017).
Azpiazu, N. & Frasch, M. tinman and bagpipe: two homeo box genes that determine cell fates in the dorsal mesoderm of Drosophila. Genes Dev. 7, 1325–1340 (1993).
Chan, S. K., Jaffe, L., Capovilla, M., Botas, J. & Mann, R. S. The DNA binding specificity of Ultrabithorax is modulated by cooperative interactions with extradenticle, another homeoprotein. Cell 78, 603–615 (1994).
Capovilla, M., Brandt, M. & Botas, J. Direct regulation of decapentaplegic by Ultrabithorax and its role in Drosophila midgut morphogenesis. Cell 76, 461–475 (1994).
Sun, B., Hursh, D. A., Jackson, D. & Beachy, P. A. Ultrabithorax protein is necessary but not sufficient for full activation of decapentaplegic expression in the visceral mesoderm. EMBO J. 14, 520–535 (1995).
Bodmer, R. The gene tinman is required for specification of the heart and visceral muscles in Drosophila. Development 118, 719–729 (1993).
Baëza, M. et al. Inhibitory activities of short linear motifs underlie Hox interactome specificity in vivo. eLife 4, e06034 (2015).
Hessinger, C., Technau, G. M. & Rogulja-Ortmann, A. The Drosophila Hox gene Ultrabithorax acts in both muscles and motoneurons to orchestrate formation of specific neuromuscular connections. Development 144, 139–150 (2017).
Michelson, A. M. Muscle pattern diversification in Drosophila is determined by the autonomous function of homeotic genes in the embryonic mesoderm. Development 120, 755–768 (1994).
Prokop, A., Bray, S., Harrison, E. & Technau, G. M. Homeotic regulation of segment-specific differences in neuroblast numbers and proliferation in the Drosophila central nervous system. Mech. Dev. 74, 99–110 (1998).
Monedero Cobeta, I., Salmani, B. Y. & Thor, S. Anterior-posterior gradient in neural stem and daughter cell proliferation governed by spatial and temporal Hox control. Curr. Biol. 27, 1161–1172 (2017).
Almeida, M. S. & Bray, S. J. Regulation of post-embryonic neuroblasts by Drosophila Grainyhead. Mech. Dev. 122, 1282–1293 (2005).
Karlsson, D., Baumgardt, M. & Thor, S. Segment-specific neuronal subtype specification by the integration of anteroposterior and temporal cues. PLoS Biol. 8, e1000368 (2010).
Cenci, C. & Gould, A. P. Drosophila Grainyhead specifies late programmes of neural proliferation by regulating the mitotic activity and Hox-dependent apoptosis of neuroblasts. Development 132, 3835–3845 (2005).
Oughtred, R. et al. The BioGRID interaction database: 2019 update. Nucleic Acids Res. 47, D529–D541 (2019).
Bischof, J. et al. Generation of a versatile BiFC ORFeome library for analyzing protein–protein interactions in live Drosophila. eLife 7, e38853 (2018).
Bondos, S. E., Tan, X.-X. & Matthews, K. S. Physical and genetic interactions link Hox function with diverse transcription factors and cell signaling proteins. Mol. Cell. Proteom. 5, 824–834 (2006).
Liu, Y., Matthews, K. S. & Bondos, S. E. Multiple intrinsically disordered sequences alter DNA binding by the homeodomain of the Drosophila Hox protein ultrabithorax. J. Biol. Chem. 283, 20874–20887 (2008).
Hsiao, H.-C. et al. The intrinsically disordered regions of the Drosophila melanogaster Hox protein ultrabithorax select interacting proteins based on partner topology. PLoS ONE 9, e108217 (2014).
Maiti, S. et al. Dynamic studies on intrinsically disordered regions of two paralogous transcription factors reveal rigid segments with important biological functions. J. Mol. Biol. 431, 1353–1369 (2019).
Darling, A. L. & Uversky, V. N. Intrinsic disorder and posttranslational modifications: the darker side of the biological dark matter. Front. Genet. 9, 158 (2018).
Draime, A., Bridoux, L., Graba, Y. & Rezsohazy, R. Post-translational modifications of HOX proteins, an underestimated issue. Int. J. Dev. Biol. 62, 733–744 (2018).
Gavis, E. R. & Hogness, D. S. Phosphorylation, expression and function of the Ultrabithorax protein family in Drosophila melanogaster. Development 112, 1077–1093 (1991).
de Almeida, S. F. & Carmo-Fonseca, M. Design principles of interconnections between chromatin and pre-mRNA splicing. Trends Biochem. Sci. 37, 248–253 (2012).
Naftelberg, S., Schor, I. E., Ast, G. & Kornblihtt, A. R. Regulation of alternative splicing through coupling with transcription and chromatin structure. Annu. Rev. Biochem. 84, 165–198 (2015).
Perino, M. & Veenstra, G. J. C. Chromatin control of developmental dynamics and plasticity. Dev. Cell 38, 610–620 (2016).
Oesterreich, F. C., Bieberstein, N. & Neugebauer, K. M. Pause locally, splice globally. Trends Cell Biol. 21, 328–335 (2011).
Allemand, E. et al. A broad set of chromatin factors influences splicing. PLoS Genet. 12, e1006318 (2016).
Tsai, A. et al. Nuclear microenvironments modulate transcription from low-affinity enhancers. eLife 6, e28975 (2017).
Boija, A. et al. Transcription factors activate genes through the phase-separation capacity of their activation domains. Cell 175, 1842–1855 (2018).
Hnisz, D., Shrinivas, K., Young, R. A., Chakraborty, A. K. & Sharp, P. A. A phase separation model for transcriptional control. Cell 169, 13–23 (2017).
Sabari, B. R. et al. Coactivator condensation at super-enhancers links phase separation and gene control. Science 361, eaar3958 (2018).
Rafiee, M.-R., Girardot, C., Sigismondo, G. & Krijgsveld, J. Expanding the circuitry of pluripotency by selective isolation of chromatin-associated proteins. Mol. Cell 64, 624–635 (2016).
Hughes, C. S. et al. Ultrasensitive proteome analysis using paramagnetic bead technology. Mol. Syst. Biol. 10, 757 (2014).
Hughes, C. S. et al. Single-pot, solid-phase-enhanced sample preparation for proteomics experiments. Nat. Protoc. 14, 68–85 (2019).
Cox, J. et al. Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ. Mol. Cell. Proteom. 13, 2513–2526 (2014).
Tyanova, S. et al. Visualization of LC-MS/MS proteomics data in MaxQuant. Proteomics 15, 1453–1456 (2015).
Tyanova, S., Temu, T. & Cox, J. The MaxQuant computational platform for mass spectrometry-based shotgun proteomics. Nat. Protoc. 11, 2301–2319 (2016).
Tyanova, S. et al. The Perseus computational platform for comprehensive analysis of (prote)omics data. Nat. Methods 13, 731–740 (2016).
von Mering, C. et al. STRING: known and predicted protein-protein associations, integrated and transferred across organisms. Nucleic Acids Res. 33, D433–D437 (2004).
Shannon, P. et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13, 2498–2504 (2003).
We thank the Bloomington Center for fly lines, the DSHB for antibodies, and the Drosophila Genomics Resource Center, supported by NIH grant 2P40OD010949, for plasmids. We further thank the following colleagues for sharing their materials: Muriel Boube (Med19 antibody), Julia Zeitlinger (Zelda antibody), William J. Brook (Cg antibody), Manfred Frasch (Tin antibody, tin346 mutant fly line and Tin cDNA construct used for further cloning), David Arnosti (CtBP antibody), Jurg Müller (Pc antibody), Andy Saurin (M1BP antibody) and Bill McGinnis (Grh antibody). We warmly thank Ana Rogulja-Ortmann for the dpp674-luciferase construct, the scabrous-GAL4 line and the Deadpan antibody (a generous gift from Jürgen Knoblich), and for providing her expertise on the nervous system during the revision. We are very grateful to everyone who helped to improve the manuscript, in particular Julien Bethune, Guido Grossmann, Jan Lohmann, Justin Crocker, Gislene Pereira and Pedro Pinto, who also provided strong support for microscopy acquisition. This project was supported in part by CellNetworks Cluster of Excellence (EXC81, J.K.) and DFG LO 844/8-1 (I.L.).
The authors declare no competing interests.
Peer review information Nature Communications thanks Erik Soderblom, Stefan Thor and the other, anonymous, reviewer for their contribution to the peer review of this work. Peer reviewer reports are available.
Carnesecchi, J., Sigismondo, G., Domsch, K. et al. Multi-level and lineage-specific interactomes of the Hox transcription factor Ubx contribute to its functional specificity. Nat Commun 11, 1388 (2020). https://doi.org/10.1038/s41467-020-15223-x