A prelude to the proximity interaction mapping of CXXC5

CXXC5 is a member of the zinc-finger CXXC family proteins that interact with unmodified CpG dinucleotides through a conserved ZF-CXXC domain. CXXC5 is involved in the modulation of gene expressions that lead to alterations in diverse cellular events. However, the underlying mechanism of CXXC5-modulated gene expressions remains unclear. Proteins perform their functions in a network of proteins whose identities and amounts change spatiotemporally in response to various stimuli in a lineage-specific manner. Since CXXC5 lacks an intrinsic transcription regulatory function or enzymatic activity but is a DNA binder, CXXC5 by interacting with proteins could act as a scaffold to establish a chromatin state restrictive or permissive for transcription. To initially address this, we utilized the proximity-dependent biotinylation approach. Proximity interaction partners of CXXC5 include DNA and chromatin modifiers, transcription factors/co-regulators, and RNA processors. Of these, CXXC5 through its CXXC domain interacted with EMD, MAZ, and MeCP2. Furthermore, an interplay between CXXC5 and MeCP2 was critical for a subset of CXXC5 target gene expressions. It appears that CXXC5 may act as a nucleation factor in modulating gene expressions. Providing a prelude for CXXC5 actions, our results could also contribute to a better understanding of CXXC5-mediated cellular processes in physiology and pathophysiology.

CXXC5 associated proteins in MCF7 cells. Based on the synthesis and intracellular location of the functional 3F-CXXC5-BirA*-HA fusion protein, we then carried out BioID assays in MCF7 cells. Cells were transiently transfected with the expression vector bearing none (EV), the BirA*-HA, or the Flag-CXXC5-BirA*-HA cDNA for 24 h. Cells were then treated in the absence or presence of 50 μM biotin and 1 mM ATP for 16 h. Biotinylated proteins in cell lysates were captured with streptavidin-conjugated magnetic beads. Protein fragments following on-bead tryptic proteolysis of the captured proteins were subjected to mass spectrometry (MS). www.nature.com/scientificreports/ Subtractive analyses of identified proteins from EV, BirA*-HA, and 3F-CXXC5-BirA*-HA synthesizing cells as two biological replicates with two technical repeats revealed 108 proximal interactors of CXXC5 (Supplementary Information, Table S1). It should be noted that none of the proteins we identified is coincident with reported protein partners of CXXC5 except for TNRC18 42 which was present in one of the biological replicates of our BioID experiments. Differences in protein profiles are likely due to dynamically changing protein abundances and protein complex compositions in distinct cell types. It is also possible that the genetic fusion of CXXC5 to BirA* altered the native conformational features of the protein, thereby modifying the in cellula protein interaction profile of the protein.
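The subtractive step described above can be pictured as a set operation on protein identifiers. The following is a minimal sketch, not the pipeline used for Table S1: the function name `subtract_controls`, the `min_replicates` threshold, and the toy identifiers are illustrative assumptions, and the actual filtering criteria (for example, how technical repeats were merged) are not specified here.

```python
from collections import Counter


def subtract_controls(bait_runs, control_runs, min_replicates=1):
    """Keep proteins found in the bait runs but in none of the control runs.

    bait_runs      -- list of sets of protein IDs, one set per bait replicate
    control_runs   -- list of sets of protein IDs from control samples
                      (e.g. empty vector and BirA* alone)
    min_replicates -- hypothetical threshold: minimum number of bait runs
                      in which a protein must appear to be kept
    """
    # Anything seen in any control run is treated as background.
    background = set().union(*control_runs)
    # Count in how many bait runs each protein was identified.
    counts = Counter(protein for run in bait_runs for protein in set(run))
    return {
        protein
        for protein, n in counts.items()
        if n >= min_replicates and protein not in background
    }


# Toy identifiers for illustration only; these are not the experimental data.
bait = [{"EMD", "MAZ", "ACTB"}, {"EMD", "MECP2", "ACTB"}]
controls = [{"ACTB"}, {"ACTB", "TUBB"}]
hits = subtract_controls(bait, controls)
```

With `min_replicates=1` a protein detected in a single biological replicate is retained, which mirrors the case of TNRC18 noted above; raising the threshold would give a stricter list.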
Gene ontology analyses for biological functions using the DAVID bioinformatics tool 43 suggest that CXXC5 interacts with proteins largely grouped in the regulation of gene expression, which can further be sub-grouped into proteins involved in DNA, chromatin, and RNA processes (Supplementary Information, Fig. S2). Proteins identified as transcription factors include ADNP, AP2A (TFAP2A), BCLAF1, CTCF, CUX1, DIDO1, ELF1, EMSY (C11orf30), GRHL2, LIN54, MAZ, NFIB, NFIX, NR2C2, RFX1, RREB1, SCML2, TRPS1, ZNF148, and ZNF638. Transcription co-regulatory proteins comprise CCAR2, HCFC1, LYRIC, MKL2, SNW1, SP110, TCF20, and TIF1B (TRIM28). DNA and chromatin modifiers include BAZ1A, CHD4, CHD8, COR1B, HMGX4, KDM2A (CXXC8), KMT2A (CXXC7), KMT2B (CXXC10), GATAD2A, GATAD2B, MeCP2, NSD2, RUVB1, SMCA5, SMHD1, TOP2A, TOP2B, TOX4, and XRCC6. The group of proteins involved in RNA processing, binding, and transport encompasses CPSF6, DDX21, DDX5, DHX9, DKC1, HNRPR, NAT10, NONO, NUCL (NCL), PAIRB, ROA2, SF3B2, SRRM1, TADBP, and THOC4. Also, the proximity interaction partners of CXXC5 include architectural proteins Emerin (EMD) as well as LAP2α and LAP2β, both of which are splice variants encoded by TMPO.
Figure legend (partial). Total protein extracts of MCF7 cells were subjected to SDS-10%PAGE followed by WB using the Flag, the HA, or the Biotin antibody followed by an HRP-conjugated goat-anti-mouse secondary antibody for the Flag antibody (Advansta R-05071-500) or a goat-anti-rabbit secondary antibody for the HA or the Biotin antibody (Advansta R-05072-500). Molecular masses (MM) in kDa are indicated.
Interaction of CXXC5 with EMD and MAZ. The initial validation of interactions between the putative interactors and the endogenous CXXC5 by immunoprecipitation in MCF7 cells proved to be difficult.
This was due to low levels of endogenous CXXC5 synthesis and the efficiency of the available antibodies from different resources for the immunoprecipitation of CXXC5 ( Supplementary Information, Fig. S3). To circumvent these problems, we used the Human Embryonic Kidney 293 (HEK293) cells that exhibit high transfection efficiency 44 . Initial screening of some of the CXXC5 proximity interacting partners identified with BioID by the use of transient transfections followed by co-immunoprecipitation (Co-IP) in HEK293 cells revealed that CXXC5 could interact, for example, with EMD, MAZ, and MeCP2 proteins but not with LAP2α (Thymopoietin; Lamina-Associated Polypeptide 2, Isoform alpha), RUVBL1 (RuvB Like AAA ATPase 1) or SNW1 (SNW Domain Containing 1) protein ( Supplementary Information, Fig. S4). Based on these observations, we selected EMD, MAZ, and MeCP2 to verify that they are indeed interacting protein partners of CXXC5. EMD, a member of the nuclear lamina-associated protein family 45 , is a serine-rich inner nuclear membrane protein with a predicted molecular mass (MM) of 28.9 kDa and is involved in the organization of chromatin structure, nuclear assembly, and gene expressions 46 . To validate the interaction between CXXC5 and EMD, we transiently transfected HEK293 cells with the expression vector bearing 3F-CXXC5 and/or HA-EMD cDNA for 48 h (Fig. 2). We also transiently transfected cells with expression vectors bearing cDNAs with converse tag sequences: 3F-EMD and/or HA-CXXC5 cDNA to ensure that the nature of tags does not alter the intracellular localization and/or interactions of the putative protein partners (data not shown). 3F-CXXC5 is primarily localized to the nucleus, whereas HA-EMD shows a nuclear membrane/periphery staining that partially overlaps with the staining of 3F-CXXC5 as well (Fig. 2a & Inset). 
HA-EMD or 3F-EMD, in transfected cells, shows three distinct protein species with discrete electrophoretic migration ranging from 35 to 55 kDa, in contrast to 3F-CXXC5 or HA-CXXC5, both of which display a single electrophoretic species with an MM of about 37 kDa (Fig. 2b,c). Although protein species of EMD were not studied further, they likely represent isoforms with differentially processed post-translational modifications [47][48][49] . Immunoprecipitation of nuclear extracts of transiently transfected cells that co-synthesize 3F-CXXC5 and HA-EMD using the HA antibody (Fig. 2d,e; Supplementary  Information, Fig. S5) together with protein A and G magnetic beads followed by immunoblotting with the Flag (Fig. 2d) or the HA (Fig. 2e) antibody indicates the presence of both EMD and CXXC5 in the immunoprecipitants. This demonstrates that CXXC5 and EMD interact. Interestingly, CXXC5 displayed an interaction only with the 35 kDa EMD species (Fig. 2d,e).
MAZ is a transcription factor with C2H2-type zinc finger motifs that can bind GC-rich promoters of target genes to control transcriptional processes 50 . We had carried out the initial screening of the CXXC5 interaction with MAZ using a MAZ cDNA that encodes an amino-terminally truncated variant with an estimated MM of 28 kDa (MAZ ΔN ), whereas the full-length MAZ is about 51 kDa (Fig. 3a). We observed in transiently transfected HEK293 cells that HA-MAZ and HA-MAZ ΔN display electrophoretic mobility of 55 and 33 kDa, respectively (Fig. 3b). In transiently co-transfected HEK293 cells, the nuclearly localized (Fig. 3c) and co-synthesized 3F-CXXC5 and HA-MAZ (Fig. 3d) or HA-MAZ ΔN (Fig. 3e) showed interactions, as 3F-CXXC5 was immunoprecipitated with the HA antibody (Fig. 3f) in the co-presence of HA-MAZ or HA-MAZ ΔN (Fig. 3g). These results indicate that MAZ, through the carboxyl-terminus, interacts with CXXC5.
Figure 2. Interaction of EMD and CXXC5. (a) To assess the intracellular localization of EMD and CXXC5 when co-synthesized, HEK293 cells were transiently transfected for 48 h with the expression vector bearing 3F-CXXC5 and HA-EMD cDNA. Cells were then subjected to ICC using the Flag (green channel) or the HA (red channel) antibody. DAPI was used for DNA staining. The scale bar is 10 μm. Inset indicates a section with higher magnification. (b,c) To examine the protein synthesis, HEK293 cells were transfected with the expression vector bearing (b) 3F-CXXC5 and/or HA-EMD cDNA; or (c) HA-CXXC5 and/or 3F-EMD cDNA for 48 h. The synthesis of proteins was assessed by WB using the HA or the Flag antibody. HDAC1 used as a loading control was probed with the HDAC1 antibody. Star denotes distinct EMD species. (d,e) The nuclear extracts (500 μg) of transiently co-transfected HEK293 cells were subjected to Co-IP with the HA antibody or the isotype-matched IgG. 50 μg of nuclear lysates was used as input control. The precipitates were subjected to SDS-10%PAGE followed by WB using the Flag (d) or the HA (e) antibody. Molecular masses (MM) in kDa are indicated.
Interaction of CXXC5 with MeCP2. As a member of the methyl-CpG binding protein family (MBP) with a conserved methyl-cytosine binding domain (MBD), MeCP2 with a predicted MM of 53 kDa binds to methylated CpG dinucleotides and unmethylated DNA 3 in contrast to CXXC5 which preferentially interacts with unmethylated CpG dinucleotide containing DNA 7,21 . The binding of MBPs to DNA as DNA methylation readers alters chromatin structure by recruiting chromatin remodelers and histone modifiers to modulate gene expressions 3 .
To examine the interaction of CXXC5 with MeCP2, HEK293 cells were transiently transfected with an expression vector bearing the 3F-CXXC5, or HA-CXXC5, and/or HA-MeCP2, or 3F-MeCP2, cDNA for 48 h. In HEK293 cells, HA-MeCP2, like 3F-CXXC5, localizes to the nucleus (Fig. 4a). Immunoblotting of nuclear extracts of transiently transfected HEK293 cells revealed that HA-MeCP2 or 3F-MeCP2 primarily displays electrophoretic mobility of about 80 kDa, as shown previously 51 , when synthesized alone or together with 3F-CXXC5 or HA-CXXC5 (Fig. 4b,c). The apparent MM of MeCP2, which is higher than the estimated MM, is likely due to different post-translational modifications of the protein 52 . Immunoprecipitation of 3F-CXXC5 or 3F-MeCP2 with the HA antibody (Fig. 4d) in the presence of HA-MeCP2 or HA-CXXC5 in immunoprecipitants, respectively (Fig. 4e), suggests that CXXC5 and MeCP2 are interacting partners. The interaction between the unmethylated CpG dinucleotide binder CXXC5 and the DNA methylation reader and histone modifier MeCP2 is perplexing and prompted us to explore the features of this interaction further. To extend the verification that CXXC5 and MeCP2 are interacting partners, we carried out the proximity ligation assay (PLA). The PLA used here utilizes species-specific secondary antibodies conjugated with distinct DNA primers. A hybridization step followed by circular DNA amplification with fluorescent probes to the conjugated DNA primers allows the visualization of proximity spots by fluorescence microscopy 53 . In transiently transfected HEK293 cells synthesizing HA-MeCP2 and/or 3F-CXXC5, the HA or the Flag antibody alone showed virtually no fluorescence signal in cells, whereas prominent nuclear fluorescence signals were detectable when cells were probed with both antibodies (Fig. 5a, Supplementary Information Fig. S6). This observation reinforces the conclusion that CXXC5 and MeCP2 are interacting partners.
Figure 3 (legend, partial). Nuclear extracts were subjected to SDS-10%PAGE followed by WB using the HA or the Flag antibody. (c) To assess the intracellular localization of the proteins, HEK293 cells were transiently transfected for 48 h with the expression vector bearing the 3F-CXXC5 and the HA-MAZ or HA-MAZ ΔN cDNA. Cells were then subjected to ICC using the Flag (green channel) or the HA (red channel) antibody. DAPI staining indicates the nucleus. The scale bar is 10 μm. (d,e) To assess the co-synthesis of proteins, the nuclear extracts of HEK293 cells, transiently transfected with the expression vector bearing 3F-CXXC5 and/or (d) the HA-MAZ or (e) the HA-MAZ ΔN cDNA, were subjected to WB analyses. Proteins were immunoblotted (IB) with the Flag or the HA antibody. HDAC1 used as a loading control was probed with the HDAC1 antibody. (f,g) The nuclear extracts, 500 µg, of transiently co-transfected HEK293 cells were subjected to Co-IP with the HA antibody or the isotype-matched IgG. 10% of nuclear lysate was used as input control. The precipitates were subjected to SDS-10%PAGE followed by WB using the Flag (f) or the HA (g) antibody. Molecular masses (MM) in kDa are indicated.
Moreover, ChIP of cell extracts from HEK293 cells transiently transfected with 3F-CXXC5 and HA-MeCP2 with the Flag antibody followed by immunoblotting using the HA antibody and re-probing with the Flag antibody suggests that CXXC5 and MeCP2 are co-present on chromatin as well (Fig. 5b).
Assessing sub-regions critical for CXXC5-MeCP2 interactions. Based on these results, we wanted to explore a sub-region(s) of CXXC5 critical for the interaction with MeCP2. We generated cDNAs encoding amino-terminally, carboxyl-terminally, or internally truncated CXXC5 proteins. Since CXXC5 localizes to the nucleus through a nuclear localization signal present at the immediate amino-terminus of the CXXC domain 28 , we inserted an exogenous nuclear localization signal (eNLS) derived from the SV40 T antigen 54 between the Flag epitope and the truncated CXXC5 variant to ensure that the variants lacking the CXXC domain, namely 3F-eCXXC5Δ 250-322 and 3F-eCXXC5Δ 1-100&250-322 , also localize to the nucleus (Fig. 6a). To examine the synthesis and intracellular location of CXXC5 variants, we performed WB and ICC on transiently transfected HEK293 cells using the Flag antibody. Results revealed that CXXC5 variants were synthesized at expected MMs (Fig. 6b) and were localized primarily to the nucleus (Fig. 6c). To assess a sub-region(s) of CXXC5 critical for MeCP2 interaction, HEK293 cells transiently co-transfected with the expression vector bearing a 3F-CXXC5 variant and the HA-MeCP2 cDNA for 48 h were subjected to Co-IP. The nuclear extracts were immunoprecipitated with the HA antibody and immunoblotted with the Flag antibody followed by re-probing with the HA antibody (Fig. 6d). Results showed that 3F-eCXXC5Δ 250-322 with the deleted CXXC domain was not detectable in the precipitants. Similarly, 3F-eCXXC5Δ 1-100&250-322 , lacking the first 100 amino acids at the amino-terminus together with the deleted CXXC domain, did not show an interaction either in immunoprecipitants or in cells as assessed with PLA (Supplementary Information Fig. S6).
On the other hand, the remaining CXXC5 variants, including the CXXC domain alone (3F-CXXC), were detectable with the Flag antibody in the HA-precipitated, but not in the IgG-precipitated, lysate. The sequential residues in the domain (threonine, glycine, histidine, and glutamine; TGHQ) are critical for the binding of the CXXC domain to DNA 23,32,55 . To assess whether DNA binding defective CXXC5 interacts with MeCP2, we generated full-length 3F-CXXC5 DBM and 3F-CXXC DBM mutants by converting the TGHQ sequence to AAAA. 3F-CXXC DBM , like the full-length 3F-CXXC5 DBM , retained the interaction with MeCP2. These results collectively suggest that the carboxyl-terminal CXXC domain of CXXC5 is the region required to interact with MeCP2, independently of its ability to bind DNA. It should be noted that although the CXXC domain is the region of CXXC5 required for protein interactions, the central region of CXXC5, aa 101-150, could contribute to the stability/affinity of protein interactions. We observed that the 3F-CXXC5Δ 100-149 variant, similar to the CXXC domain alone (3F-CXXC), consistently showed a lesser degree of interaction with MeCP2 compared to, for example, 3F-CXXC5, 3F-CXXC5Δ 150-199 , 3F-CXXC5Δ 200-249 , or 3F-CXXC5Δ 1-100 .
To examine whether we could also locate a region of MeCP2 involved in the interaction with CXXC5, we generated MeCP2 variants. An initial screening of MeCP2 variants suggested that the carboxyl-terminus of MeCP2 is involved in the interaction with CXXC5. A previous detailed study indicated that the carboxyl-terminus of MeCP2 is required for the interaction with FBP11, the protein product of PRPF40A (Pre-mRNA Processing Factor 40 Homolog A), and HYPC, the protein product of PRPF40B (Pre-mRNA Processing Factor 40 Homolog B) 56 , as the truncation of the carboxyl-terminal 86 amino acids containing the WW protein interaction domain of MeCP2, which also results from a frameshift mutation present in a group of Rett syndrome patients, abrogates interactions with FBP11 and HYPC. In transiently transfected HEK293 cells, the nuclearly localized (data not shown) HA-tagged, or Flag-tagged (data not shown), MeCP2 variant lacking 86 amino acids from the carboxyl terminus (MeCP2 1-400 ) (Fig. 7a) displays an MM of about 55 kDa (Δ) compared to the full-length (FL) HA-MeCP2, which exhibits an apparent MM of 80 kDa (Fig. 7b). Immunoprecipitation of HA-MeCP2 1-400 , or Flag-MeCP2 1-400 , with the HA antibody from nuclear extracts of HEK293 cells co-synthesizing 3F-CXXC5, or HA-CXXC5, revealed that the interaction of CXXC5 with MeCP2 1-400 decreases dramatically compared to HA-MeCP2 or Flag-MeCP2 (Fig. 7c,d; Supplementary Information Fig. S7c,d). These results suggest that the carboxyl-terminus of MeCP2 is a critical region for interaction with CXXC5 as well. Moreover, the CXXC domain of CXXC5, 3F-CXXC, appears to be sufficient for the interaction of CXXC5 with HA-MeCP2 but not HA-MeCP2 1-400 , as the HA antibody specifically immunoprecipitated 3F-CXXC from nuclear extracts of HEK293 cells co-synthesizing HA-MeCP2. Similarly, the immunoprecipitation of 3F-CXXC with the HA antibody from cellular extracts of HEK293 cells co-synthesizing HA-EMD or HA-MAZ (Supplementary Information Fig. S8a,b) further suggests that the CXXC domain of CXXC5 is also a critical region for the interaction with EMD or MAZ. These observations collectively suggest that the CXXC domain as the DNA binding module of CXXC5 also participates in protein interactions, supporting previous findings that CXXC5, through the CXXC domain, interacts with Dvl1 57 , HDAC1 (Histone deacetylase 1) 12 , and functionally associates with ATM (ATM Serine/Threonine Kinase) 28 .
Figure 5 (legend, partial). (a) To assess the in cellula interaction of CXXC5 and MeCP2, HEK293 cells grown on coverslips were transiently co-transfected with the expression vector bearing the 3F-CXXC5 or HA-MeCP2 cDNA. Cells were fixed, permeabilized, blocked, and probed with the HA and/or the Flag antibody. Cells were then subjected to fluorescent probes for circular DNA amplification. DAPI was used for nuclear staining. The scale bar is 25 µm. (b) Chromatin immunoprecipitation assay (ChIP)-WB. Co-transfected cells were also subjected to ChIP using the Flag antibody or the isotype-matched IgG followed by immunoblotting using the HA antibody. The membrane was also re-probed with the Flag antibody. HC and LC indicate the heavy and light chain of IgG. 10% of ChIP was used as input control.
Assessing the possible interplay between CXXC5 and MeCP2 in gene expressions. Our findings that CXXC5 and MeCP2 are co-present on chromatin imply an interplay between these proteins that could modulate gene expressions. We previously reported that CXXC5 is involved in the expression of HDAC11 (Histone deacetylase 11), NFKBIZ (NF-kappa-B inhibitor zeta), or IL12A (Interleukin-12 subunit alpha) in MCF7 cells 21 as similarly reported for the IL12A gene in plasmacytoid dendritic cells 23 . To explore whether the alterations in the extent of gene expression is due to the engagement of CXXC5 with the promoter region of HDAC11, NFKBIZ, or IL12A, we initially carried out bioinformatics analyses using the Eukaryotic Promoter Database 58,59 , which contains resources of eukaryotic RNA polymerase II promoters with experimentally defined transcription start sites, to locate the promoter region of HDAC11, NFKBIZ or IL12A. Results indicate that the promoter of HDAC11 or NFKBIZ, as reported previously 60,61 , or IL12A is located within a CpG island (Supplementary Information Fig. S9). We also performed bioinformatics analyses using the Cistrome Data Browser which utilizes publicly available ChIP-chip and ChIP-seq datasets for genome-wide locations of transcription factor binding from various biological resources 62,63 , to assess whether CXXC5 is enriched at the promoter region of HDAC11, NFKBIZ, or IL12A. We found no CXXC5 dataset generated with the use of human resources at the Cistrome Data Browser. However, our analysis of a recent ChIP-Seq dataset carried out with the use of mouse embryonic stem cells 22 suggests, at least in one of two replicates of ChIP-seq results, that CXXC5 may interact with the promoter region of IL12A. 
Although there is no MeCP2 ChIP-Seq dataset generated with the use of MCF7 cells, analyses with datasets from IMR-90, a human lung fibroblast cell line, and HCT-116 cells derived from human colon carcinoma revealed that MeCP2 could associate with the promoter region of HDAC11, NFKBIZ, or IL12A (Supplementary Information Fig. S9).
Based on these analyses, we assessed the possible association of CXXC5 or MeCP2 with the promoter region of HDAC11, NFKBIZ, or IL12A in MCF7 cells. We also used as a negative control Exon10 of CXXC5, which is [...] (Supplementary Information Fig. S10a).
Figure 6 (legend, partial). (c) Transiently transfected HEK293 cells were subjected to ICC using the Flag antibody followed by Alexa Fluor 488 conjugated secondary antibody for visualization with a fluorescence microscope. DAPI was used for the nuclei staining. (d) HEK293 cells were transiently co-transfected with the expression vector bearing cDNA for a 3F-CXXC5 variant and HA-MeCP2. Nuclear extracts (500 µg) were subjected to Co-IP with the HA antibody or the isotype-matched IgG. The precipitates were subjected to SDS-15%PAGE followed by WB using the Flag antibody or the HA antibody. 10% of nuclear extracts was used as input control.
Our analyses for the simultaneous presence of CXXC5 and MeCP2 with the promoters of genes we tested by sequential-ChIP from cells co-synthesizing 3F-CXXC5 and HA-MeCP2 were inconclusive. However, ChIP-qPCRs with the use of equal aliquots of extracts from cells synthesizing 3F-CXXC5 and HA-MeCP2 with the Flag or the HA antibody suggested the presence of CXXC5 and MeCP2 on the promoter region of HDAC11, NFKBIZ, or IL12A (data not shown), as we similarly observed in cells transfected with 3F-CXXC5 or HA-MeCP2 alone ( Supplementary Information Fig. S10a).
Figure 7 (legend, partial). (c,d) HEK293 cells were transiently co-transfected with the expression vector bearing cDNA for 3F-CXXC5 and HA-MeCP2 or HA-MeCP2 1-400 . Nuclear extracts were subjected to Co-IP with the HA antibody or the isotype-matched IgG. The precipitates were subjected to WB using the Flag antibody. The membrane was re-probed with the HA antibody. 10% of nuclear extracts was used as input control. Molecular masses (MM) in kDa are indicated. (e) Nuclear extracts of HEK293 cells transiently co-transfected with an expression vector bearing cDNA for the 3F-CXXC domain (3F-CXXC), HA-MeCP2, or HA-MeCP2 1-400 , were subjected to WB using the Flag, HA or HDAC1 antibody. Molecular masses (MM) in kDa are indicated. (f) Nuclear extracts, 500 µg, of cells co-synthesizing 3F-CXXC and HA-MeCP2, or HA-MeCP2 1-400 , were subjected to Co-IP with the HA antibody or the isotype-matched IgG. The precipitates were subjected to WB using the Flag antibody. The membrane was also re-probed with the HA antibody. 10% of nuclear extracts was used as input control. Molecular masses (MM) in kDa are indicated.
These results imply that an interplay between CXXC5 and MeCP2 could contribute to the expression of HDAC11, NFKBIZ, or IL12A. To test this possibility, we examined the alterations in the expression of HDAC11, NFKBIZ, or IL12A in MCF7 cells transiently transfected with a control siRNA (CtS), a siRNA specific to CXXC5 (#10), and/or a siRNA pool specific to MeCP2 (Me-siR). Results revealed that CtS did not affect the transcript or protein levels of CXXC5 or MeCP2. #10 specifically repressed the transcript and protein levels of CXXC5 without affecting those of MeCP2 (Fig. 8a,b; Supplementary Information, Fig. S11). Re-probing the membrane with an antibody specific to MeCP2, which also immunoprecipitates endogenous MeCP2 in MCF7 cells (Supplementary Information, Fig. S12), indicated that Me-siR specifically suppressed the transcript and protein levels of MeCP2. Co-transfection of #10 and Me-siR led to suppression of the transcript and protein levels of both CXXC5 and MeCP2 (Fig. 8a,b; Supplementary Information, Fig. S11). Repression of CXXC5 and MeCP2 protein levels resulted in the alteration of HDAC11, NFKBIZ, or IL12A expression. Transfection of cells with #10 alone attenuated the expression of HDAC11 and IL12A but augmented the NFKBIZ expression. Me-siR alone, on the other hand, did not affect the expression of HDAC11, IL12A, or NFKBIZ. When co-transfected, however, Me-siR counteracted the repressive effect of #10 on HDAC11 expression, and both prevented the #10-mediated decrease in and further augmented the expression of IL12A, without affecting the transcript levels of NFKBIZ enhanced by #10 (Fig. 8c). To assess whether modulations in gene expressions were due to changes in the extent of MeCP2 interactions with the promoter region of HDAC11, NFKBIZ, or IL12A, we carried out ChIP-qPCR from MCF7 cells transfected with CtS or siRNA#10 for 48 h. Cells were then subjected to ChIP using the MeCP2 antibody or an isotype-matched IgG. Results, depicted as fold changes compared to CtS following normalization to IgG, revealed that in cells transfected with siRNA#10 the association of MeCP2 with the promoter regions of HDAC11 and IL12A increases, whereas it does not change at the NFKBIZ promoter (Fig. 8d) or Exon10 of CXXC5 as the control (Supplementary Information, Fig. S10b).
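The phrase "fold changes compared to CtS following normalization to IgG" can be illustrated with a small calculation. The sketch below assumes a conventional ΔΔCt-style treatment with 100% amplification efficiency (a factor of 2 per cycle); this is one common way to express ChIP-qPCR results and is not necessarily the exact computation behind Fig. 8d. The function names and the Ct values are hypothetical.

```python
def fold_over_igg(ct_ip, ct_igg):
    """Enrichment of the specific IP over the IgG control.

    Assumes perfect doubling per cycle, so a Ct that is one cycle lower
    corresponds to twice as much template.
    """
    return 2.0 ** (ct_igg - ct_ip)


def fold_change_vs_control(ct_ip_treated, ct_igg_treated,
                           ct_ip_control, ct_igg_control):
    """IgG-normalized enrichment in treated cells relative to control cells."""
    treated = fold_over_igg(ct_ip_treated, ct_igg_treated)
    control = fold_over_igg(ct_ip_control, ct_igg_control)
    return treated / control


# Hypothetical Ct values: siRNA-treated cells versus control-siRNA cells.
change = fold_change_vs_control(
    ct_ip_treated=24.0, ct_igg_treated=30.0,
    ct_ip_control=25.0, ct_igg_control=30.0,
)
```

In this toy case the antibody signal arrives one cycle earlier in the treated sample while the IgG background is unchanged, so the normalized association doubles. A value near 1 would correspond to no change, as reported for the NFKBIZ promoter.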
These results collectively suggest that an interplay between CXXC5 and MeCP2 at promoter regions contributes to the modulation of a subset of CXXC5 target gene expressions. We also observed that the methylation state of the promoter region of HDAC11, NFKBIZ, or IL12A is not affected by the reduction of CXXC5 levels ( Supplementary Information Fig. S13b). This implies that modulation of a subset of gene expressions by the integrated effects of CXXC5 and MeCP2 may not require changes in DNA methylation.
Correlation analysis between mRNA expressions of CXXC5 and MeCP2 in breast cancer patients. To assess whether there is a correlation between CXXC5 and MeCP2 expression in clinical settings, particularly in breast cancer patients, that could highlight the importance of our observations, we used the GEPIA (Gene Expression Profiling Interactive Analysis) webserver for the expression analyses of CXXC5 and MeCP2 based on tumor and paired normal tissue samples from the TCGA and healthy tissue samples from the GTEx databases 65 . We found that the mean expression of CXXC5 is higher in breast tumor samples than in healthy mammary tissue together with paired normal breast tissue samples (Supplementary Information Fig. S14a), while the mean expression of the MeCP2 gene does not show significant variations between breast tumor and paired normal tissue samples (Supplementary Information Fig. S14b). We also found that, although CXXC5 expression is significantly deregulated, there is no strong correlation between CXXC5 and MeCP2 expression in healthy mammary tissue, paired normal breast samples, or breast tumors (Supplementary Information Fig. S14c-e), nor in tumors of many other tissues, including acute myeloid leukemia, brain lower grade glioma, glioblastoma multiforme, sarcoma, prostate adenocarcinoma, and testicular germ cell carcinoma (Supplementary Information, Fig. S15).
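For readers who wish to reproduce this kind of pairwise comparison offline, the calculation reduces to a correlation coefficient between two expression vectors. The sketch below assumes Pearson correlation on log2(TPM + 1) values, which may differ from the settings chosen on the webserver; the helper names and the numbers are illustrative only and are not patient data.

```python
import math


def log2_tpm(values):
    """Transform TPM values to log2(TPM + 1)."""
    return [math.log2(v + 1) for v in values]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Made-up TPM values for two genes across five samples.
gene_a = log2_tpm([10.0, 40.0, 25.0, 80.0, 5.0])
gene_b = log2_tpm([30.0, 28.0, 35.0, 31.0, 29.0])
r = pearson_r(gene_a, gene_b)
```

A coefficient close to 0 indicates no linear relationship, whereas values approaching 1 or -1 indicate strong positive or negative correlation. What counts as "strong" is a matter of convention and sample size.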

Discussion
Our findings indicate that the proximity interaction partners of CXXC5 identified with BioID in MCF7 cells encompass proteins involved in DNA structural changes, DNA modifications, chromatin remodeling, chromatin/histone modifications, and RNA processing as well as transcription factors and transcription co-regulatory proteins. Since CXXC5 lacks an intrinsic enzymatic activity or a transcription regulatory function but is a preferential unmethylated CpG dinucleotide binder through its CXXC domain 4,21,66,67 , our results together with the previous studies 12, [22][23][24]26,27,[29][30][31][33][34][35][36] suggest that CXXC5 could act as a molecular scaffold for the regulation of gene expressions leading to cellular proliferation, differentiation, and death in a cell context-dependent manner [11][12][13]15,21,22,24,[28][29][30] . This prediction presupposes the binding of CXXC5 to unmethylated DNA. However, it is also likely that CXXC5 without interacting with DNA associates directly or indirectly as a part of a protein complex with various transcription factors and/or DNA/chromatin binders to modulate gene expressions. Previous studies 12,37 , as we find here, also indicate that the DNA binding feature of the CXXC domain is independent of the ability to interact with protein partners, as the DNA binding CXXC mutants retain interactions with Dvl1 37 or HDAC1 12 . These findings indicate bi-functionality for the CXXC domain of CXXC5: DNA and protein interactions. Moreover, the interaction of CXXC5 with KMT1A (SUV39H1) was reported to occur through a central region (aa 101-200) of CXXC5. 
This, together with our observation that the efficiency of interactions of the CXXC domain with protein partners could be modulated by a centrally located region (aa 100-150) of CXXC5, suggests that in addition to the CXXC domain, CXXC5 may contain distinct protein interaction regions and/or surfaces emerging from inter/intra-molecular allosteric interactions that could control associations or interaction affinities with protein partners. This, in turn, implies that dynamic conformational fluctuations of CXXC5, which has a highly disordered amino-terminal region 21 , resulting from DNA binding, protein-protein interactions, and/or post-translational modifications in a cellular environment are critical for diverse functional features of the protein in a signal- and cell type-specific manner.
Nuclear lamina (NL) is an interwoven structure composed of lamins and lamin-associated proteins of the inner nuclear membrane (INM). NL and proteins resident in the INM form a dynamic network that regulates chromatin organization, cell cycle regulation, DNA replication, DNA repair, cell differentiation, and death 68,69 . The NL is composed of Lamin A and C, which are the two major splice variants of a single gene (LMNA), as well as Lamin B1 and B2 encoded by LMNB1 and LMNB2, respectively 68,69 . Lamins interact with chromatin either directly or indirectly through chromatin-binding proteins. Lamin B1/B2 interacts with LBR (Lamin B Receptor) and HP1 (heterochromatin protein 1) associated with heterochromatin. On the other hand, Lamin A/C interacts with members of the LEM-Domain family (LEMD) proteins, including LAP2A/B and EMD through a nucleoplasmic adaptor protein BANF1 (barrier-to-autointegration factor, BAF), which binds to histones and DNA associated with both heterochromatin and euchromatin. We observed here that the proximity interaction partners of CXXC5 include EMD and LAP2A. We found that CXXC5 interacts with EMD but not with LAP2A, which resides in the nucleoplasm rather than INM due to the lack of a membrane-spanning domain present  Supplementary Information, Fig. S16). It is therefore plausible that CXXC5 as an unmethylated CpG binder could anchor DNA to the INM through interactions with EMD, thereby providing a local chromatin environment critical for the modulation of target gene expressions in a nuclear activity-dependent manner. Similarly, the interaction of CXXC5 with MAZ, a transcription factor with six Cys2His2-type zinc finger motifs at the carboxyl-terminus that binds to the permutation of the canonical GGG AGG G DNA sequence, could be critical for reciprocal recruitment, or assisted loading, to regulatory sites of target genes. 
This could result in coordinated recruitment of coregulatory proteins, including epigenetic factors, exemplified by the nucleosome-remodeling factor (NURF) subunit BPTF as a proximity interaction partner of CXXC5 (Supplementary Information, Table S1) and an interactor of MAZ 70 , for the regulation of target gene expressions.
DNA methylation is one of the mechanisms of gene silencing and it mostly occurs in CpG dinucleotides of the genome. The effect of DNA methylation on gene expression could be refractory and/or permissive, depending on the genomic region. DNA methylation at promoters represses transcription of genes, while DNA methylation of intra/intergenic regions with different degrees of CpG density appears to correlate with gene expression 71 . DNA methylation is intimately associated with histone modifications, leading to integrated epigenetic processes for the alteration of chromatin architecture and the subsequent modulation of gene expressions. Methylated DNA is specifically recognized by MBPs, which belong to three distinct structural families: the Methyl-binding domain (MBD), the Methyl-CpG binding zinc fingers, and the SRA (SET- and RING-associated) domain proteins 72 . As a member of the MBD family, MeCP2 functions as a genome-wide transcriptional modulator. Mutations in the X-linked MeCP2 gene lead to a severe neurodevelopmental disorder, Rett syndrome 73 . MeCP2 readily binds both methylated and unmethylated DNA. MeCP2 through the MBD domain interacts with methylated/hemimethylated CG dinucleotides as well as methylated/hemimethylated CAC tri-nucleotides. MeCP2 also interacts with methylated cytosines in the non-CG context (mCH, where H = A, C, or T) as well as nucleosomes 74-82 . The MBD domain of MeCP2 also binds to unmethylated 5'-CAC/GTG-3' motif-containing DNA with an affinity comparable to methylated DNA 74 . In addition, the ID, TRD, and CTD domains of MeCP2 exhibit methylation-independent DNA binding capabilities 83 , and AT-hook-like domains within the ID, TRD, and CTD alpha domains of MeCP2 bind to the minor groove of AT-rich DNA 84 .
These methylation-dependent and independent DNA binding capabilities, together with the ability of MeCP2 to bind to DNA cooperatively and to induce DNA bridging and looping 83,85,86 , are suggested to allow MeCP2 to interact with different sites on DNA simultaneously, thereby contributing to genome-wide chromatin organization 52,87 . Upon binding to DNA/histones at gene regulatory regions, MeCP2 could hinder the binding of transcription factors to cognate response elements directly, or indirectly through sequential and ordered recruitment of distinct members of the chromatin remodeling complexes, to generate a chromatin state refractory for transcription. Moreover, MeCP2 suppresses transcription by binding to methylated cytosine within transcribed regions of gene bodies, thereby impeding transcriptional elongation 88 . Besides transcription attenuation/repression, MeCP2 also functions as an activator/enhancer of gene expressions by engaging transcriptionally active promoters 81 and recruiting coactivator proteins, including CREB1 (cAMP responsive element binding protein 1) 82 and MYCN (MYCN Proto-Oncogene, BHLH Transcription Factor) 89 .
We found here that CXXC5 interacts with MeCP2, and the interaction involves the carboxyl-terminal CXXC domain of CXXC5 independently of its ability to bind to DNA and the carboxyl-terminus of the WW domain (CTD β domain) of MeCP2. We observed that CXXC5 and MeCP2 are associated with the promoter region, located within a CGI, of the HDAC11, IL12A, or NFKBIZ gene. Furthermore, we observed that an interplay between CXXC5 and MeCP2 at HDAC11, IL12A, and NFKBIZ promoter regions is critical for the magnitude of gene expressions independently of DNA methylation. These observations imply that the integrated effects of CXXC5 and MeCP2 are important for the transcriptional output of some of the CXXC5 target genes. However, underlying mechanistic features of these integrated effects remain, at this juncture, speculative. The interaction of the DNA-bound CXXC5 with MeCP2 may lead to the recruitment of chromatin remodelers and/or co-regulatory proteins to fine-tune, augment or repress, the expression of a subset of target genes. One anticipated result would then be that a decrease in the association of CXXC5 with DNA, when CXXC5 is knocked down, reduces the extent of DNA association of MeCP2. In contrast, we observed an increased presence of MeCP2 at the promoter regions of HDAC11 and IL12A but not at the NFKBIZ gene promoter. This suggests that the binding of CXXC5 to DNA restricts the DNA association of MeCP2 at HDAC11 and IL12A promoters independent of CXXC5 interactions, thereby increasing the repressive potential of MeCP2 for transcription when the CXXC5 levels are reduced. It is, therefore, possible that the DNA-bound CXXC5, independently of MeCP2, is involved in the magnitude of the gene expression through interactions with various chromatin remodelers and/or transcription co-regulators, or by preventing the binding of various transcription factors to cognate response elements at the target gene promoters, to establish a transcription state.
It is also possible that the recruitment of MeCP2 by the DNA-associated CXXC5 or the interaction of CXXC5 with the DNA/histone-bound MeCP2 constrains the modulatory effects of MeCP2 for gene expressions by sterically blocking the ability of either interaction partner to recruit co-regulatory complexes. This could, when the levels of both proteins are reduced, result in no change, as observed with HDAC11, or in augmentation by causing accessibility to other TFs, as in the case of IL12A, in the level of gene expression. We also observed that the knockdown of CXXC5 leads to an enhanced NFKBIZ expression without affecting the extent of the association of MeCP2 with the NFKBIZ promoter. The repression of MeCP2 levels, on the other hand, did not affect the NFKBIZ expression. This suggests that the DNA-associated CXXC5 recruits, or assists the loading of, MeCP2 to repress the NFKBIZ expression. But we also observed that the expression of NFKBIZ remains elevated when both CXXC5 and MeCP2 protein levels are suppressed. This raises the possibility that the binding of various TFs to exposed cognate binding sites as a result of reduced levels of CXXC5, and thereby the associated MeCP2, increases the NFKBIZ expression. Deciphering possibilities involved in target gene expressions could further shed light on the function of CXXC5 in cellular events. In summary, the proximity interaction analysis we employed here indicates that although it lacks an intrinsic transcription regulatory activity or an enzymatic function, CXXC5 as an unmethylated CpG binder interacts with various DNA/chromatin modelers and transcription factors/co-regulators and is involved in the modulation of target gene expressions.
While constituting a prelude for the identification of interaction partners of CXXC5, our findings here provide a basis for a better understanding of the regulatory mechanisms of CXXC5-mediated cellular processes in response to intrinsic and extrinsic stimuli in a lineage-specific manner in both physiology and pathophysiology that could offer clinical benefits.

Materials and methods
Cell culture and transfection. MCF7 and HEK293 cells were cultured in phenol red-free, high glucose (4.5 g/L) containing Dulbecco's modified Eagle's medium (DMEM, Lonza, Belgium, BE12-917F) supplemented with 10% fetal bovine serum (FBS, Lonza), 1% L-Glutamine (Lonza, BE17-605E) and 1% Penicillin/Streptomycin (Lonza, Belgium) as described previously 19,21,90 . MCF7 or HEK293 cells were transiently transfected with Turbofect transfection reagent (R0533; ThermoFisher, Waltham, MA, USA) for 48 h if not otherwise specified. Protein concentrations in extracts were assessed with a Bradford protein assay kit (Bio-Rad Life Sciences; 5000001). Restriction and DNA modifying enzymes were obtained from New England Biolabs (Beverly, MA, USA) or ThermoFisher. Chemicals were obtained from Sigma-Aldrich (Germany) or ThermoFisher. Pageruler Prestained Protein Ladder (ThermoFisher; 26616) or Pageruler Plus Prestained Protein Ladder (ThermoFisher; 26620) was used as the molecular mass (MM) marker. In all PCR-based approaches, at least two distinct primer sets, and their combinations, designed with the PrimerQuest Tool of Integrated DNA Technologies (IDT; https://www.idtdna.com/pages/tools/primerquest) were initially tested for their efficiencies in amplifying template DNA (genomic, cDNA, or plasmid) under various conditions, including varying temperatures without or with various DMSO concentrations. Based on PCR results, the primer set giving the best amplification efficiency was used in experiments and is reported in Supplementary Information, Table S2.

Generation and functional analyses of protein components of BioID.
To generate protein components of BioID, the 3xFlag-CXXC5 (3F-CXXC5) cDNA obtained with PCR using the wild-type CXXC5 cDNA 19 as the template was genetically fused to the 5' end of a sequence encoding the BirA*-HA cDNA present in the pcDNA expression vector, pcDNA3.1-BirA*(R118G)-HA, obtained from Addgene (36047). To generate BirA*(R118G)-HA cDNA with the translation initiation codon encoding methionine within the context of the Kozak sequence (CGC CAT G), we used PCR with primers and BirA*(R118G)-HA cDNA as the template. The cDNA was then cloned into the pcDNA3.1 vector and sequenced to ensure the fidelity of the encoding sequences. To assess the synthesis and intracellular location of the protein components of BioID, we carried out western blot (WB) and immunocytochemistry (ICC) analyses in transiently transfected MCF7 cells derived from a breast adenocarcinoma. The expression vector pcDNA3.1 bearing none, the BirA*-HA, the 3F-CXXC5, or the 3F-CXXC5-BirA*-HA cDNA was transiently transfected into MCF7 cells for 24 h. Cells were then treated without or with 50 μM biotin (Sigma-Aldrich; B4639) and 1 mM ATP (Adenosine 5'-triphosphate disodium salt hydrate, Sigma-Aldrich; A2283) for 16 h followed by ICC and WB.
For ICC, MCF7 cells (5 × 10⁴) grown on coverslips in 12-well tissue culture plates were transiently transfected for 48 h. Cells were then washed with PBS and fixed with 3.7% formaldehyde in PBS for 30 min. The cells were permeabilized with 0.4% Triton-X-100 in PBS for 10 min and then incubated for 1 h with 10% normal goat serum in PBS (for the HA or the Biotin antibody) or 10% BSA in PBS (for the Flag antibody) to block non-specific antibody binding. Cells were then incubated with the HA (Abcam, ab9119; 1:500 dilution), the Flag (Sigma Aldrich, F-1804; 1:250 dilution), or the Biotin (Abcam ab53494; 1:100 dilution) antibody in the corresponding blocking buffer for 2 h. Cells were subsequently washed with PBS and incubated with a goat anti-mouse IgG H&L (Alexa Fluor 488; green-fluorescent dye; ab150113; 1:1000 dilution) for the Flag antibody, or a goat anti-rabbit IgG H&L (Alexa Fluor 594; red-fluorescent dye; ab150077; 1:1000 dilution) for the HA antibody or the Biotin antibody at room temperature for 1 h. The cells were rinsed in PBS and mounted onto glass slides with a mounting medium containing DAPI (4',6-diamidino-2-phenylindole; blue-fluorescent stain) for nuclear staining (Abcam, ab104139). Images were viewed and captured with the Nikon Eclipse 50i Fluorescence Microscope.

To identify a sub-region(s) of CXXC5, we generated cDNAs encoding amino and/or carboxyl-terminally truncated CXXC5 proteins. To ensure that CXXC5 variant proteins lacking the nuclear localization signal, which is located at the carboxyl-terminal CXXC domain, localize to the nucleus as the full-length CXXC5 does, we inserted sequences generated by PCR using the CXXC5 cDNA as the template to encode an NLS derived from the SV40 T antigen 54 between sequences encoding the Flag epitope and a CXXC5 variant.

Validation of interaction partners of CXXC5.
All constructs were sequenced for the fidelity of encoding sequences. Tag and primer sequences are given in Supplementary Information, Table S2.
ICC. HEK293 cells (2.5 × 10⁴) plated on coverslips in a well of 12-well tissue culture plates were grown for 48 h. Cells were then transiently transfected with expression vectors bearing the 3F-CXXC5 (or HA-CXXC5) cDNA alone or together with the HA- or 3F-tagged EMD, MAZ, or MeCP2 cDNA using Turbofect transfection reagent (Thermo Scientific, R0532). At 48 h after transfection, cells were processed for ICC as described above using the Flag and/or HA antibodies followed by Alexa Fluor conjugated secondary antibodies for visualization with a Nikon Eclipse 50i Fluorescence Microscope. ImageJ software was used for image analysis.
WB. HEK293 cells (15 × 10⁴) plated on six-well tissue culture plates for 48 h were transiently transfected with the expression vector bearing the 3F-CXXC5 (or HA-CXXC5) cDNA alone or together with the HA- or 3F-tagged EMD, MAZ, or MeCP2 cDNA using Turbofect transfection reagent (Thermo Scientific, R0532). Cells were then processed for WB as described above.
Co-Immunoprecipitation (Co-IP). Transiently transfected HEK293 cells in six-well plates were collected with trypsin and lysed with NE-PER (ThermoFisher; 78833) that contained freshly added protease and phosphatase inhibitors. The protein concentration of lysates was measured by the Bradford Protein Assay. To block nonspecific protein binding to magnetic beads, 500 μg lysates were incubated with non-specific IgG (5 μg) together with 25 μl Protein A and G conjugated magnetic beads at 4 °C for 1 h. Lysates in 1.5 ml centrifuge tubes were then applied to a magnetic field for 30 s to pull the beads to the side of the tube. Supernatants were transferred to a clean 1.5 ml microcentrifuge tube and beads were discarded. The pre-cleared lysates were subsequently incubated with 5 µg of the HA or Flag antibody at 4 °C overnight, followed by the addition of 25

For ChIP-WB, immunoprecipitates following washes were directly dissolved in 40 µl of 6x Laemmli buffer (375 mM Tris-HCl pH 6.8, 6% SDS, 4.8% Glycerol, 9% 2-Mercaptoethanol, 0.03% Bromophenol blue) and were boiled for 10 min. Beads were removed with a magnetic stand and the supernatants were subjected to SDS-8%PAGE followed by WB.
For ChIP-qPCR, immunoprecipitates after washes were resuspended in a ChIP elution buffer (10 mM Tris-HCl pH 8.0, 1 mM EDTA) followed by the addition of NaCl (to a final concentration of 300 mM) to de-crosslink protein-DNA interactions. Samples were then treated with RNase at 37 °C for 1 h, followed by Proteinase K (10 mg/ml) at 65 °C for 4 h. DNA was then recovered with phenol:chloroform:isoamyl alcohol (25:24:1) followed by ethanol precipitation of DNA for qPCR. We also carried out ChIP for endogenous MeCP2 using a MeCP2-specific antibody (Proteintech Group, Inc., Rosemont, IL, USA; 10861-1-AP) in MCF7 cells transiently transfected with 10 nM of a scrambled siRNA (AllStar, CtS) for 48 h. For ChIP-WB, samples from ChIP carried out as described above were resuspended in 40 µl of 2x Laemmli-SDS buffer and incubated at 95 °C for 10 min. Supernatants were subjected to SDS-8%PAGE for WB analysis using the HA antibody. For ChIP-qPCR, following de-crosslinking and protein digestion, DNA was recovered with phenol:chloroform:isoamyl alcohol (25:24:1) followed by ethanol precipitation and was subjected to qPCR using primers specific for the promoter region of HDAC11, NFKBIZ, IL12A, or Exon10 of CXXC5 as a negative control (Supplementary Information, Table S2).
qPCR results were normalized using percent (%) of input approach 90 and depicted as fold changes compared to CtS following normalization to IgG.
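The normalization described above can be illustrated with a minimal Python sketch of the commonly used percent-of-input calculation. The function names and the 1% input fraction are assumptions made for illustration only; the fraction of chromatin saved as input is not stated here, and the sketch assumes a PCR efficiency of about 100%.

```python
import math


def percent_input(ct_ip, ct_input, input_fraction=0.01):
    """Percent of input recovered in an immunoprecipitate.

    input_fraction is the share of chromatin kept as input
    (0.01 = 1%, an assumed value for this sketch).
    """
    # Shift the input Ct to what it would be for 100% of the chromatin.
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    return 100.0 * 2.0 ** (adjusted_input_ct - ct_ip)


def igg_normalized(ct_antibody, ct_igg, ct_input, input_fraction=0.01):
    """Ratio of percent input (specific antibody) to percent input (IgG)."""
    specific = percent_input(ct_antibody, ct_input, input_fraction)
    background = percent_input(ct_igg, ct_input, input_fraction)
    return specific / background


def fold_change_vs_control(sample_signal, control_signal):
    """Fold change of an IgG-normalized signal relative to the CtS control."""
    return sample_signal / control_signal
```

For example, with hypothetical Ct values, `igg_normalized(24.0, 27.0, 22.0)` gives an 8-fold enrichment over IgG, since the input term cancels in the ratio.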
siRNA transfections and RT-qPCRs. For siRNA transfection, MCF7 cells in 6-well tissue culture plates were transiently transfected with the HiPerfect transfection reagent (Qiagen) using 10 nM of a scrambled siRNA (AllStar, CtS), a siRNA specific for CXXC5 (siRNA#10; FlexiTube GeneSolution, Qiagen), as we described previously 19,21 , and/or a siRNA pool specific to MeCP2 (sc-35892, SCBT). To equalize the total amount of siRNA (20 nM) used in co-transfection experiments, 10 nM gene-specific siRNA was used together with 10 nM CtS. Isolated total RNA was used for the cDNA synthesis (The RevertAid First Strand cDNA Synthesis Kit, ThermoFisher). SYBR Green Mastermix (BioRad, Hercules, CA, USA) and gene-specific primers (Supplementary Information, Table S2) were used for qPCR reactions on BioRad Connect Real-Time PCR. For the normalization of results, we used the expression of RPLP0 (60S acidic ribosomal protein P0), as we described previously 95 . The relative quantification of gene expressions was assessed with the comparative 2^(-ΔΔCT) method 96 . For qPCR experiments, MIQE Guidelines were followed 97 .
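As an illustration of the comparative method, the following Python sketch computes relative expression from Ct values, with RPLP0 as the reference gene and CtS-transfected cells as the calibrator. The function and argument names are hypothetical, and the calculation assumes equal, near-100% amplification efficiencies for the target and reference amplicons.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Relative expression by the comparative Ct (delta-delta Ct) method.

    'ref' is the reference gene (RPLP0 in this study); 'control' is the
    calibrator condition (cells transfected with the scrambled siRNA).
    """
    delta_sample = ct_target_sample - ct_ref_sample      # dCt, treated cells
    delta_control = ct_target_control - ct_ref_control   # dCt, control cells
    delta_delta = delta_sample - delta_control           # ddCt
    return 2.0 ** (-delta_delta)
```

With hypothetical Ct values of 24 (target) and 18 (reference) in siRNA-treated cells, and 26 and 18 in control cells, ΔΔCT is -2 and the relative expression is 4-fold.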
Bisulfite PCR. To assess the DNA methylation state, we used bisulfite DNA sequencing. Genomic DNA was isolated from MCF7 cells transfected with CtS siRNA or siRNA#10 for 48 h by using the QIAamp DNA Mini Kit (Qiagen, 51304) according to the manufacturer's protocol. Bisulfite conversion of 500 ng of isolated gDNA was performed with the EZ-DNA Methylation Lightning Kit (Zymo Research, D5030). The bisulfite converted DNA was used as the template for PCR (LongAmp Taq Polymerase, NEB, M0323) using primers specific for bisulfite converted DNA designed with MethylViewer 98 . Amplicons were cloned into the pGEM-T vector (Promega, A3600) for sequencing. Sequences were analyzed with the QUMA 99 tool (http://quma.cdb.riken.jp/).
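The principle behind scoring bisulfite-sequenced clones can be sketched in Python as follows. This is a simplified illustration and not the QUMA algorithm: it assumes that the clone read is already aligned to the unconverted top-strand reference without gaps, and all names are hypothetical. After bisulfite conversion, a cytosine retained at a CpG site indicates methylation, whereas a thymine indicates an unmethylated cytosine.

```python
def cpg_methylation_calls(reference, read):
    """Return (position, is_methylated) for every CpG in the reference.

    reference: unconverted genomic sequence (top strand).
    read: bisulfite-converted clone sequence, same length, gap-free.
    """
    reference = reference.upper()
    read = read.upper()
    if len(reference) != len(read):
        raise ValueError("reference and read must have the same length")
    calls = []
    for i in range(len(reference) - 1):
        if reference[i] == "C" and reference[i + 1] == "G":
            if read[i] == "C":        # protected from conversion
                calls.append((i, True))
            elif read[i] == "T":      # converted, hence unmethylated
                calls.append((i, False))
            # any other base is treated as a mismatch and skipped
    return calls


def percent_methylated(calls):
    """Percentage of scored CpG sites called methylated."""
    if not calls:
        return 0.0
    return 100.0 * sum(1 for _, methylated in calls if methylated) / len(calls)
```

For the toy reference `ACGTCGAC` and clone read `ACGTTGAT`, the first CpG is called methylated and the second unmethylated, giving 50% methylation.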
ChIP-seq data analysis. To investigate DNA regions that CXXC5 could interact with, data from a previously carried out CXXC5 ChIP-seq experiment were analyzed 22 . The analyses were carried out with publicly available bioinformatics tools available on the Cancer Genomics Cloud (CGC) (Seven Bridges Genomics, Boston, USA). Briefly, the raw sequencing reads of ChIP-seq experiments (GEO accession: GSE132025) were retrieved from the NCBI Gene Expression Omnibus (GEO) database as sequence read archive (SRA) files. The SRA files were first converted to FASTQ format using the SRA Toolkit fastq-dump tool. The sequenced reads in FASTQ format were aligned on the mouse reference genome version 10 (mm10) using the Burrows-Wheeler Aligner (BWA) bwa-backtrack algorithm specialized for short reads 96 and the peaks were called using the MACS2 tool version 2.1.1 100 . Both tools are available on the CGC platform as a workflow. We used the default parameters for both tools and used the broad peak calling functionality of MACS2 to identify binding regions.
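The order of the steps above can be sketched with a small Python helper that assembles the corresponding command lines for a local run. This is only an outline under stated assumptions: the analyses here were run as a workflow on the CGC platform, the run accession `SRR0000000` and the file name `mm10.fa` are placeholders, single-end reads are assumed, and the `bwa index` step is added because a local BWA run requires an indexed reference.

```python
def chipseq_commands(accession, genome_fasta="mm10.fa"):
    """Build shell command strings for one single-end ChIP-seq run.

    accession: an SRA run accession (placeholder in this sketch).
    genome_fasta: path to the reference genome FASTA (assumed name).
    """
    fastq = f"{accession}.fastq"
    sai = f"{accession}.sai"
    sam = f"{accession}.sam"
    return [
        # Convert the SRA record to FASTQ.
        f"fastq-dump {accession}",
        # Index the reference once before alignment.
        f"bwa index {genome_fasta}",
        # bwa-backtrack: 'aln' finds coordinates, 'samse' writes SAM.
        f"bwa aln {genome_fasta} {fastq} > {sai}",
        f"bwa samse {genome_fasta} {sai} {fastq} > {sam}",
        # Broad peak calling with otherwise default MACS2 settings.
        f"macs2 callpeak -t {sam} -f SAM -n {accession} --broad",
    ]


if __name__ == "__main__":
    for command in chipseq_commands("SRR0000000"):
        print(command)
```

The helper only prints the commands; running them requires the SRA Toolkit, BWA, and MACS2 to be installed.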
Correlation analysis between mRNA expressions of CXXC5 and MeCP2 in breast cancer patients. To assess the possible correlation between the CXXC5 and MeCP2 expressions, we used the GEPIA (Gene Expression Profiling Interactive Analysis) webserver 65 for the expression analysis of the CXXC5 and MeCP2 genes based on paired normal tissue and tumor tissue samples from the TCGA (https://www.cancer.gov/tcga) and healthy breast tissue samples from the GTEx 101 databases. The gene expression profiles of CXXC5 and MeCP2 across all tumor samples and paired normal tissues as well as the correlation between mRNA expressions of CXXC5 and MeCP2 in normal and breast tumor samples were analyzed.

Statistical analysis.
Experiments were repeated at least two independent times. Results, where and when appropriate, were presented as the mean ± standard error (S.E.) of three biological replicates. Statistical analyses were performed using a two-tailed unpaired t-test with a minimum confidence level of 95%.
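As a worked illustration of these statistics, the Python sketch below computes the mean ± S.E. and an unpaired t statistic using only the standard library. It assumes the equal-variance (Student's) form of the test, which is not specified here, and the data values in the usage note are invented for illustration.

```python
import math
import statistics


def mean_and_se(values):
    """Mean and standard error of the mean for one group."""
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))
    return mean, se


def unpaired_t(group_a, group_b):
    """Student's unpaired t statistic and degrees of freedom.

    Assumes equal variances in both groups (pooled variance).
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / df
    diff = statistics.mean(group_a) - statistics.mean(group_b)
    t = diff / math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return t, df
```

For two groups of three replicates, `[1.0, 2.0, 3.0]` and `[4.0, 5.0, 6.0]`, the statistic is about -3.67 with 4 degrees of freedom. Its absolute value exceeds the two-tailed critical value of 2.776 for 4 degrees of freedom, so the difference would be called significant at the 95% level.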