Transcriptomic profile of Pea3 family members reveal regulatory codes for axon outgrowth and neuronal connection specificity

PEA3 transcription factor subfamily is present in a variety of tissues with branching morphogenesis, and play a particularly significant role in neural circuit formation and specificity. Many target genes in axon guidance and cell–cell adhesion pathways have been identified for Pea3 transcription factor (but not for Erm or Er81); however it was not so far clear whether all Pea3 subfamily members regulate same target genes, or whether there are unique targets for each subfamily member that help explain the exclusivity and specificity of these proteins in neuronal circuit formation. In this study, using transcriptomics and qPCR analyses in SH-SY5Y neuroblastoma cells, hypothalamic and hippocampal cell line, we have identified cell type-specific and subfamily member-specific targets for PEA3 transcription factor subfamily. While Pea3 upregulates transcription of Sema3D and represses Sema5B, for example, Erm and Er81 upregulate Sema5A and Er81 regulates Unc5C and Sema4G while repressing EFNB3 in SH-SY5Y neuroblastoma cells. We furthermore present a molecular model of how unique sites within the ETS domain of each family member can help recognize specific target motifs. Such cell-context and member-specific combinatorial expression profiles help identify cell–cell and cell-extracellular matrix communication networks and how they establish specific connections.

. Schematic comparison of Pea3 family members' domain structure and primary sequence of a region encompassing the ETS domain and the C terminal transactivation domain. Upper panel: Pea3 family members Er81/ETV1, Pea3/ETV4, and Erm/ETV5 have their DNA binding domain, the ETS domain, towards the C terminal (red box); all three members also contain two transactivation domains on each terminus (grey and blue boxes). Lower panel: Primary sequence of the region encompassing ETS domain (red font) and C terminal transactivation domain (blue font) across species (r rat, m mouse, h human); the ETS domain is well-conserved and the key residues that show variability among family members and species are indicated by green (those specific for ERM), yellow (those specific for Er81) font color.

Results
Microarray-based target identification for each family member Pea3, Erm and Er81. It has been previously reported in the literature that subsets of motor neurons and muscle sensory afferents express Pea3 or Er81 almost exclusively and that functionally interconnected sensory and motor neurons express the same ETS variant 27 . It was intriguing that in this study motor and sensory neurons innervating the same target muscle expressed different Pea3 proteins, almost exclusively 27 . It was later shown that in Pea3 mutant mice, motor neurons failed to innervate their target muscles, and the cell bodies of these neurons were mispositioned 9 , and that this function was through GDNF and Met signaling 28,29 . Recently, a transcriptomics study has revealed neuronal targets of the PEA3 family using a constitutively active Pea3-VP16 13 . Yet, it has not been clear how the specificity of functional motor neuron connectivity can be achieved through the expression of specific Pea3 family members that have such high homology within their DNA binding domain.
We have addressed this question through microarray analysis of two different model systems-SH-SY5Y neuroblastoma cells and mHypoA2/12 hypothalamic cell lines-that have been transfected to overexpress Pea3, Erm and Er81 proteins. The statistical analyses of transcriptome profiles yielded a set of differentially expressed genes (DEGs) and the high performance of the differential expression levels in discriminating the transfected cells from control cells was verified through Principal Component Analysis, as exemplified for the mHypoA2/12 transfected with Pea3-VP16 expression vector ( Fig. 2A,B). We observed a modest number of DEGs under Pea3, Erm and Er81 overexpression (286, 216 and 989 genes, respectively), unlike the previous study where Pea3-VP16 was overexpressed in SH-SY5Y cells 13 , whereas significantly more genes have been responsive to Pea3, Erm and Er81 overexpression in hypothalamic mHypoA2/12 cells (5482, 2036 and 1580 genes, respectively; Fig. 2C). When mHypoA2/12 cells were transfected with Pea3-VP16, the number of DEGs was 6590 (Fig. 2C).
According to the gene set enrichment analyses as schematized in the circos plot, the nervous system-related pathways such as neuron development, neurogenesis, brain development, and axon guidance, among many others, were significantly enriched in Pea3-, Erm-and Er81-transfected SH-SY5Y (Fig. 2D, Table 1) as well as mHypoA2/12 cells (Fig. 2D, Table 2; see Supplemental Table S1 for full list). As can be seen in this plot, majority of the genes affected fall into neurogenesis and neuron differentiation pathways in both SH-SY5Y and mHypoA-2/12 cells transfected with Pea3, Erm and Er81; axon guidance pathway was also affected in both cell types, and additionally genes related to brain development, neural crest cell development and neural tube development were also regulated in SH-SY5Y cells transfected with Pea3 and Er81 (Fig. 2D). Although similar pathways were regulated to a similar extent in all cell types analyzed, some cell type-specific differences observed in target genes are expected to be due to different transcriptional partners in cells of different origin (SH-SY5Y is a neuroblastoma cell line used as a neuronal differentiation model, while mHypoA-2/12 is an immortalized hypothalamic cell line). Selected genes from these pathways have been chosen for further analysis, focusing on a subset of genes that are known regulators of axon growth and guidance, cell migration, synaptic connectivity, and dendritic morphology within Neurotrophin signaling & development and Axon Guidance pathways. qPCR validation of selected targets in SH-SY5Y, mHypoA2/12, and mHippoE-14 cell lines. Functionally interconnected neurons expressing the same transcription factor can only recognize each other through their cell surface proteins or secreted molecules. We have therefore concentrated on growth factors, growth factor receptors, chemoattractant, and chemorepellent molecules, as well as regulators thereof, in our qPCR validation assays. To that end, we have transfected mHypoA2/12 cells with expression vectors for Pea3, Erm, or Er81, and analyzed the expression of genes identified in microarray studies.
Among the Neurotrophin signaling & development pathway, Fgfr1 was downregulated by Pea3, but upregulated by Erm and Er81, while Egfr did not show a significant change (Fig. 3, upper panel). Vegfa was repressed by Pea3, but not significantly changed in Erm or Er81 overexpression, whereas Gdnf was repressed by Pea3 and activated over twofold by Erm and over threefold by Er81 (Fig. 3, upper panel). Bdnf, on the other hand, was repressed by Pea3 and activated by Erm, but not significantly changed by Er81, while Ntf3 was repressed by Erm and not affected by others and Wnt5a expression was repressed by Er81 (Fig. 3,  www.nature.com/scientificreports/ related to Neurotrophin signaling, proliferation and apoptosis-associated genes were also found: regulation of pro-apoptotic gene Bad or glycogen synthase kinase 3 beta, Gsk3b, by either family members was not validated, however phospholipase C beta (plcb3) downstream of G protein-coupled receptor signaling was repressed by Er81, while Prkaca, the catalytic subunit of protein kinase A, was repressed by Pea3 and upregulated almost fourfold by both Erm and Er81 (Fig. 3, upper panel). Semaphorins, some of which had previously been identified as a target for Pea3-VP16, also showed differential regulation by family members: Sema5b was repressed by Er81 but not others, while Sema3d was repressed by both Erm and Er81; Sema7a was repressed by both Pea3 and Erm, while Sema3g was upregulated by both   2 × 10 -3 14  BMPR1A,DLX2,ISL1,MED1,SYT4,ARHGEF2,ADGRG1,RAPGEF2,SE  MA6C,PLK2,SEMA5B,ITM2C,HOOK3,SEMA3D   GO:0048666  Neuron development  7.5 × 10 -3 33   AGER,ADGRB1,BCL2,CTF1,ETV4,MNX1,ISL1,LHX1,LYN,MYH10,N  EFM,NPTX1,PBX3,KLK6,SIAH2,SKIL,SYT4,LHX3, USP9X,RAPGEF2  ,SEMA6C,PLK2,IFT27,LZTS1,NEDD4L,SEMA5B,CLMN,ITM2C,  Among genes selected within the Axon Guidance pathway, Spred1, a member of the Sprouty family of proteins that interact with tyrosine kinases and inhibit growth factor-mediated activation of MAPKs, was repressed by Pea3 but not altered by Erm or Er81 (Fig. 3, lower panel). Src kinase known to regulate cytoskeletal organization, critical for migration and axon outgrowth, is upregulated by Erm and Er81 but not Pea3, whereas Rock2 kinase, which is involved in actin cytoskeleton organization and neurite retraction, was repressed by both Pea3 and Er81 but not Erm (Fig. 3, lower panel). Cell adhesion molecule L1cam was activated by Erm but not others, and neuregulin 1, Nrg1, that is a ligand for EGF Receptors ERBB3 and ERBB4, was repressed by Pea3 but not others (Fig. 3, lower panel).
There was also differential regulation among ephrins, ephrin receptors, and semaphorins: Epha2 and Epha4 were upregulated by Erm but not others, while Efnb3 was repressed by Pea3 but upregulated nearly fourfold by Erm (Fig. 3, lower panel). Sema3a was repressed by Er81, and Sema3e was repressed by both Erm and Er81, while Sema3c was upregulated by Erm and Er81; Unc5c was repressed by both Pea3 and Erm (Fig. 3, lower panel). Some were regulated by all three proteins: Epha1, Efna2, and Efna3 were upregulated by all Pea3, Erm and Er81. Irs1, Efna1, and Efna4 were not validated as genuine targets of this ETS subfamily in mHypoA2/12 cells.
Similar validation was repeated for SH-SY5Y cells overexpressing Pea3, Erm, and Er81 proteins, covering a similar subset of genes. BAD, FGFR1, VEGFA, EPHA7, and SEMA7A among the Neurotrophin signaling & development pathway could not be validated by qPCR. BDNF and PRKACA were regulated by Pea3 and Er81, and another ETS gene, ETS1 was upregulated by all three Pea3 family members (Fig. 4, upper panel). Semaphorin receptor, PLXNB2, was upregulated by Pea3 and Erm, SEMA6D was upregulated by Pea3 and Er81, and SEMA3D, SEMA3G, and SEMA5B were upregulated by Er81 only (Fig. 4, upper panel).
Finally, we have checked a third cell type, mHippoE-14 hippocampal cell line, for the regulation of these putative targets identified by microarray analysis. Among the significantly regulated genes in Neurotrophin signaling & development pathway, Fgfr1 and Egfr1 were upregulated by both Erm and Er81, while Ntf3, Sema3g, and Sema7a were regulated by Pea3 only, Sema6c was regulated by Er81 only, and Sema3d and Epha7 were regulated by Erm only (Fig. 5, upper panel). Among the significantly regulated genes in Axon guidance pathway, Epha2 and Epha4 were regulated by all three family members; Epha1, Ephb2, Sema3e, Sema3f., and Nrg1 were regulated by both Pea3 and Erm; L1cam and Unc5c were regulated by both Erm and Er81; Sema4f. was regulated by Erm only, and Efna3 was regulated by Er81 only (Fig. 5, lower panel). Differential regulation of these selected genes by different Pea3 family members in SH-SY5Y cells and mHy-poA2/12 cells in microarray vs qPCR analyses have been summarized in Supplemental Tables S2 and S3.

Discussion
The role of Pea3 family members in neuronal connectivity has been largely studied in spinal motor neuron subsets, where researchers have shown that an intrinsic program of ETS expression pattern, in particular, that of Pea3 and Er81, coordinates motor neuron cell body positioning as well as terminal arborization 9 . It was furthermore shown that functionally interconnected motor and sensory neurons that innervate the same muscle expressed-almost exclusively-either Pea3 or Er81 27 , indicating that the specific family members somehow instruct the neuron where to migrate to and which target to connect with. It should be noted, however, that this does not mean Pea3 family members are the only transcription factors involved in target selectivity and neuronal circuit formation, as axonal elongation, target identification and circuit formation are complex and multi-step processes, involving various signal transduction and axonal guidance pathways, gap junctions and other cell-cell communication molecules. It has been shown, however, that for example Er81-sensory neurons do not innervate Er81 + motor neurons and there appears to be exclusive expression of Pea3 family members in different motor neuron subgroups 27 ; furthermore, the function of Pea3 family members is not limited to the spinal motor neurons and sensory neurons: Pea3 family members Pea3 and Erm have been shown to play a role in dendritic arborization in a Bdnf-dependent manner 18 ; Er81 and CaMKIV together have been shown to be important for dopaminergic determination during the migration of progenitors arising from the olfactory bulb; www.nature.com/scientificreports/ Er81 was found to be expressed in ventricular zone at E15 and in prospective layer V neurons projecting to the spinal cord 30 , however, whether it is specifically required to distinguish targets and achieve specific connectivity is yet to be studied. A transcriptomic study using constitutively expressed Pea3-VP16 had identified a large set of neuronal-related pathways and genes regulated by Pea3 13 , but a thorough analysis of the unique targets of each Pea3 family member has been elucidated for the first time in this study. We have identified targets in five categories: (a) those regulated by all three family members, (b) those regulated by Pea3 and Er81, (c) those regulated by Pea3 and Erm, (d)  www.nature.com/scientificreports/ those regulated by Erm and Er81, and (e) those unique for each family member (supplemental Tables S1 and S5). These were found to be largely overlapping between SH-SY5Y and mHypoA2/12 cells, although a larger set of genes within these pathways were differentially regulated in mHypoA2/12 cells (Supplemental Table S1). When genes coding for cell surface or secreted proteins were specifically considered, it was evident that particular combinations of ligands and receptors or cell-cell and cell-extracellular matrix interacting proteins were differentially regulated by individual Pea3 family members or pairs of Pea3 proteins. Wnt5a, which binds frizzled receptor FZD4 and promotes dendrite development and dendritic morphogenesis 31 , is repressed by Pea3 and Er81, but not Erm (Fig. 7a). Sema5b, which is shown to be important for synapse elimination in hippocampal neurons 32 , is regulated by Er81 in both SH-SY5Y and mHypoA2/12 in opposite directions (Fig. 7a). Sema3f., on the other hand, which is critical for limbic system circuitry development 33 , was regulated by Pea3 and Erm in mHippoE-14 and SH-SY5Y cells (Fig. 7a). PlexinB2 (Plxnb2) that has been reported to be a receptor for semaphorins 4a, 4c, 4d and 4g in the context of neuronal axon guidance and cell migration events, is regulated by both Pea3 and Erm (Fig. 7a). We propose that by differentially regulating either the ligands, receptors, or both in corresponding cell types, Pea3 family members ensure that the cells expressing them recognize other cells that express the same Pea3 family member through a "barcode" of cell surface and secreted proteins. We believe this study is a first step towards deciphering this transcriptional code, although, in order to get a more complete picture to analyze motor neuron and other circuitries, more detailed single-cell RNAseq analyses with each neuron in the circuit of interest should be conducted in the future.
Such differential targeting of promoters by each family member can help explain the selectivity of the circuit components; however, it was still unclear how proteins with such highly similar DNA binding domains could distinguish between promoters to ensure this transcriptional barcode. To that end, we have reverted to the five categories of target genes (Supplemental Tables S4 and S6) and addressed whether the small variations within the ETS domain ( Fig. 1) corresponded to these categories. When the 5 residues that make up the ETS domain variations among family members were aligned (numbering given only for Pea3/ETV4), a certain profile became evident: Residue corresponding to A326 in Pea3 was changed to Serine in Erm and Er81; similarly residues corresponding to L376 and E419 in Pea3 were Arginine and Aspartate, respectively, in both Erm and Er81, indicating those residues may account for target recognition by both Erm and Er81 vs unique targets of Pea3 (Fig. 7b, upper panel). On the other hand, residue corresponding to A345 in Pea3 was again Alanine in Er81, but changed to a Threonine in Erm, indicating this particular residue may be important for common target recognition by Pea3 and Er81, and unique sites for Erm (Fig. 7b, upper panel). Finally, residue corresponding to T351 in Pea3 sequence was converted to Alanine in Erm, but was changed to Alanine in human Er81 and Serine in mouse and rat Er81, indicating not only that this may account for unique sites for Erm, Er81, and Pea3 in most mouse, rat, and human, but also for differential gene regulation by Er81 in human vs rodent systems, which is not observed in any other family member (Fig. 7b, upper panel).
To correlate these findings to DNA binding of their ETS domain structures, we have then carried out molecular modeling, where structures of Etv1 and Etv5 without DNA were aligned to the structure of Etv4 with DNA using PyMol software, as described in Materials and Methods. We have then mapped the residues identified in the ETS domain onto this superimposed structure (Fig. 7b, lower panel). The residues corresponding to A326 and L376 in Pea3 were found to be in close proximity to the cognate DNA, indicating that these two residues may specifically interact with Pea3 binding sites containing GGAA/T core motif on target promoters, and account for the differential target recognition by Erm and Er81 vs Pea3 (Fig. 7b, lower panel). The remaining residues corresponding to A345, T351, and E419 on Pea3, however, all resided away from the DNA, indicating that differential target recognition by Er81 and Pea3, or unique target recognition by each family member, may not be through selective binding to promoters, but rather through selective protein-protein interaction that may indirectly affect recruitment and/or activation of those specific targets (Fig. 7b, lower panel).
In this study, we have presented a global transcriptomic study to identify novel targets of Pea3 family members so as to explain how such highly homologous transcription factors could affect selective circuit formation during development. We have also shown that target recognition is dependent on the cellular context, using SH-SY5Y, hypothalamic mHypoA2/12 and hippocampal mHippoE-14 cells, which showed that there is likely a transcriptional barcode that affects the distribution of cell surface proteins, and ligand-receptor pairs on different neurons could help identify the correct and specific target for each neuron. We have furthermore presented a model whereby the small variations within the DNA binding domain may account for the differential regulation of the target promoters identified. It will also be interesting in the future to study whether a similar transcriptional barcode is present for target identification in other tissue types where Pea3 family proteins are involved in branching morphogenesis, such as kidney and lung. After hybridization, slides were scanned using NimbleScan 2.5 software and using three arrays from pCDNA3transfected cell as reference samples, statistical analyses were performed under Bioconductor (https ://bioco nduct or.org/) software platform (version 3.5.0) in R. Agilent microarray data were processed using the Agilent expression array processing package (agilp, version 3.8.0) 34 and normalized by the quantile normalization method. Microarray data of pCDNA3 (as control), Pea3, Erm, Er81, and Pea3-VP16 obtained from mHypoA-2/12 and SH-SY5Y cells were statistically analyzed to identify DEGs using Linear Models for Microarray Data (LIMMA) package (version 3.32.10) 35 . The False Discovery Rate was controlled through Benjamini-Hochberg's method. The regulatory pattern of each gene (i.e., down-or up-regulation) was determined by fold changes. Raw data were submitted to EBI ArrayExpress, accession E-MTAB-8473 and E-MTAB-8475.

Materials and methods
The discrimination performance of the differential expression levels in each condition was verified through the clustering of cells using Principal Component Analysis. The first two principal components explaining at least 80% of total variance were considered in the determination of the performance.
The gene set enrichment analysis was carried out for all DEGs through DAVID 36 and ConsensusPathDB 37 annotation tools. In the enrichment analyses, the KEGG 38 , Reactome 39 , and Biocarta 40 were preferably used as the pathway databases. GO terminology 41 was employed as the source for annotating the molecular functions and biological processes. P-values were obtained via Fisher's Exact Test. Benjamini-Hochberg's correction was used as the multiple testing correction technique, and gene set enrichment results with adjusted-p ≤ 0.05 were considered statistically significant. cDNA synthesis and qPCR. 1 μg of each RNA was used to cDNA synthesis reaction using iScript cDNA Synthesis kit (1708896, BioRad) as per manufacturer's instructions. Briefly, 4 μl of 5 × iScript Reaction Mix, 1 μl of iScript Reverse Transcriptase, 10 μl of RNA (100 ng/μl) and 5 μl of nuclease-free water. In thermal cycler, samples were incubated respectively at 25 °C for 5 min, 46 °C for 20 min and 95 °C for 1 min.  TTC CAG CAT CTG TTG GGG AG  CTC ACC TGG TGG AAC ATT GTG   Efna2  CTA CAT CTC TGC CAC GCC TC  CTG GTG AAG ATG GGC TCA GG   Epha2  TAT GGC AAA GGG TGG GAC CT  TCT CCT CGG TAC ACC CAG TT   Epha4  CCT TAT TGG ATT CCA GAT CTG TTC A ACT CAC TTC CTC CCA CCC TC   Fgfr1  GGC AGT GAC ACC ACC TAC TT  GAG CTA CGG GGT TTG GTT TG   hACTB  ACG AAA CTA CCT TCA ACT CC  GAT CTT GAT CTT CAT TGT  www.nature.com/scientificreports/ 25 ng cDNA template in 10 µl reaction with SsoAdvanced universal SYBR Green supermix (172-5271) was used for Real-Time PCR and carried out using a StepOne Real-Time PCR detection system (forward and reverse primer sets are listed in Table 3). mRNA expression was normalized to β-actin and GAPDH expression. To evaluate whether the difference in gene expression level between control and transfected cells was significant, the efficiency-corrected delta cycle threshold (ΔCt) method was used according to the formula: The RQ values thus calculated were then transformed on a log2 scale to achieve normal distribution of the data and the resulting distributions were tested against the null-hypothesis of equal mRNA level in control and transfected cells (i.e., a population mean of 0.0) using two-tailed one-sample Student's t-tests. A confidence level of α ≤ 0.05 was applied for all comparisons to determine statistical significance. After transfection of Pea3, Erm and Er81 into cells, qPCR was used to determine overexpression levels in transfected cells, and it was observed that the fold changes of Pea3, Erm and Er81 in each cell type was not significantly different except for SH-SY5Y cells, where Erm expression was slightly less than Pea3 or Er81 in transfected cells (Supplemental Fig. S1).  GGG TAT GGA ATC CTG TGG  TGG CAT AGA GGT CTT TAC GG   mBad  CTT CAA GGG ACT TCC TCG CC  CCC AAG TTT CGA TCC CAC CA   mEfna1  CCT CTC TTG GGT CTG TGC TG  GTC CTC CTC ACG GAA CTT GG   mEfna4  CCC TTT CAG CCC TGT TCG AT  GAG TCG GCA CCG AGA TGT AG   mEfnb3  GCT GCT GTT AGG TTT TGC GG  CCT GGA ACC TCT TAT TCG CCG   mEgfr  TGG GTG GCC TCC TCT TCA TA  AGG TTC CAC GAG CTC TCT   ch/searc h_EPDne w.php), and analyzed with PROMO 3.0 to determine the putative binding site sequences of Pea3 on the promoter sequences and their dissimilarity rates, which indicate the variance between DNA-binding sequence of transcription factor and the nucleotide sequence on the selected promoter as percentage by regarding the binding matrices 42 . Thus, the higher possibility for Pea3 binding is stated as the smaller dissimilarity rates (0% dissimilarity rate indicates 100% identity to the consensus motif). Furthermore, genomic sequences are scanned and used to convert into Position Weight Matrices (PWMs) via JASPAR (https ://jaspa r.gener eg.net/) to determine the binding score and their statistical significance (For evaluation, the binding score was taken to be p < 0.01 (JASPAR) and ≤ 13% (PROMO).

Chromatin immunoprecipitation.
Chromatin immunoprecipitation was performed according to a protocol described previously 13 with some modifications. Briefly, SH-SY5Y neuroblastoma cells were transfected with either empty pCMV-6-tag flag or pCMV-Pea3-flag expression plasmid. 48 h after transfection, cells were cross-linked with 1% formaldehyde, and then lysed in lysis buffer. For chromatin shearing, the lysates were sonicated using Bioruptor Pico sonication device (Diagenode) in nuclei isolation buffer. 10% v/v of the sheared DNA was saved as input. The rest of samples was precipitated with anti-Flag M2 affinity resin (Sigma, A2220) or normal mouse IgG (Santa Cruz, sc-2025) overnight. After overnight incubation, immunoprecipitated chromatins were eluted in elution buffer. Crosslinking of DNA and protein was reversed with heat treatment in high salt solution, and then treated with RNase and proteinase K. DNA in the samples was purified using MEGAquick-spinTM Total Fragment DNA Purification Kit (Intron). Purified DNAs (input and ChIP samples) were detected by qPCR using SsoAdvanced universal SYBR Green supermix (BioRad). MMP2 and MMP9 promoter regions were used as positive controls, and FGFR1 intron region as a negative control. Primers are listed in Table 4. ChIP-qPCR data was analyzed as described previously 13 .