The LINC01119-SOCS5 axis as a critical theranostic in triple-negative breast cancer

The development of triple-negative breast cancer (TNBC) is critically regulated by certain tumor-microenvironment-associated cells called mesenchymal stem/stromal cells (MSCs), which we and others have shown promote TNBC progression by activating pro-malignant signaling in neighboring cancer cells. Characterization of these cascades would better our understanding of TNBC biology and bring about therapeutics that eliminate the morbidity and mortality associated with advanced disease. Here, we focused on the emerging class of RNAs called long non-coding RNAs or lncRNAs and utilized an MSC-supported TNBC progression model to identify specific family members of functional relevance to TNBC pathogenesis. Indeed, although some have been described to play functional roles in TNBC, activities of lncRNAs as mediators of tumor-microenvironment-driven TNBC development remain to be fully explored. We report that MSCs stimulate robust expression of LINC01119 in TNBC cells, which in turn induces suppressor of cytokine signaling 5 (SOCS5), leading to accelerated cancer cell growth and tumorigenesis. We show that LINC01119 and SOCS5 exhibit tight correlation across multiple breast cancer gene sets and that they are highly enriched in TNBC patient cohorts. Importantly, we present evidence that the LINC01119-SOCS5 axis represents a powerful prognostic indicator of adverse outcomes in TNBC patients, and demonstrate that its repression severely impairs cancer cell growth. Altogether, our findings identify LINC01119 as a major driver of TNBC development and delineate critical non-coding RNA theranostics of potential translational utility in the management of advanced TNBC, a class of tumors most in need of effective and targeted therapy.


INTRODUCTION
Breast cancer is one of the world's leading diagnosed cancers with >2 million new cases identified across 20 geographical regions in 2018, including >300,000 in the US alone 1. Although scientific and technological advances have allowed for better clinical management of the disease with steadily improving patient survival odds, breast-cancer-related mortality rates remain the second-highest of all cancers worldwide and are estimated at >40,000 US deaths in 2019 2. These numbers highlight the pressing clinical need for continued research into breast cancer etiology, pathogenesis, and treatment.
Based on molecular and pathological determinations, breast cancer is divided into three main subclasses with differing clinical management and prognostic assessments: the estrogen receptor (ER) and progesterone receptor (PR)-positive tumors (called luminal subtypes), the human epidermal growth factor receptor 2 (HER2 or ERBB2) enriched tumors (called HER2), and the ER-negative, PR-negative, and HER2-negative tumors, also called triple-negative breast cancers (TNBCs) [3][4][5][6]. Among these three classes, patients identified with TNBC exhibit the worst overall outcome 7. This is due to several parameters that include early onset of TNBC (which tends to occur in younger women), its preponderance in underprivileged ethnic/racial groups with limited access to healthcare, and an aggressive disease pathology with a predisposition for early metastasis 2,8,9. In addition, TNBCs lack targeted therapies, and chemotherapeutic agents such as taxanes or anthracyclines, which form the backbone of their clinical management 2, are insufficient in controlling the disease in the majority of TNBC cases, especially when it spreads 10. Increased molecular understanding of TNBC biology thus stands to provide new avenues for more effective and less toxic medicaments that can eliminate the morbidity and mortality associated with advanced and refractory disease.
TNBC development is tightly regulated by certain tumor-microenvironment-associated fibroblastoid cells called mesenchymal stem/stromal cells (MSCs) that others and we have shown play determining roles in exacerbating tumor malignancy [11][12][13]. While the mechanistic details of such influences have not been cataloged in their entirety, what is clear is that direct heterotypic interactions between MSCs and cancer cells represent the major driving force underlying MSC pro-malignant functions 14. In these findings, physical MSC-cancer-cell contacts initiate complex networks of both coding and non-coding RNA mediators that work singly and/or in tandem to trigger critical programs within cancer cells, such as epithelial-mesenchymal transition 15 or cancer stem cell-ness 16, leading to disease growth, spread, and therapy resistance 17. These and many other similar findings 14 emphasize the critical roles MSCs play in breast cancer in general and in TNBC in particular and highlight the utility of the MSC-supported tumor progression model in delineating both disease regulators and vulnerabilities.
We most recently leveraged the MSC-TNBC-cell co-culture model to gain further insights into the molecular mechanisms that promote TNBC development. Specifically, we focused on investigating the involvement of a particular family of regulatory non-coding RNAs, the long non-coding RNAs (lncRNAs), as drivers of TNBC pathology. Indeed, lncRNAs represent a family of >200 nucleotide-long transcribed RNAs with described roles in primarily regulating gene expression in normal physiology and in disease settings, including cancer 18,19. Whereas activities of individual lncRNAs of relevance to TNBC biology are being identified at a fast pace 20, functions for the overwhelming majority of family members are still outstanding. We found that MSCs induced robust LINC01119 expression in neighboring TNBC cells, which was sufficient, on its own, to promote cancer cell growth in multiple contexts in vitro, as well as tumorigenesis in immunocompromised mice. At the molecular level, we demonstrated that LINC01119 was indeed a non-coding RNA and that it functioned via upregulating pro-tumorigenic activities of SOCS5, a member of the suppressor of cytokine signaling family of proteins. Importantly, LINC01119 and SOCS5 were critical for cellular growth, and we found that the LINC01119-SOCS5 axis associated tightly with clinical TNBC and that it served as a powerful prognosticator of poor patient outcome.

RESULTS
LINC01119 is induced in MSC-stimulated TNBC cells and is tightly associated with clinical disease
To identify lncRNAs of potential relevance to TNBC pathogenesis, we mined Affymetrix-based analyses in which we compared the gene expression profiles of GFP-labeled MDA-MB-231 TNBC cells recovered by FACS from 3-day co-cultures with human BM-MSCs versus that of sorted naïve GFP-BCC counterparts cultured alone as controls 15 (Supplementary Fig. 1a). These efforts led to the identification of 7 lncRNAs that were upregulated >1.5-fold with a q-value threshold (false-discovery rate) of ~10% and with >98% probe specificity to the designated transcript (Supplementary Fig. 1b). qRTPCR-based validation of these candidates revealed mild inductions of certain of these lncRNAs in the MSC-stimulated cells, such as TCONS_00019082_1 or TCONS_00004205_1, and stronger inductions in others, such as LINC01133, a lncRNA we previously characterized 21 (Fig. 1a). Interestingly, however, the strongest induction was observed for LINC01119, which exhibited >100-fold upregulation in MSC-stimulated cells (Fig. 1a). As a novel lncRNA with previously undescribed functions in TNBC (or cancer in general), LINC01119 particularly attracted our attention.
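The three selection criteria stated above (fold change, q-value, probe specificity) amount to a simple filter over the array results. The following is a minimal sketch of that filter, not the authors' pipeline: the `Probe` record, the function name, the treatment of the "~10%" threshold as `q <= 0.10`, and all example values are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Probe:
    transcript: str
    fold_change: float        # co-culture / control expression ratio
    q_value: float            # false-discovery-rate estimate
    probe_specificity: float  # fraction of probes specific to the transcript


def select_candidates(probes, min_fold=1.5, max_q=0.10, min_specificity=0.98):
    """Return probes passing all three thresholds, strongest induction first."""
    kept = [
        p for p in probes
        if p.fold_change > min_fold
        and p.q_value <= max_q
        and p.probe_specificity > min_specificity
    ]
    return sorted(kept, key=lambda p: p.fold_change, reverse=True)


# Hypothetical values for illustration only; not data from the study.
example = [
    Probe("lncRNA_A", 3.2, 0.04, 0.99),
    Probe("lncRNA_B", 1.2, 0.01, 1.00),  # fails the fold-change cutoff
    Probe("lncRNA_C", 2.0, 0.25, 0.99),  # fails the q-value cutoff
    Probe("lncRNA_D", 1.8, 0.09, 0.95),  # fails the specificity cutoff
    Probe("lncRNA_E", 1.6, 0.10, 1.00),
]
selected = [p.transcript for p in select_candidates(example)]
print(selected)
```

Applying the cutoffs jointly, rather than ranking on fold change alone, keeps weakly supported or cross-hybridizing probes out of the candidate list before qRTPCR validation.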
LncRNAs show tissue and cell type-specific expression 18,19. To explore if LINC01119 associated preferentially with clinical TNBC, we probed its expression levels in The Atlas of Non-Coding RNA In Cancer (TANRIC), a large RNA-seq database largely based on The Cancer Genome Atlas (TCGA) information. Here, we found that LINC01119 (as estimated by position chr 2: 47055003-47086145) was indeed significantly elevated in Basal clinical subsets when compared to luminal A/B or HER2 specimens (Fig. 1b). Similar patterns were observed when we examined the GENT2 database, which harbors >200 patient-derived specimens per breast cancer subtype (Fig. 1c). As the aforementioned RNA material was generated from specimens that undoubtedly contained not only carcinoma cells, but their stromal components as well, we were interested in determining if LINC01119 was enriched in cancer cells per se. For this purpose, we investigated LINC01119 expression in breast cancer cells profiled in the Cancer Cell Line Encyclopedia (CCLE, GSE36133; Supplementary Fig. 2a and Fig. 1d), and found it to be significantly associated with Basal A/B cells as well. A more definitive enrichment for LINC01119 within subsets of TNBC could not be demonstrated, although there was a tendency for its enrichment in M/MSL cells (Supplementary Fig. 2b). These results indicated that LINC01119 is enriched in TNBC experimental models and in clinical tumor tissues.
LINC01119 expression is considered to be of relatively low abundance, with the highest levels recorded in tissues such as the ovary or brain (Supplementary Fig. 3a-c). It emanates from chromosome 2p21 and its locus codes for 4 isoforms: TCONS_00003666 (isoform 1), TCONS_00003667 (isoform 2), TCONS_00002647 (isoform 3; the one annotated in the Affymetrix array) and TCONS_00002699 (isoform 4) (Fig. 2a). As lncRNA primary sequence dictates secondary structure, which in turn dictates function 22,23, we sought to empirically identify the specific LINC01119 isoforms that were particularly induced in MSC-stimulated cells. Here, isoform-specific qRTPCR analyses using primers specific for isoform 1/2, isoform 3, and isoform 4 (Supplementary Fig. 4) revealed significant multifold induction of isoform 3 when compared to isoforms 1, 2, and 4 (Fig. 2b), which we determined was the most predominant endogenous isoform of LINC01119 across TNBC cells (Fig. 2c). Furthermore, BM-MSC co-culture with established TNBC cells, such as the cell line HCC1143, or with primary TNBC cells, such as DT22 cells 24, equally led to significant 2-fold and 15-fold inductions of LINC01119 isoform 3 in the admixed cancer cells, respectively (Fig. 2d, e), suggesting that LINC01119 induction by BM-MSCs was not particular to MDA-MB-231 cells. In addition, we found that cancer cells recovered from human BM-MSC-containing Nude-mouse-derived xenografts 16 also exhibited 3-4-fold stimulation of LINC01119 isoform 3 expression compared to controls (Fig. 2f), indicating that this induction also occurred in vivo.
Finally, since the Affymetrix probe set used to identify LINC01119 expression (probe 230799_at) in clinical specimens cannot distinguish between the different LINC01119 isoforms, we conducted laser-capture dissection of breast cancer cells from samples of basal-like breast cancers (BLBC), luminal A, luminal B, and HER2 tumors and processed their RNA for qRTPCR using primers specific for isoform 3. Here too, we found several-fold enrichment of LINC01119 in carcinoma cells derived from BLBCs versus other types (Fig. 2g), consistent with our earlier observations using TANRIC (Fig. 1b) and with specific qRTPCR on a series of TNBC and non-TNBC cell lines (Fig. 2h). Together, these results prompted us to investigate the functional contributions of LINC01119 isoform 3 (hereafter LINC01119) to TNBC development.
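The fold inductions reported in this section come from qRTPCR measurements; the text here does not state how relative expression was computed. As an illustration only, the sketch below uses the widely used 2^-ΔΔCt calculation, which is an assumption on our part and presumes roughly 100% primer efficiency. The function name and the Ct values are hypothetical.

```python
def fold_change_ddct(ct_target_test, ct_ref_test, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression of a target transcript by the 2^-ddCt calculation.

    Each Ct is first normalized to a reference (housekeeping) transcript in
    the same sample; the test sample is then compared with the control.
    """
    d_ct_test = ct_target_test - ct_ref_test
    d_ct_ctrl = ct_target_ctrl - ct_ref_ctrl
    return 2.0 ** -(d_ct_test - d_ct_ctrl)


# Hypothetical Ct values: the target crosses threshold 2 cycles earlier in
# the co-cultured sample, with an unchanged reference transcript.
fc = fold_change_ddct(24.0, 18.0, 26.0, 18.0)
print(fc)
```

Because each cycle corresponds to a doubling under this assumption, a 2-cycle shift corresponds to a 4-fold difference in relative expression.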

LINC01119 promotes pro-tumorigenic traits in TNBC cells
To probe the functions of LINC01119 in TNBC biology, we cloned LINC01119 NR_024452 (which corresponds to isoform 3) and tested the effects of its overexpression in multiple TNBC cell lines (Supplementary Fig. 5a). While LINC01119 did not enhance certain malignant traits of human cancer cells, such as resistance to suspension-induced cell death (anoikis; Supplementary Fig. 5b), it did promote significant increases in cell proliferation (Fig. 3a), promoted cell cycle progression in SUM159 cells (Fig. 3b), enhanced the clonogenic growth of both SUM159 and MDA-MB-231 cells in 2D (Fig. 3c), and triggered a 2-9-fold increase in the anchorage-independent growth of these cells in soft-agar assays (Fig. 3d). LINC01119 promoted similar growth phenotypes in mouse carcinoma cells too. Indeed, it caused a ~4-fold increase in 4T1 clonogenic growth in vitro (Fig. 3e), and approximately doubled the ability of 67NR and 4T07 murine mammary TNBC cells to grow anchorage-independently (Fig. 3f). Most importantly, LINC01119-overexpressing SUM159 cells formed larger orthotopic tumors in immunocompromised NCG mice (Fig. 3g), which exhibited a ~2-fold increase in their Ki67-positivity compared to control tumors (Fig. 3h). Interestingly, antisense oligonucleotide (ASO)-mediated inhibition of endogenous LINC01119 (Supplementary Fig. 6a) led to significant reductions in cellular growth in Hs578T, in MDA-MB-468, and in CAL51 TNBC cells too (Fig. 3i-k), altogether underscoring the critical pro-oncogenic abilities of LINC01119 both in vitro and in vivo.

LINC01119 is a non-coding RNA with predominant cytoplasmic localization
We proceeded to identify how LINC01119 exerted its functional activities in support of tumor cell growth, focusing first on determining its protein-coding potential. Indeed, previous studies indicated that certain lncRNAs, such as LINC00961 25 or HOXB-AS3 26, in fact, coded for small peptides with proven functional activities. We, therefore, set out to determine if the LINC01119 open reading frame (ORF) in effect produced a protein product, and did so using several approaches. First, we inserted a FLAG-tag at the 5-prime end of the LINC01119 sequence and expressed the construct by transfection into HEK293T cells; however, immunoblotting of cell lysates with FLAG antibodies detected no product for LINC01119 in these cells, in contrast to FLAG-tagged EZH2, which was used as a protein-coding positive control (Fig. 4a). Second, we followed up on these in-cell results by assessing LINC01119 protein-coding potential in stringently controlled experiments in vitro. Here, we inserted T7 promoter and Kozak sequences at the 5-prime end of LINC01119 (Fig. 4b) and conducted in vitro transcription followed by in vitro translation analyses with Biotin-labeled amino acids. While the LINC01119 sequence was adequately transcribed, streptavidin-based Western immunoblotting could not detect a protein product for LINC01119 when compared to translated Luciferase or Xef1 proteins, detected at 61 and 50 kDa, respectively (Fig. 4b).
Third, having found LINC01119 to be amongst a large set of identified ribosome-associated lncRNAs 27, and that it was predicted, albeit with very low probability, to code for small peptides derived from 4 putative coding segments (PCS; Fig. 4c), we proceeded to test this notion further. Since PCS-1, PCS-2, and PCS-3 share an identical 3-prime sequence, we cloned both PCS-1 and PCS-4 sequences in-frame into pFLAG-CMV-1 plasmid and expressed them in HEK293T cells. Here, no FLAG-tagged protein product was detected in the corresponding cell lysates (Fig. 4c). In addition, we specifically cloned PCS-1 downstream of the T7 promoter and Kozak sequences and checked its protein-coding potential in vitro (Fig. 4d). However, no protein product was detected subsequent to in vitro transcription and translation of PCS-1 (Fig. 4d), further confirming the non-coding nature of LINC01119. Finally, since lncRNA functions are linked to their cellular localization 28,29, we determined LINC01119 cellular distribution patterns via in situ hybridization with LINC01119-specific probes using RNAScope 30,31. These results showed a predominant cytoplasmic localization for both endogenous LINC01119 (in SUM149 cells, among the lines most enriched for LINC01119; Fig. 2h) and exogenous LINC01119 expressed in HCC1937, SUM159, and MDA-MB-231 cells (Fig. 4e, f). Collectively, these results demonstrated empirically that LINC01119 is a non-coding RNA that functions predominantly from the cytoplasm.

LINC01119 stimulates SOCS5 in TNBC cells
To determine more specifically how LINC01119 exerted its pro-tumorigenic activities in TNBC, we conducted large-scale gene expression analyses looking for genes whose expression most closely correlated with that of LINC01119 in the breast cancer cohorts found in TANRIC (Supplementary Fig. 7a). When the top 5 genes from this list were cross-compared to LINC01119 co-correlated genes across 23 TNBC cell lines profiled in CCLE, only one gene, suppressor of cytokine signaling 5 or SOCS5, positively correlated with LINC01119 at R > 0.45 and p < 0.05 across these 2 databases (Supplementary Fig. 7b). Indeed, SOCS5 associated preferentially with Basal and TNBC clinical specimens when compared to Luminal A, Luminal B, and HER2 subtypes from the GENT2 database (Fig. 5a), subsets in which LINC01119 was also more preponderant (Fig. 1), and we observed similar connections between LINC01119 and SOCS5 using our experimental systems. First, we found that MSCs, which stimulated LINC01119 in admixed cancer cells (Figs. 1 and 2), also caused ~5-fold and ~2-fold enrichment of SOCS5 expression in co-cultured MDA-MB-231 (Fig. 5b) and HCC1143 cells (Fig. 5c), respectively. In addition, LINC01119 was sufficient, on its own, to cause 2-3-fold induction of SOCS5 expression in HCC1937, SUM159, BT549, and MDA-MB-231 TNBC cells (Fig. 5d and Supplementary Fig. 8). Conversely, 50% downregulation of LINC01119 levels in the LINC01119-high SUM149 cells using ASOs (Supplementary Fig. 6a) caused >50% reduction in the endogenous basal SOCS5 levels (Fig. 5e). We also probed whether LINC01119 induction of SOCS5 involved transcriptional or posttranscriptional mechanisms. Here, we found that the rates of SOCS5 mRNA or protein decay in actinomycin D or cycloheximide-treated cells were identical between control and LINC01119-overexpressing cells (Supplementary Fig. 7c, d), indicating no discernible effect of LINC01119 on either mRNA or protein stability of SOCS5. These results collectively suggested that LINC01119 is both sufficient and necessary as an upstream regulator of de novo SOCS5 transcription.
[Fig. 3 legend, partial] ... representative images of 3D anchorage-independent growth of the indicated MDA-MB-231 and SUM159 cell lines. Right: ImageJ quantitation of colony numbers in Left displayed as mean ± SD of n > 3. e Left: representative images of colony-formation assays on the indicated mouse 4T1 groups. Right: ImageJ quantitation of colony numbers in Left displayed as mean ± SD of n = 3. f Left: representative images of 3D anchorage-independent growth of the indicated mouse 67NR and 4T07 cell lines. Right: ImageJ quantitation of colony numbers in Left displayed as mean ± SD of n > 3. g Weight of the indicated SUM159 tumors grown orthotopically in NCG mice after 42 days (n > 5 per group). h Left: representative images of immunohistochemistry of Ki67 in tumor tissues in g. Right: quantitation of Ki67 positive cells in Left displayed as box-and-whisker plots representing the median (centerline) and inter-quartile range (IQR; box). The whiskers extend up to 1.5 times the IQR from the box to the smallest and largest points. i, j Proliferation (mean ± SD of n = 3) of Hs578T (i) and MDA-MB-468 (j) cells transfected with controls (NC) or with LINC01119 ASOs measured using WST-1 assay after 48 h. k Proliferation (mean ± SD of n = 3) of CAL51 cells transfected with control or with a combination of LINC01119 ASOs measured using CellTiter after 48 h.
SOCS5 belongs to the family of suppressors of cytokine signaling proteins, which are broadly categorized as negative regulators of the Janus Activated Kinase (JAK) and Signal Transducer and Activator of Transcription (STAT) pathway and are thought to exert major roles in attenuating receptor-initiated signaling [32][33][34]. In particular, SOCS5 has been shown to inhibit JAK1/2 and STAT1/3 phosphorylation in different cellular contexts [35][36][37][38], so we probed the phosphorylation status of these proteins in SUM159 and MDA-MB-231 cells stably expressing LINC01119 in order to determine if SOCS5 downstream activities were also enhanced alongside SOCS5 mRNA and protein induction; however, we did not find any inhibition of phospho-JAK1, phospho-JAK2, phospho-STAT1, or phospho-STAT3 by LINC01119 (Supplementary Fig. 8a-d). We therefore expanded our investigation to include all other members of the JAK and STAT proteins. We observed unique and reproducible downregulation of phospho-STAT6 in both cell lines under serum-starved and non-serum-starved conditions (Supplementary Fig. 8a-d). These results were confirmed in cancer (and non-cancer) cells transiently overexpressing exogenous human SOCS5 (Supplementary Fig. 8e), echoing prior reports in which SOCS5 was described to inhibit STAT6 activation in dendritic cells 39 or in T helper (Th) subsets 40. Notably, and in genetic support of these observations, we found that STAT6 was most depleted in TNBC specimens in TCGA (Supplementary Fig. 8f), as well as in TNBC-rich p53 mutant samples (Supplementary Fig. 8g) and that SOCS5 and STAT6 exhibited negative correlation in BLBC and ER-negative clinical specimens (Supplementary Fig. 8h, i). Together, these observations indicated that LINC01119-induced SOCS5 is active in TNBC cells.

SOCS5 is a critical regulator of TNBC cell growth
We proceeded to determine the functional roles for SOCS5 in TNBC cell growth. We found that overexpression of human SOCS5 in SUM159 (Supplementary Fig. 8 and Supplementary Fig. 9a) induced a ~5-fold increase in anchorage-independent growth in soft-agar (Fig. 6a), and generated tumors that grew significantly faster and larger in Nude mice (Fig. 6b, c), and with ~2-fold as many Ki67-positive carcinoma cells (Fig. 6d), indicating that SOCS5, similar to LINC01119, was sufficient, on its own, to promote malignant growth both in vitro and in vivo. To test the essentiality of SOCS5, we screened shRNAs against SOCS5 and selected two hairpins that resulted in ~70% inhibition of its mRNA (Supplementary Fig. 9b). Expression of these two shRNAs in Hs578T TNBC cells resulted in >50% inhibition of the cells' ability to grow in soft-agar conditions (Supplementary Fig. 9c and Fig. 6e), suggesting that SOCS5 performed essential pro-tumorigenic functions in these contexts. Similar results were obtained in two additional TNBC models, SUM159 and CAL51, in which shSOCS5#4 caused ~50% inhibition of growth in soft-agar (Supplementary Fig. 9d, e and Fig. 6f, g). Most importantly, knockdown of SOCS5 by enhanced siRNA (esiRNA; Supplementary Fig. 9f) abrogated LINC01119-induced cell proliferation (Fig. 6h), in support of the notion that SOCS5 is an essential partner of LINC01119 that performs critical functions in TNBC cell growth.

The LINC01119-SOCS5 axis is prognostic of poor patient outcome
We next probed the clinical relevance of our findings across publicly accessible gene expression databases of breast cancer. We found that SOCS5 positively correlated with LINC01119 across multiple different clinical breast cancer datasets, which included GSE28844 (Fig. 7a), GSE16446 (Fig. 7b), GSE102484 (Fig. 7c), and GSE12276 (Fig. 7d). Similar findings were observed using the large TNBC cohort of Brown and colleagues (GSE76124 41; Fig. 7e) and in the BRCA1-mutant specimens of GSE27830 (Fig. 7f), tumors that have a high propensity to classify with triple-negative BLBCs. Furthermore, high expression levels of LINC01119 and SOCS5 associated with shorter relapse-free survival (RFS) in breast cancer patients in general (Fig. 7g) and poorer overall survival (OS) in those diagnosed with BLBC in particular (Fig. 7h). Altogether, these findings are consistent with a model in which the stroma-regulated LINC01119-SOCS5 axis assumes critical pro-malignant functions in TNBC development, and that it represents both a therapeutic target (Figs. 3 and 6) and a prognosticator of adverse patient outcome in disease management.
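The survival comparison described above (high versus low expression, relapse-free or overall survival) is typically drawn as Kaplan-Meier curves. The sketch below shows the estimator itself on toy data. It is an illustration under assumptions, not the analysis behind Fig. 7g, h: the text does not state how "high" expression was defined, so a median split on a single hypothetical expression score is used, no significance test is included, and all values are invented.

```python
import statistics


def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  : follow-up time per patient
    events : 1 if relapse/death was observed at that time, 0 if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        observed = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)
        if observed:
            survival *= 1.0 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored patients both leave the risk set
    return curve


def split_by_median(expression):
    """True for patients whose expression score is above the cohort median."""
    cutoff = statistics.median(expression)
    return [value > cutoff for value in expression]


# Hypothetical cohort: expression score, follow-up (months), event indicator.
score = [9.1, 8.7, 8.2, 4.0, 3.5, 5.1]
months = [5, 8, 12, 20, 24, 30]
event = [1, 1, 0, 1, 0, 0]

is_high = split_by_median(score)
high_curve = kaplan_meier(
    [m for m, h in zip(months, is_high) if h],
    [e for e, h in zip(event, is_high) if h],
)
low_curve = kaplan_meier(
    [m for m, h in zip(months, is_high) if not h],
    [e for e, h in zip(event, is_high) if not h],
)
print(high_curve, low_curve)
```

Censored patients contribute to the risk set until their last follow-up and then drop out without lowering the survival estimate, which is why the estimator is preferred over a simple fraction of surviving patients.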

DISCUSSION
In the present work, we utilized the MSC:BCC co-culture model to reveal functions for lncRNAs in regulating TNBC development. Thus, we found that MSC-derived triggers induced LINC01119 expression in neighboring TNBC cells, which in turn stimulated SOCS5, leading to accelerated in vitro cancer cell growth, both in adhesion and in suspension, as well as accentuated in vivo tumor formation in xenografted mice. In addition, LINC01119 and SOCS5 expression were significantly enriched in several TNBC patient cohorts and exhibited substantial and tight correlation with one another across multiple breast cancer gene sets. Importantly, LINC01119/SOCS5 repression severely impaired cancer cell growth, and the pathway served as a powerful prognostic indicator of adverse outcomes in TNBC patients. Collectively, our findings have delineated previously undescribed non-coding theranostic elements of potential translational utility in TNBC management.
To our knowledge, this is the first report that describes cellular activities of LINC01119 and that begins to elucidate its functional downstream effectors. Although no previous reports have tied LINC01119 to malignancy, it is important to point out that LINC01119 is located on chromosome 2p, an arm noted by several groups to undergo aberrations in breast cancer 42,43 and to exhibit elevated probability for harboring susceptibility loci for breast 44 and other neoplasms, such as endometrial 45 or renal 46 carcinomas. In addition, several genes that map to the vicinity of LINC01119 on 2p21 have been implicated in advancing tumorigenesis and metastasis in multiple malignancies, such as PKCε 47 or SOS1 48. These observations, together with reports that underscore the ability of lncRNAs to regulate neighboring genes in cis 49, suggest a potential role for LINC01119 in the etiology and development of cancers beyond TNBC. Notable amplifications of LINC01119 in a number of cancers across the TCGA, such as pancreatic, uterine, and lung cancers (Supplementary Fig. 10), would indeed be consistent with this hypothesis.
Prior work indicated that SOCS5 falls under the tight regulatory control of complex non-coding RNA networks that involve an expanding list of both microRNAs and lncRNAs. Indeed, several miRs have been described to directly suppress SOCS5 expression in a variety of cancer cells, such as miR-9 in prostate cancer 50, miR-885 in colorectal cancer 51, miR-301a in pancreatic cancer cells 33, or miR-18, miR-25, and miR-589 in liver cancer 52,53. In contrast, a handful of lncRNAs have been shown to induce SOCS5 expression in target cancer cells, and these include HAND2-AS1 in liver cancer 54,55, FER1L4 in osteosarcoma 56, TUSC7 in endometrial carcinoma 57, MEG3 in oral squamous cell carcinoma 58, and LINC00668 in glioma 59. Interestingly, a unifying mechanism-of-action of such lncRNAs in SOCS5 regulation appears to involve lncRNA sponging of SOCS5-specific microRNAs, such as miR-18a, miR-616, or miR-584d, thereby relieving otherwise suppressed SOCS5 levels. Similar high-stringency miRBase-based analyses that we conducted, however, only identified miR-3689d as a potential SOCS5-miR that can be sponged by LINC01119, but our preliminary findings indicated that miR-3689d was not induced in our MSC-stimulated cancer cells 16 and that it was not modulated in LINC01119-overexpressing cells at all (data not shown). These findings, in addition to observations that SOCS5 induction in LINC01119-overexpressing cells was not due to increases in its mRNA or protein stabilities (Supplementary Fig. 7c, d), suggested that LINC01119-stimulated increases in SOCS5 likely involve de novo transcriptional stimulation. Here, we posit that cytoplasmic LINC01119 (Fig. 4) sequesters a transcriptional regulator that otherwise suppresses SOCS5 transcription. Verification of this hypothesis awaits the detailed elucidation of the molecular interactions of LINC01119.
SOCS5 functioned as a partner of LINC01119, and we found that it was both sufficient and necessary to promote TNBC cell growth across several TNBC cells and that it promoted tumor growth in nude mice. Although a role for SOCS5 as a promoter of TNBC pathogenesis has not been previously described per se, our results were surprising considering several reports describing SOCS5 as a suppressor of cancer traits in the context of malignancies that included T cell lineage acute lymphoblastic leukemia (T-ALL) 34 , pancreatic cancer 60 , hepatocellular carcinoma (HCC) 52 , or prostate cancer 50 . How SOCS5 mechanistically exerted these suppressive activities has largely been attributed to its ability to mediate negative feedback loops that downregulated membrane tyrosine kinases, especially EGFR 61,62 . In these models, SOCS5 is thought to associate with autophosphorylated EGFR to then assemble a complex containing elongin B/C and E3 ubiquitin ligase that then targets EGFR for proteasomal degradation 63,64 . Although SOCS5 is postulated to act on other targets, such as Shc-1 or YAP1, in similar manners 36,65 , there is no evidence that it acts on other growth factor receptors, such as FGF or NGF. Whether SOCS5 engages in similar interactions in our systems is presently unknown.
In spite of these reports, however, the pro-oncogenic roles we describe for SOCS5 in TNBC cells are in agreement with prior (albeit fewer) reports indicating tumor-promoting activities for SOCS5. These include findings in which SOCS5 levels were shown to be induced in cancerous versus normal breast cells (including the TNBC cell line HCC1937 we used here) 66, and in which SOCS5 proved to be essential to the viability of other breast cancer cells, such as the HER2 cell line SKBR3 67. Pro-malignant roles for SOCS5 were also observed in hepatocellular carcinoma, where SOCS5 inhibition induced autophagy and compromised lung metastasis of HCC cells, and where high SOCS5 levels were prognostic of poor patient outcome 68. Why SOCS5 seemingly performs diametrically opposing functions in tumor pathogenesis across different cancer types, however, remains an outstanding question. Here, and in keeping with reported functions for SOCS proteins as suppressors of JAK/STAT pathways, we did find that SOCS5 (and LINC01119) inhibited the phosphoactivation of STAT6, but not the traditional targets STAT1 or STAT3. Of pertinence, STAT6 has been described to act as a tumor suppressor in breast cancer 69,70, providing a possible mechanism-of-action of LINC01119/SOCS5 in driving TNBC cell growth by curtailing STAT6 activation. In support of this notion are genetic data demonstrating STAT6 negative correlation with both LINC01119 and SOCS5 in patient cohorts, and the fact that STAT6 downregulation was itself associated with TNBC and forecasted poor TNBC patient prognosis (Supplementary Fig. 8f-g, j). Hence, we posit that LINC01119 induces SOCS5 expression, which then inhibits STAT6, relieving its tumor-suppressive functions in TNBC cells. The extent to which this pathway operates identically in additional breast cancer subtypes (or other cancers altogether) will necessitate the detailed determination of cellular SOCS5 targets, partners, and mechanism-of-action in a context-dependent fashion.
A final note is that the nomenclature of lncRNAs has followed the norm of naming lncRNAs according to the closest gene in their locus, and it is for this reason that LINC01119 is also known as lncSOCS5. For LINC01119, the nomenclature also carries functional connotation since we have shown that LINC01119 regulated SOCS5 expression and that SOCS5 served critical roles in LINC01119-regulated activities. The fact that SOCS5 and LINC01119 are also syntenic in humans suggests that they may have genetic interactions beyond the functional ones we describe here and that LINC01119 and SOCS5 act as an axis (or a duo of sorts) in regulating downstream functions.

METHODS
Breast cancer cell lines MDA-MB-231, MDA-MB-468, HCC1937, BT20, HCC1143, BT549, and Hs578T cells were procured from American Type Culture Collection (ATCC). HCC70, T47D, ZR75, SUM149, CAL51, and SUM159 were obtained

Gene expression analyses
Comparative gene expression profiling of MSC-stimulated cancer cells versus controls was performed as previously described 21 . Briefly, BM-MSCs were mixed with GFP-labeled BCCs (at 3:1 ratio of MSC:cancer cell) and cultured in DMEM-10% FBS for 72 h with GFP-BCCs cultured alone serving as controls, as we previously described 15 . Cultures were subsequently washed with PBS, trypsinized, centrifuged for 3 min at 1200 rpm, and resuspended in ice-cold PBS. Suspensions were then passed through 70 µm filters and sorted for GFP-positivity using FACS Aria II (BD) with similarly processed control GFP-BCCs cultured alone used to determine sorting gates based on cell size and GFP fluorescence intensity. Gates were set to exclude cell debris and potential aggregates, and to collect cancer cells with the strongest GFP expression to ensure avoidance of GFP-negative MSC contamination. For Affymetrix analyses, 1 µg total RNA was recovered from FACS-isolated MDA-MB-231 cells, processed through library preparation using GeneChip HT One-Cycle cDNA synthesis kit and GeneChip HT IVT Labeling kit (900687 and 900688; Affymetrix, Santa Clara, CA), and hybridized to HT-HG U133 2.0 Plus chip (900751; Affymetrix, Santa Clara, CA).
For clinical specimens, qRTPCR measurements were performed on total RNA purified with RNeasy (Qiagen, Hilden, Germany) and derived from laser-captured cancer cells macrodissected from specimen slides corresponding to different breast cancer subtypes.

Transcription and translation assays
DNA templates for in vitro transcription were generated by PCR of either full-length LINC01119 or of the potential coding sequence #1 of LINC01119 (232 bp) from the pLVX-LINC01119 plasmid template, amplified using T7-promoter-inserting and Kozak-inserting primer sequences and reverse primers annealing to the poly-A tail. In vitro translation was conducted with Transcend non-radioactive translation detection system (L1170, Promega, Madison, WI). For in-cell translation of full-length or potential LINC01119 ORFs, sequences were expressed from pFLAG-CMV-1 plasmid transfected into HEK293T cells. qRTPCR was used to ensure expression efficiency after 48 h, followed by Western blots at 72 h to detect protein products. For in-cell translation analyses of LINC01119 potential coding sequence (PCS), FLAG sequence was inserted 3' of LINC01119 PCSs using QuikChange kit (200555)

RNAscope
RNAscope-based in situ hybridization for LINC01119 was performed using ACD HybEZ™ II Oven with RNAscope 2.5 HD detection reagent-Brown according to manufacturer's protocols (Advanced Cell Diagnostics, Hayward, CA) with 18 ZZ probe pairs synthesized against sequence 29-957 of NCBI accession number NR_024452.1 for human-LINC01119 (#537331, ACD). Probes for human peptidylprolyl isomerase B (PPIB) and for bacterial dapB were used as positive and negative controls, respectively.

Proliferation, cell cycle, anchorage-independent growth, suspension, and colony-formation assays
For cell proliferation assays, cancer cells were seeded in quadruplicates in 12-well plates (25.0 × 10³ per well) and counted using the Trypan blue exclusion assay every day for 4 days. Alternatively, cell growth was monitored in cancer cells seeded into 96-well plates (5.0 × 10³ per well), and growth was estimated using WST-1 (Millipore Sigma, St. Louis, MO) or CellTiter 96 (G3580, Promega, Madison, WI) at the indicated days. For cell cycle analyses, 1 × 10⁶ cold-PBS-washed cells were suspended in 500 µl PI/Triton X-100 staining solution. Data acquisition was performed using flow cytometry and analyzed by ModFit LT (Macintosh). For anchorage-independent growth, cancer cells were mixed with equal volumes of 0.35% agar and seeded into pre-coated (0.625% agar) 6-well dishes at a density of 5.0 × 10³ cells per well. Colonies were stained with 0.002% Crystal Violet after 3 weeks and colony counts estimated using ImageJ software (NIH Image). For suspension assays, a total of 5.0 × 10³ cancer cells were suspended in 1.7 ml Eppendorf tubes containing 1.5 ml DMEM medium with 0.1% FBS under constant tumbling rotation. Live cells were counted using the Trypan blue exclusion assay every 24 h for the duration of the experiment. For low-density adherent colony-formation cultures, cancer cells (500 cells) were plated in 6 cm dishes, fed and maintained for 2 weeks, and the resulting growths were fixed with 100% methanol for 20 min. Colonies were stained with Crystal Violet and counted using ImageJ software (NIH Image).

(Dilution 1:1000). Blots were developed using chemiluminescence (BioRad, Hercules, CA). All blots derive from the same experiment and were processed in parallel.

Immunohistochemistry (IHC)
IHC was performed using standard techniques. Nuclei with any detectable Ki67 antibody (EPR3610; ab92742; Abcam; 1:200 dilution) staining above background levels (negative control without primary antibody) were scored as positive cells. Positive staining was scored blindly in at least five random fields per slide at ×200 magnification.

Clinical analyses
Association of LINC01119 with clinical breast cancer subtypes was derived from TANRIC database (ibl.mdanderson.org) on TCGA data (LINC01119 was queried by position chr2:47055003-47086145). LINC01119 expression levels in laser-captured tissues were estimated using qRTPCR on Qiagen-purified RNA extracted from cancer cells lifted off BLBC, HER2, luminal A, and luminal B tissue slides prepared at the Curie Institute and derived from human specimens collected in compliance with ethical regulations, informed patient consent, and approval of the Curie IRB. LINC01119 and SOCS5 expression levels in clinical breast cancer specimens were obtained from GENT2 database 71 . STAT6 expression levels in clinical breast cancer specimens were obtained from UALCAN (http://ualcan.path.uab.edu), which included subclass and TP53 mutation status. For data display, we chose box-and-whisker plots, in which the median is represented by the centerline. The whiskers extend up to 1.5 times the interquartile range from the box to the smallest and largest points. Correlation analyses between LINC01119 (probe: 230799_at) and SOCS5 (probe: 209648_x_at) were performed using breast cancer data sets in CCLE (GSE36133), GSE28844, GSE16446, GSE102484, GSE12276, GSE76124, and GSE27830, all derived from the GEO database. Correlation analyses between SOCS5 (probe: 209648_x_at) and STAT6 (probe: 201331_s_at) were performed using R2 data (https://hgserver1.amc.nl/cgi-bin/r2/main.cgi). For patient survival curves, median-centered log ratios of both LINC01119 and SOCS5 (or STAT6) were partitioned into high-expression and low-expression specimen groups, and analyses were conducted using the log-rank test and the proportional hazard model to compare KM survival curves.
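As an illustration of the survival comparison described above, the following minimal Python sketch partitions specimens into high-expression and low-expression groups and applies a two-group log-rank test. It is not the analysis code used in this study: the function names, the use of the cohort median as the cut-off, and the toy data are assumptions made for illustration only, and the proportional hazard model is not shown.

```python
import math
from statistics import median


def median_split(values):
    """Label each specimen 'high' if its expression exceeds the cohort median.

    Assumption: the median is the cut-off; the text does not state the threshold.
    """
    cutoff = median(values)
    return ["high" if v > cutoff else "low" for v in values]


def logrank(times, events, groups):
    """Two-group log-rank test; returns (chi_square, p_value).

    times  : follow-up time per patient
    events : 1 if the event was observed, 0 if censored
    groups : 'high' or 'low' label per patient
    """
    obs_minus_exp = 0.0
    variance = 0.0
    # Walk through each distinct time at which at least one event occurred.
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        at_risk = [(ti, e, g) for ti, e, g in zip(times, events, groups) if ti >= t]
        n = len(at_risk)
        n_high = sum(1 for _, _, g in at_risk if g == "high")
        d = sum(1 for ti, e, _ in at_risk if ti == t and e)
        d_high = sum(1 for ti, e, g in at_risk if ti == t and e and g == "high")
        # Observed minus expected events in the 'high' group at this time.
        obs_minus_exp += d_high - d * n_high / n
        if n > 1:
            frac = n_high / n
            variance += d * frac * (1 - frac) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical toy cohort: median-centered log ratios, follow-up, event flags.
expression = [1.2, 0.8, 0.5, -0.4, -0.9, -1.1]
follow_up = [1, 2, 3, 10, 11, 12]
died = [1, 1, 1, 1, 1, 1]
labels = median_split(expression)
chi2, p = logrank(follow_up, died, labels)
```

In practice a dedicated survival package would normally be used; the hand-written test is shown only to make the arithmetic behind the Kaplan-Meier curve comparison explicit.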

Statistical analyses
A two-sided unpaired Student's t-test was used to analyze the significance between means ± SD of at least three independent biological replicates. One-way ANOVA followed by Tukey post hoc test was performed when appropriate to analyze the differences between individual experiments in multiple comparisons. Bivariate Pearson correlation (SPSS: version 23) was used to test LINC01119 and SOCS5 associations. For all analyses, *, **, and *** indicated p < 0.05, p < 0.01, and p < 0.001, respectively.
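The quantities named above can be sketched in a few lines of Python. This is an illustrative sketch only, since the study reports using SPSS: the function names and the example values are hypothetical, the t statistic assumes equal variances (the classical Student's form), and p-values are not computed here because that requires the t distribution.

```python
import math
from statistics import mean, variance


def student_t(a, b):
    """Unpaired Student's t statistic with pooled variance; returns (t, df)."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    pooled = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / n_a + 1 / n_b))
    return t, df


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def significance_stars(p):
    """Map a p-value to the asterisk notation used in the figures."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""


# Hypothetical triplicate measurements for two conditions.
t_stat, dof = student_t([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
r = pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

The thresholds in `significance_stars` follow the p < 0.05, p < 0.01, and p < 0.001 convention stated in the text.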

Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.

DATA AVAILABILITY
The data generated and analyzed during this study are described in the following data record: https://doi.org/10.6084/m9.figshare.14377130 72 . The RNA sequencing data are openly available in the Gene Expression Omnibus via the following accession: GSE171121 73