The HASTER lncRNA promoter is a cis-acting transcriptional stabilizer of HNF1A

The biological purpose of long non-coding RNAs (lncRNAs) is poorly understood. Haploinsufficient mutations in HNF1 homeobox A (HNF1A), encoding a homeodomain transcription factor, cause diabetes mellitus. Here, we examine HASTER, the promoter of an lncRNA antisense to HNF1A. Using mouse and human models, we show that HASTER maintains cell-specific physiological HNF1A concentrations through positive and negative feedback loops. Pancreatic β cells from Haster mutant mice consequently showed variegated HNF1A silencing or overexpression, resulting in hyperglycaemia. HASTER-dependent negative feedback was essential to prevent HNF1A binding to inappropriate genomic regions. We demonstrate that the HASTER promoter DNA, rather than the lncRNA, modulates HNF1A promoter–enhancer interactions in cis and thereby regulates HNF1A transcription. Our studies expose a cis-regulatory element that is unlike classic enhancers or silencers: it stabilizes the transcription of its target gene and ensures the fidelity of a cell-specific transcription factor program. They also show that disruption of a mammalian lncRNA promoter can cause diabetes mellitus.

The transcription of genes is controlled by cis-acting promoter and enhancer sequences, many of which harbour disease variants. Mammalian genomes also contain >20,000 long non-coding RNAs (lncRNAs) 1,2. Although the function of most lncRNAs has not been explored, some lncRNAs are known to regulate gene transcription 3,4. A considerable number of lncRNAs are transcribed from evolutionarily conserved promoters located near genes encoding lineage-specific regulators 3,5-7, suggesting a cis-regulatory function. For some lncRNAs, knockdown experiments have revealed transcriptional effects on nearby genes 8-10, while genetic studies have demonstrated bona fide cis-regulatory functions of selected lncRNAs 3,11-16. There are nevertheless still major gaps in our understanding of the regulatory purpose of cis-acting lncRNAs and how they are fundamentally different from more established gene regulatory elements. Furthermore, the extent to which genetic disruption of cis-regulatory lncRNAs can lead to physiologically relevant phenotypes is unclear.
In this study, we examined HASTER, the promoter of an lncRNA at the HNF1 homeobox A (HNF1A) locus. Mutations in HNF1A, encoding a homeodomain transcription factor 17, cause maturity-onset diabetes of the young type 3, the most frequent form of monogenic diabetes mellitus 18, while rare and common variants predispose to type 2 diabetes 19,20. Studies of homozygous Hnf1a null mutant mice have shown that HNF1A is essential for differentiated cell programs in various organs, whereas human HNF1A haploinsufficiency causes diabetes due to selective abnormalities in pancreatic β cells, indicating that the gene dosage sensitivity of HNF1A is cell specific 18,21-26. We now show that HASTER is a cell-specific cis-acting transcriptional stabilizer of HNF1A and demonstrate that disruption of this function causes diabetes mellitus in mice.

Nature Cell Biology | Volume 24 | October 2022 | 1528-1540 | Article | https://doi.org/10.1038/s41556-022-00996-8

Results
Evolutionarily conserved co-expression of HNF1A and HASTER
HNF1A-AS1, or Hnf1a-os1 and Hnf1a-os2 in mice, is a putative non-coding transcript that is transcribed from intron 1 of HNF1A and runs in antisense configuration (Fig. 1a). In the present study, we focus on the regulatory function of the promoter of HNF1A antisense transcripts. We named this DNA region HASTER (HNF1A stabilizer). HNF1A antisense transcripts, which we refer to as HASTER RNAs, have previously been proposed to exert trans-regulation of proliferation in cell-based models 14,27-31, but so far the transcriptional cis-regulatory function of the lncRNA or its promoter has not been characterized with genetic tools.
We used cap analysis gene expression sequencing (CAGE-seq), RNA sequencing (RNA-seq) and 3′ rapid amplification of complementary DNA ends (RACE) to show that HASTER transcribes myriad transcript isoforms that originate from a major upstream transcriptional start site in human islets and an additional downstream start site in other tissues (Fig. 1a and Extended Data Fig. 1a). Both transcriptional start sites are located in evolutionarily conserved sequences that show active promoter chromatin (high H3K4me3 and low H3K4me1) in islets and liver (Fig. 1b and Extended Data Fig. 1a). HASTER is expressed exclusively in HNF1A-expressing tissues, including the liver, gut, pancreas and kidney, and has the same antisense configuration across species (Fig. 1b and Extended Data Fig. 1b,c). Subcellular fractionation of EndoC-βH3 human β cells showed that HASTER transcripts were associated with chromatin, and single-molecule fluorescence in situ hybridization showed that HASTER transcripts were exclusively present in one or two nuclear foci that co-localized with HNF1A nascent transcripts (Fig. 1c and Extended Data Fig. 2a-c). Therefore, HASTER transcribes an evolutionarily conserved nuclear lncRNA that is co-expressed with HNF1A across tissues.

HASTER is a negative regulator of HNF1A
To study HASTER function, we created a 320-base-pair (bp) deletion of the main HASTER promoter (P1) in human embryonic stem cells (hESCs) (Fig. 2a) and differentiated them into hepatocyte-like cells 32. In control cells, HASTER transcripts were already detected at maximal levels at the hepatoblast stage, while HNF1A messenger RNA (mRNA) increased gradually during maturation to hepatocytes (Fig. 2b). HASTER-deleted cells showed increased hepatocyte HNF1A mRNA (mean = 1.3- and 1.6-fold versus control cells for two independent deletions; P = 0.01 and P = 0.04, respectively; Student's t-test) (Fig. 2b). Thus, HASTER exerts negative regulation of HNF1A in an in vitro human liver cell model.
To examine this function in vivo, we generated mice with LoxP sites flanking a 1.8-kilobase (kb) region containing Haster transcriptional start sites ( Fig. 2c and Extended Data Fig. 3a,b) and used a liver Cre transgene 33 to breed liver-specific Haster homozygous deletions (Haster LKO ). Haster LKO mice were born at Mendelian rates and showed normal organ formation, weight and glucose homoeostasis (Extended Data Fig. 3c,d). Consistent with human mutant cells, Haster LKO mice showed increased liver Hnf1a mRNA (1.5 ± 0.3-fold) and protein (4.5 ± 0.6-fold) (Fig. 2d-f). Similar results were observed in germline Haster mutant mice (Extended Data Fig. 3e). Thus, HASTER negatively regulates HNF1A in mouse and human hepatic cells.

HNF1A is a positive regulator of HASTER
The observation that HASTER modulates HNF1A hinted at a feedback mechanism. To examine whether HNF1A in turn regulates HASTER, we studied HNF1A-deficient cells. HASTER was strongly downregulated in pancreatic islets and liver from homozygous Hnf1a null mutant mice and in HNF1A-deficient EndoC-βH3 human β cells (Fig. 2g-i). HASTER transcripts seemed highly sensitive to HNF1A levels because partial HNF1A knockdown caused markedly decreased HASTER and only marginal changes in other HNF1A-dependent genes such as HNF4A 34 (Extended Data Fig. 4a). Conversely, upregulation of Hnf1a mRNA by ~30-80% through CRISPR-Cas9 synergistic activation mediator (CRISPR-SAM) led to ~50-120% increased Haster RNA (Extended Data Fig. 4b). This effect was probably direct because the HASTER promoter has seven HNF1A recognition sequences that are bound by HNF1A in mouse liver and human EndoC-βH3 β cells (Fig. 2j). These results suggested that the HASTER promoter functions as a HNF1A-sensing platform that drives HASTER transcription in accordance with HNF1A concentrations. Taken together, our observations revealed a negative feedback loop in which HNF1A positively regulates HASTER while HASTER negatively regulates HNF1A (Fig. 2k).
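The stabilizing logic of this loop can be illustrated with a toy simulation. The sketch below is not a model from this study: the function name `simulate_hnf1a`, the Hill-type repression term and every parameter value are assumptions chosen only for illustration. It integrates a single equation in which HNF1A production is damped by a term that grows with HNF1A concentration (a stand-in for HASTER-dependent inhibition) and compares it with a mutant in which that term is absent. It covers only the negative feedback and ignores any activating role of HASTER.

```python
def simulate_hnf1a(feedback=True, basal=2.0, decay=1.0, k=1.0, hill=2.0,
                   x0=0.1, dt=0.01, steps=5000):
    """Euler integration of dx/dt = production(x) - decay * x.

    x is a dimensionless HNF1A concentration. With feedback, production is
    basal / (1 + (x / k) ** hill), a hypothetical stand-in for
    HASTER-dependent inhibition; without it, production is constant.
    All parameter values are illustrative assumptions.
    """
    x = x0
    for _ in range(steps):
        if feedback:
            production = basal / (1.0 + (x / k) ** hill)
        else:
            production = basal
        x += dt * (production - decay * x)
    return x


# Steady states with and without the (hypothetical) feedback term.
wild_type = simulate_hnf1a(feedback=True)
no_feedback = simulate_hnf1a(feedback=False)

# Doubling the basal drive as a crude perturbation of HNF1A production.
wild_type_2x = simulate_hnf1a(feedback=True, basal=4.0)
no_feedback_2x = simulate_hnf1a(feedback=False, basal=4.0)

print(f"with feedback:    {wild_type:.2f} -> {wild_type_2x:.2f}")
print(f"without feedback: {no_feedback:.2f} -> {no_feedback_2x:.2f}")
```

In this toy setting the concentration settles at a lower level when the feedback term is present, and it changes less when the production drive is doubled, which is the qualitative behaviour expected of a negative feedback loop.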

HASTER negative feedback controls HNF1A pioneer-like activity
To investigate the consequences of disrupting this feedback loop, we performed RNA-seq on liver from Haster LKO and control mice (Fig. 3a and Supplementary Table 1). Consistent with the increased HNF1A levels in Haster LKO liver, deregulated transcripts and functional annotations were negatively correlated with those of Hnf1a knockout liver 22 (Fig. 3b,c and Extended Data Fig. 3f). A subset of genes that were most strongly upregulated in Haster LKO liver were, however, specifically expressed in kidney or intestine, two other HNF1A-expressing organs (Fig. 3c and Extended Data Fig. 5a). Therefore, Haster mutations led to increased expression of HNF1A-dependent liver genes, but also activated ectopic transcription.
Next, we examined HNF1A genomic binding in Haster LKO liver. Overall, the HNF1A binding strength was increased in Haster LKO liver; 325 peaks showed increased HNF1A binding at a false discovery rate (FDR) of ≤0.05 (Fig. 3d). Remarkably, Haster LKO liver showed HNF1A neo-binding sites at 105 regions that were not bound by HNF1A in control livers (Fig. 3d-f).
HNF1A can bind in vitro to nucleosomal DNA 35 and has been used to activate repressed liver genes in fibroblasts and reprogram them into hepatocytes 36, two properties of pioneer transcription factors 37. Although pioneer transcription factors have the ability to bind inaccessible chromatin, they typically show stable binding to different genomic regions across tissues 22,38, suggesting that cell-specific parameters, such as perhaps cellular transcription factor concentrations, might influence their in vivo binding selectivity and the capacity to create accessible chromatin. In keeping with this notion, HNF1A neo-binding sites did not show accessible chromatin in normal liver (Fig. 3e,f), whereas they showed classical active chromatin modifications (H3K4me3 and H3K27ac) in Haster LKO liver (Fig. 3g and Extended Data Fig. 5b-f). Interestingly, HNF1A neo-binding sites contained canonical high-affinity HNF1 binding motifs, suggesting that many could be bona fide HNF1A targets in other HNF1A-expressing tissues (Fig. 3h). Thus, increased HNF1A in Haster LKO liver resulted in the creation of new binding sites, which led to the formation of new active chromatin regions.
Increased HNF1A binding at pre-existing active gene promoters in Haster LKO liver led to increased gene expression; around one-quarter of genes in this class showed greater than twofold higher expression in Haster LKO (Extended Data Fig. 5d). HNF1A neo-binding events in newly activated promoter regions led to ectopic activation of genes that are normally not expressed in liver, such as the kidney-enriched genes Ggt and Tinag (Fig. 3f and Extended Data Fig. 5d,e). Consistently, several HNF1A neo-binding sites did not show accessible chromatin in normal liver yet showed accessible chromatin in other HNF1A-expressing tissues such as kidney (Fig. 3c,f and Extended Data Fig. 5a,e). Some newly activated promoters did not overlap with any annotated mouse transcription start site, suggesting that increased HNF1A could also activate aberrant de novo promoters (Extended Data Fig. 5f,g).
In summary, genetic disruption of the HASTER feedback loop led to increased cellular HNF1A concentrations, which caused either super-activation of pre-existing HNF1A-bound promoters or the transformation of silent inaccessible chromatin into active promoters (Fig. 3i). This indicates that the HASTER feedback is crucial to control the pioneering-like activity of HNF1A, and to fine-tune the tissue specificity of HNF1A-dependent transcriptional programs.

Haster inactivation causes diabetes
HNF1A haploinsufficiency leads to pancreatic β cell dysfunction and diabetes 18. To examine Haster in pancreatic cells, we used a Pdx1-Cre transgene to excise Haster in all pancreatic epithelial lineages (Haster pKO mice). Haster pKO mice showed normal morphology and growth (Extended Data Fig. 6a), yet male mice displayed glucose intolerance with insulin deficiency by 8 weeks, as well as fasting hyperglycaemia (glycaemia = 137 ± 16 mg/dl in Haster pKO, 87 ± 5 mg/dl in Haster f/f littermates and 98 ± 4 mg/dl in Pdx1-Cre; t-test P < 0.05) (Fig. 4a,b and Extended Data Fig. 6b). Male mice with germline mutations (Haster −/−) were born at Mendelian rates and showed no overt manifestations, but also showed diabetes, glucose intolerance and hypoinsulinaemia (Fig. 4c-e and Extended Data Fig. 6c,d). Thus, inactivation of Haster in the germline or in the pancreas led to impaired insulin secretion and diabetes.

Haster knockout leads to HNF1A induction or silencing in islet cells
Haster pKO and Haster −/− pancreas showed increased HNF1A immunoreactivity in all acinar cells and in many endocrine cells (Fig. 4f). This confirmed that Haster also acts as a negative regulator of Hnf1a in the pancreas. However, numerous other islet endocrine cells from 8- to 12-week-old Haster pKO and Haster −/− mice were completely devoid of HNF1A immunoreactivity (Fig. 4f).
To further understand Haster-dependent regulation of pancreatic HNF1A expression, we analysed mice in which Haster was deleted at different stages. At embryonic stage E11.5, most Haster −/− multipotent pancreatic progenitors showed markedly heterogeneous HNF1A expression, with many cells showing low or no HNF1A expression, whereas HNF1A expression was uniform in surrounding primitive gut cells (Fig. 4g). At embryonic stage E15.5, β cells from Haster −/− and Haster pKO embryos also showed highly variable HNF1A levels, ranging from an apparent absence in many cells to marked overexpression in 1-5% of β cells (Extended Data Fig. 6e-h). This contrasted with highly uniform HNF1A staining in control embryonic β cells (Extended Data Fig. 6e,f). This dual phenotype became more evident if Haster pKO and Haster −/− mice were analysed postnatally, with more visible HNF1A-negative cells (62 and 80%, respectively) and more HNF1A-overexpressing cells (24 and 10%, respectively) (Fig. 4e,f). Inactivation of Haster after the formation of β cells, however, resulted in very few HNF1A-negative β cells and more frequent HNF1A overexpression (Extended Data Fig. 6i-k). Extended Data Fig. 6e summarizes the results from different models. Thus, Haster inactivation caused a unique variegated HNF1A expression phenotype in β cells, with co-existing silencing and overexpression. Therefore, Haster acts as a negative regulator of HNF1A in the pancreas, as in the liver, but also has a developmental cell-specific role to ensure HNF1A expression in early pancreatic progenitors and islet endocrine cells. Importantly, Haster is essential for β cell function and glucose homoeostasis.

Variegation of Haster-deficient islet cell transcriptomes
Next, we defined the transcriptional impact of HNF1A expression heterogeneity. We performed single-cell RNA-seq of islet cells from Haster pKO and control mice (Supplementary Table 2) and used graph-based clustering to separate major endocrine cell types (Extended Data  Tables 3 and 4). This β HNF1A low cluster was specific to Haster pKO islet cells, constituted 5-21% of β cells and was discernible with independent clustering methods (Extended Data Fig. 7d-f). β HNF1A low cells were less abundant than expected from immunostainings, possibly due to a known propensity of Hnf1a knockout cells to dissociate during islet isolation. Thus, Haster mutations caused either functional HNF1A deficiency in pancreatic β cells, which is known to cause diabetes, or overexpression of HNF1A-dependent genes. Haster, therefore, acts to ensure the stability of β cell HNF1A-regulated programs.

HASTER modulates HNF1A in human pancreatic progenitors
Next, we investigated whether HASTER also regulates HNF1A in human pancreatic cells. Analysis of published datasets showed that HASTER is activated during the early stages of hESC-derived pancreatic differentiation 39 (Fig. 4i). To test HASTER function in human pancreatic progenitors, we used the hESC clones carrying the 320-bp HASTER P1 deletion (Fig. 2a) and generated pancreatic progenitors 40. In contrast with the results after hepatic differentiation, which showed increased HNF1A mRNA, HASTER knockout pancreatic progenitors showed a 62% decrease of HNF1A mRNA and low, heterogeneous HNF1A protein levels (Fig. 4j,k). These results showed that HASTER also acts as an essential organ-specific positive regulator of HNF1A in human early pancreatic multipotent progenitor cells.

The HASTER promoter activates HNF1A in cis
Next, we explored how HASTER exerts positive and negative regulation of HNF1A, first focusing on the positive regulatory function. To assess whether HASTER acts in cis or trans, we bred compound heterozygous Hnf1a +/− ;Haster +/− mice. Single heterozygous Haster +/− or Hnf1a +/− mice do not develop hyperglycaemia 21 (in contrast with human HNF1A heterozygous mutations, which cause diabetes) (Fig. 5a). Remarkably, compound heterozygous Hnf1a +/− ;Haster +/− young mice developed severe fasting and fed hyperglycaemia with hypoinsulinaemia, but otherwise did not exhibit extra-pancreatic manifestations observed in homozygous Hnf1a-mutant mice 24,26 (Fig. 5a). This was accompanied by absent HNF1A expression in most β cells of 10-week-old Hnf1a +/− ;Haster +/− mice (Fig. 5b). Because the wild-type Haster allele was not able to activate the wild-type Hnf1a, which was located on the alternative chromosome, this shows that Haster positively regulates Hnf1a in cis in islet cells. We also created hybrid-strain mice with a heterozygous Haster null allele and found decreased islet Hnf1a mRNA from the chromosome carrying the Haster null allele (P < 0.02) (Fig. 5c). Genetic experiments thus showed that Haster acts in cis to maintain Hnf1a expression in islet β cells.
Next, we examined whether HASTER transcriptional elongation, its RNA products or the promoter DNA are required to prevent HNF1A silencing. To this end, we created an allele with a transcriptional termination signal downstream of Haster (Haster stop ; Fig. 5d). We bred this Haster stop allele in a hybrid-strain background and performed RNA-seq for strain-specific quantitation of Hnf1a mRNA in islets. As expected, we found severely diminished Haster transcripts from the Haster stop allele (93% reduction; Wilcoxon rank-sum; P = 0.02). However, we still detected abundant Hnf1a exon 1 transcripts from the stop allele (Fig. 5d).
Thus, whereas deletion of the Haster promoter DNA caused islet cell Hnf1a silencing in cis, this was not recapitulated by blocking Haster transcription. This indicates that the Haster promoter, but not transcriptional elongation or RNAs, is an essential positive cis-acting element of Hnf1a in pancreatic islets.
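The hybrid-strain reasoning can be sketched as a small calculation. The code below is an illustrative sketch only: the read counts are invented and the helper `allelic_fraction` is hypothetical, so it should not be read as the analysis pipeline of this study. It computes the fraction of strain-informative Hnf1a reads that come from the chromosome carrying the mutation. A value near 0.5 indicates balanced expression, whereas a shift that is confined to the mutant chromosome in heterozygous animals is what a cis-acting element predicts.

```python
def allelic_fraction(mutant_allele_reads, other_allele_reads):
    """Fraction of reads assigned to the chromosome carrying the mutation."""
    total = mutant_allele_reads + other_allele_reads
    if total == 0:
        raise ValueError("no allele-informative reads")
    return mutant_allele_reads / total


# Invented counts of strain-informative Hnf1a reads per chromosome.
# In this sketch the Haster mutation sits on the C57BL/6 chromosome.
control = {"C57BL/6": 500, "PWK/PhJ": 500}
haster_het = {"C57BL/6": 200, "PWK/PhJ": 520}

control_fraction = allelic_fraction(control["C57BL/6"], control["PWK/PhJ"])
mutant_fraction = allelic_fraction(haster_het["C57BL/6"], haster_het["PWK/PhJ"])

print(f"control C57BL/6 fraction:    {control_fraction:.2f}")
print(f"Haster +/- C57BL/6 fraction: {mutant_fraction:.2f}")
```

A trans-acting product would be expected to affect both chromosomes alike and leave the fraction near 0.5, so the allelic fraction separates the two scenarios even when total expression changes.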

HASTER inhibits HNF1A in cis
Next, we examined how HASTER exerts negative regulation of HNF1A. To assess whether this function also occurs in cis, we again examined Hnf1a +/− ;Haster +/− mice, but this time focused on liver, where Haster deficiency causes uniformly increased HNF1A expression. Compound heterozygotes showed increased HNF1A in hepatocytes, indicating that increased expression of the Hnf1a + allele from the chromosome carrying the Haster deletion could not be compensated in trans by the Hnf1a − ;Haster + allele (Fig. 5e). Interestingly, pancreatic acinar cells showed similar behaviour to hepatocytes in compound heterozygotes, with increased HNF1A expression (Fig. 5b). We also examined Haster +/− mice bred on a hybrid-strain background and found that liver Hnf1a mRNA was selectively increased in Haster mutant chromosomes (Fig. 5f). Both findings showed that Haster-dependent inhibition of HNF1A, like its activating function, occurs in cis.

The HASTER promoter, but not its RNA, is essential for HNF1A inhibition
Next, we examined the role of HASTER transcriptional elongation, RNA molecules or its promoter in this cis-inhibitory function. Hybrid-strain mice heterozygous for Haster stop showed that transcriptional blockage did not cause increased liver Hnf1a exon 1 transcripts in chromosomes carrying the stop allele (Fig. 5g). To further examine the role of the HASTER promoter versus transcripts, we generated clonal EndoC-βH3 cell lines with homozygous HASTER promoter deletions encompassing both transcriptional start sites (HASTER ΔP/ΔP ) or a 320-bp deletion of the P1 promoter (HASTER ΔP1/ΔP1 ) (Extended Data Fig. 9a,b). Both deletions caused increased HNF1A mRNA (Extended Data Fig. 9a,b), recapitulating the phenotype of mice in which Haster was excised after the formation of β cells (Extended Data Fig. 6k). To study the role of HASTER transcription, we targeted deactivated Cas9 to the HASTER transcriptional start site (CRISPR interference (CRISPRi) roadblock 41) or to a control intronic region located between HASTER and HNF1A promoters (Fig. 5h). As expected, targeting the HASTER promoter suppressed the formation of HASTER RNAs, although it did not influence HNF1A mRNA or HNF4A, an HNF1A-dependent transcript 34 (Fig. 5h). Similarly, degradation of HASTER nuclear transcripts using GapmeRs did not affect HNF1A or HNF4A mRNAs (Extended Data Fig. 9c). Conversely, CRISPR-dCas9-SAM activation of HASTER transcription in mouse or human β cell lines led to greater than fivefold levels of HASTER RNA without changing HNF1A or HNF4A mRNAs (Fig. 5i).

Haster +/− mice, n = 11 Hnf1a +/− mice and n = 13 Hnf1a +/− ;Haster +/− mice) and reduced insulin secretion (right; n = 5 mice per genotype) in Hnf1a +/− ;Haster +/− compound heterozygotes. The data are presented as means ± s.d. Statistical significance was determined by two-tailed Student's t-test. b, Immunofluorescence showing normal HNF1A in Hnf1a +/− islets and no expression in most islet cells from adult Hnf1a +/− ;Haster +/− mice (n = 1 per genotype). Solid arrowhead: HNF1A high acinar cell. Hollow arrowhead: HNF1A low β cell. Scale bar, 50 µm. c, Allele-specific Hnf1a mRNA in islets from hybrid-strain mice carrying the Haster mutation in the C57BL/6 chromosome. Hnf1a was quantified by strain-specific qPCR and normalized to Tbp (n = 4 mice per genotype). The data are presented as means ± s.d. Statistical significance was determined by two-tailed Student's t-test. d, Strain-specific RNA-seq analysis from Haster +/stop and Haster +/+ PWK/PhJ;C57BL/6 hybrid islets (n = 4 mice per genotype). RPM, reads per million reads. e, HNF1A overexpression in liver from Hnf1a +/− ;Haster +/− mice (n = 1 per genotype). Scale bar, 50 µm. f, Allele-specific Hnf1a mRNA in liver from Haster +/− hybrid-strain mice carrying the Haster mutation in the C57BL/6 chromosome. Hnf1a was quantified with strain-specific assays and normalized to Tbp (n = 4 mice per genotype). The data are presented as means ± s.d. Statistical significance was determined by two-tailed Student's t-test. g, Strain-specific RNA expression from Haster +/stop C57BL/6;PWK/PhJ hybrid mice, showing that reducing Haster elongation in liver failed to increase Hnf1a expression from the same C57BL/6 allele. The graphs show reads per million (RPM) (means ± s.d.). h, Targeting dCas9 to the HASTER transcriptional start site blocked HASTER transcription in EndoC-βH3 cells but did not affect HNF1A or HNF4A mRNAs (n = 3 lentiviral transductions). i, CRISPR-SAM HASTER activation in EndoC-βH3 cells did not affect HNF1A and HNF4A (n = 3 lentiviral transductions). In h and i, the data represent normalized expression levels (means ± s.d.) and statistical significance was determined by two-tailed Student's t-test.
The observation that HASTER transcriptional activation was not essential was unexpected because our genetic findings showed a tight correlation between HNF1A-dependent HASTER transcription and negative regulation of HNF1A. To reconcile these findings, we activated HASTER through lentiviral doxycycline-inducible overexpression of HNF1A (Fig. 6a). As in the CRISPR-dCas9-SAM experiments, this led to increased HASTER, but this time we observed a tenfold decrease of endogenous HNF1A mRNA (Fig. 6a). Importantly, the inhibitory effects of HNF1A overexpression were almost completely suppressed after deletion of the HASTER promoter region (Fig. 6b). Therefore, these studies showed that inhibition of HNF1A was triggered selectively by HNF1A interactions with HASTER promoter DNA, but not by various other manoeuvres that influenced HASTER transcription.

Uncoupling of HNF1A negative autoregulation and transactivation
To further establish whether HNF1A-dependent inhibition of its own promoter was dependent on its ability to activate HASTER transcription, we selectively modified the transactivation function of HNF1A. To this end, we examined the sequence of the transcriptional activation domain of HNF1A and identified an intrinsically disordered region (IDR); IDRs have been implicated in transcriptional activation through phase separation 42 . A selective deletion of this IDR led to decreased HNF1A-dependent HASTER transcription, but did not prevent inhibition of HNF1A (Fig. 6c,d). We also examined HNF1B, a paralogue with the same sequence recognition specificity. We found that while HNF1B is a weaker inhibitor of HNF1A than HNF1A itself, fusion of HNF1B to an unrelated IDR from the FUS protein increased HASTER activation, yet did not have a significant impact on HNF1B-dependent HNF1A inhibition (Fig. 6c,d). Therefore, the HASTER promoter is required for HNF1A-dependent transactivation of HASTER, as well as for HNF1A autoregulation, but these are two separable molecular mechanisms.

HASTER restrains HNF1A enhancer spatial interactions
Next, we examined whether HASTER function entails changes in the local histone modification landscape. Chromatin from control liver expectedly showed localized H3K4me3 enrichment surrounding Hnf1a and Haster promoters. In contrast, Haster LKO H3K4me3 showed spreading from the Hnf1a promoter to an intronic E enhancer region (Fig. 7a). H3K4me3 was therefore significantly increased in this E region, as well as in an upstream CTCF-bound (C) region (t-test; P < 0.05) (Fig. 7b). This spreading of H3K4me3 in Haster LKO suggested that Haster might insulate the Hnf1a promoter from the intronic E enhancer, while an increase in H3K4me3 at the E and C regions in Haster LKO suggested that Haster might influence the proximity of E and C regions with the H3K4me3-rich Hnf1a promoter. We therefore hypothesized that the HASTER promoter could inhibit HNF1A by modulating three-dimensional (3D) chromatin contacts of HNF1A with local regulatory elements.
To test this, we performed quantitative chromosome conformation capture using unique molecular identifiers (UMI-4C) 43 . Mouse Hnf1a and Haster promoters, as well as the intronic E enhancer region, are all located within ~7 kb. To increase the ability to capture 3D chromatin interactions with the Hnf1a 5′ region, we selected one viewpoint ~6 kb upstream of Hnf1a, near the CTCF-bound C site (viewpoint 1) and another at the Hnf1a promoter (viewpoint 2) (Fig. 7a). UMI-4C experiments from Haster LKO versus control liver (n = 6 per genotype) showed that the Haster deletion caused greater than twofold increased contacts between both Hnf1a upstream regions and the intronic E enhancer (V1; χ 2 test for pooled UMI-4C libraries; P = 0.02) (Fig. 7a,c and Extended Data Fig. 10a). Thus, the analysis of two viewpoints showed consistent changes in interactions between the Hnf1a upstream region and the intronic E enhancer in Haster LKO (Fig. 7d).
Likewise, we examined human EndoC-βH3 cells that had an intact or deleted HASTER promoter region and used the HNF1A promoter as a viewpoint for quantitative UMI-4C analysis. We found that HASTER deletions caused increased interactions between the HNF1A promoter and E regions (χ 2 test; P = 0.04; pooled UMI-4C libraries from four experiments). Next, we asked whether HNF1A binding to HASTER can modulate such interactions. HNF1A overexpression using the doxycycline-inducible system expectedly decreased endogenous HNF1A mRNA and significantly decreased interactions between the HNF1A promoter and the E region in HASTER +/+ cells (χ 2 test; P = 0.05) (Fig. 7e,f and Extended Data Fig. 10b-d). This effect required an intact HASTER promoter, as no significant HNF1A-dependent 3D contact differences were observed in HASTER mutants (χ 2 test; P = 0.78) (Fig. 7f and Extended Data Fig. 10b-d). Out of 33 enhancer-like regions in 1 megabase surrounding HNF1A, only E showed significant HNF1A-dependent changes (Extended Data Fig. 10b). Therefore, these results indicate that HNF1A overexpression limits 3D contacts between HNF1A and an intronic enhancer region, and this effect requires the HASTER promoter.
These findings imply that HASTER inhibition of HNF1A transcription involves modulation of interactions between HNF1A and the intronic E enhancer. Consistently, E deletions prevented increased HNF1A mRNA after deleting HASTER, but did not cause significant changes when HASTER was intact ( Fig. 7g and Extended Data Fig. 10e,f). Taken together, these experiments show that HASTER-dependent negative feedback of HNF1A occurs through a cis function of the HASTER promoter that does not require HASTER transcription. Instead, HNF1A binding to HASTER modifies the local 3D chromatin landscape and insulates HNF1A from cis-acting intronic regulatory elements (Fig. 7h).
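The χ² comparisons of pooled UMI-4C counts reported in this section can be illustrated with a 2 × 2 contingency calculation. This is a minimal sketch under stated assumptions: the UMI counts are invented, the function `chi2_2x2` is hypothetical, and it is not the implementation used by the umi4c software, whose normalization is not described in this section. It asks whether the share of viewpoint contacts that fall inside the E window differs between two conditions.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-square test (1 d.f., no continuity correction)
    for the table [[a, b], [c, d]]. Returns (statistic, p_value)."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("a row or column total is zero")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 d.f. the statistic is a squared standard normal, so the
    # upper-tail probability is erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Invented pooled UMI counts: contacts inside the E window vs elsewhere.
control_e, control_other = 40, 960
mutant_e, mutant_other = 80, 920

stat, p = chi2_2x2(mutant_e, mutant_other, control_e, control_other)
print(f"chi-square = {stat:.2f}, P = {p:.2g}")
```

Pooling libraries before testing, as in this sketch, treats every UMI as an independent observation; whether that assumption is appropriate depends on the experimental design.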

Discussion
These studies have uncovered a cis-regulatory element that senses HNF1A concentrations and feeds back on HNF1A to ensure appropriate cell-specific expression levels (Fig. 7h). This is achieved through a dual activating and inhibitory function that is fundamentally different from conventional cis-acting enhancers or silencers that provide spatiotemporal ON or OFF switches, respectively (Fig. 7i).
We show that HASTER's dual function emanates from a 320-bp promoter DNA sequence and does not require transcription. However, it remains possible that transcripts have additional effects that were not explored. HASTER's inhibitory function was triggered by high concentrations of HNF1A, which modified HNF1A promoter-enhancer interactions (Fig. 7h). The activating function of HASTER is reminiscent of an intronic enhancer, because it activates transcription in cis, and has a lineage-specific essential role in pancreatic endocrine cells, plausibly due to cis-regulatory redundancy in other cell types. This dual HASTER function was most compellingly illustrated by the pancreatic knockout phenotype, in which lack of Haster enhancer-like activity led to HNF1A silencing in some β cells, while lack of negative feedback caused overexpression in other β cells that succeeded in activating HNF1A.
HASTER-dependent feedback was critical to ensure that HNF1A selects appropriate binding sites in different cell types. Interestingly, a few lncRNAs have recently been shown to negatively regulate nearby genes through different mechanisms, including the heart transcription factor gene Hand2 (refs. 12,44), the c-MYC oncogene 12 or CHD2 (ref. 15). All such genes (HAND2, MYC and CHD2, as well as HNF1A) share in common that they are haploinsufficient and encode transcriptional regulators 15,18,45,46. Furthermore, c-MYC, HAND2 and HNF1A have been used in misexpression systems for lineage reprogramming, a feature of transcription factors that can act on repressed chromatin 36,47. These examples, and perhaps most clearly HASTER's dual function, suggest that the principal function of a group of cis-acting lncRNA units may be to stabilize dosage-sensitive genes that encode proteins that have a capacity to transform cell-specific chromatin landscapes.
Our studies exemplify a genetic defect in a mammalian lncRNA promoter that causes an in vivo physiological phenotype. Remarkably, the main manifestation of homozygous germline Haster mutations was β cell dysfunction and diabetes. HNF1A heterozygous mutations also cause selective β cell dysfunction and only subclinical alterations in other cell types 18, but homozygous Hnf1a mutations cause severe liver and renal dysfunction, growth retardation, diabetes and embryonic lethality 21,24. The discovery of a transcriptional stabilizer of HNF1A that has a selective function in β cells therefore provides a lead to dissect cell-specific genetic mechanisms underlying HNF1A haploinsufficient diabetes. It is also relevant for efforts to modulate HNF1A function in β cells.
Finally, this finding has general implications for our understanding of non-coding genome defects in disease. Unlike transcriptional enhancers, which often form clusters that provide robustness to genetic disruption 48,49, our findings indicate that the 320-bp HASTER promoter region lacks functional cis-regulatory redundancy. This warrants examining lncRNA promoter sequence variation in human genomes.
Fig. 7 | HASTER remodels enhancer-HNF1A interactions. a, Haster LKO liver shows increased contacts between Hnf1a upstream viewpoints and the intronic E enhancer. UMI-4C contact trends with binomial standard deviation for the V1 and V2 viewpoints are shown (n = 6 for the wild type and n = 3 for mutant livers). Triangles denote viewpoints (DpnII fragment ± 1 kb) and asterisks mark E. The bottom panel shows liver H3K4me3. The brown shading shows the region deleted in Haster LKO. b, UMI normalized counts at E showed increased contacts with upstream regions (V1 and V2) in Haster LKO liver. Statistical significance was determined by χ² tests for n = 6 wild-type and mutant livers (V1) and n = 3 wild-type and mutant livers (V2). c, Haster LKO cells have increased H3K4me3 in C and E (n = 3 biological replicates). The data are presented as means ± s.d. Statistical significance was determined by two-tailed t-test. d, Schematic depicting increased Hnf1a promoter-E interactions in Haster LKO liver. e,f, Doxycycline-induced HNF1A overexpression in HASTER-deleted EndoC-βH3 cells (n = 4) showing (e) normalized HNF1A mRNA levels and (f) HNF1A promoter viewpoint (triangle) UMI-4C contacts. The green shading shows a 5-kb region centred on E that was used to quantify HNF1A promoter interactions. Normalized UMI counts and χ² test P values calculated with umi4c are shown on the right. g, E deletions prevent HNF1A increases in HASTER-deleted cells. HASTER +/+ or HASTER ΔP1/ΔP1 clones were used to create polyclonal cells containing a mix of homozygous and heterozygous E deletions (ΔE) or wild-type sgGFP controls (WT). HASTER and HNF1A RNAs are shown as the fold change relative to parental HASTER +/+ or HASTER ΔP1/ΔP1 cells. ΔE significantly reduced HASTER but not HNF1A in HASTER +/+ cells, yet it reduced HNF1A in HASTER ΔP1/ΔP1 cells. Identical results were observed with a different clone, whereas C mutations had no effect (Extended Data Fig. 10f) (pool of n = 3 independent experiments with three pairs of sgRNAs for each deletion). In e and g, the data are presented as TBP-normalized relative expression (means ± s.d.) and statistical significance was determined by two-tailed t-test. h, HASTER exerts negative and positive feedbacks. At low HNF1A concentrations, HNF1A promoter-E interactions and transcription are unhindered, whereas at high HNF1A concentrations, HNF1A binding to HASTER limits HNF1A-E contacts, thereby decreasing HNF1A transcription. HASTER also acts as an essential enhancer in pancreatic lineages. i, HASTER is distinct from classic enhancers or silencers and is instead a cis-acting stabilizer that prevents overexpression and silencing.
https://doi.org/10.1038/s41556-022-00996-8
Lines with LoxP alleles without Cre, Cre lines without LoxP alleles and wild-type littermates served as controls, as indicated. Experimental cohorts were maintained on a 12 h light/12 h dark cycle with free access to water and standard mouse chow. Before decapitation, mice were anaesthetized using isoflurane (Zoetis).

Glucose tolerance
Animals were fasted overnight and received intraperitoneal glucose injections (2 g kg⁻¹) or were re-fed before blood glucose was collected at the indicated time points. Glucose was measured with a GlucoMen Aero 2K meter (Menarini Diagnostics). Plasma insulin was quantified with the Ultra Sensitive Mouse Insulin ELISA kit (Crystal Chem) using an Infinite M Plex (Tecan) plate reader. Standard curves were fitted using quadratic polynomial regression. Assays were performed in duplicate using 5 µl plasma from mouse tail, and mean values are reported.
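As an illustration of the standard-curve step, the following minimal sketch fits a quadratic polynomial to ELISA standards and reports the mean of duplicate wells. It is not the authors' code: the function names are hypothetical, and the choice to model concentration directly as a function of absorbance (rather than fitting absorbance against concentration and inverting) is an assumption.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = c0 + c1*x + c2*x**2 via the normal equations."""
    # Power sums give the 3x3 normal-equation matrix and right-hand side.
    s = [sum(x ** k for x in xs) for k in range(5)]
    b = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    a = [[s[i + j] for j in range(3)] + [b[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(3):
            if row != col:
                factor = a[row][col] / a[col][col]
                a[row] = [v - factor * p for v, p in zip(a[row], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


def interpolate(coeffs, absorbance):
    """Evaluate the fitted curve at one absorbance reading."""
    c0, c1, c2 = coeffs
    return c0 + c1 * absorbance + c2 * absorbance ** 2


def sample_concentration(coeffs, duplicate_absorbances):
    """Mean concentration across duplicate wells of one sample."""
    values = [interpolate(coeffs, a) for a in duplicate_absorbances]
    return sum(values) / len(values)
```

At least three distinct standards are needed for the fit to be defined, and readings outside the range of the standards should not be extrapolated.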

Islet isolation
Islet isolation was performed as described 54. Briefly, ice-cold collagenase P solution (1 mg ml⁻¹ in Hanks' balanced salt solution (HBSS) buffer; Roche) was injected through the main duct. The inflated pancreas was dissected, incubated at 37 °C for 8 min with agitation, disaggregated by gentle suction through a needle, washed four times with cold HBSS with 0.5% bovine serum albumin (BSA) and resuspended in 7 ml 7:3 pre-cooled Histopaque 1077:Histopaque 1119 (Merck), then 7 ml HBSS with 0.5% BSA was layered on top. The gradient was centrifuged at 950g for 20 min at room temperature. The interphase containing islets was collected, washed three times with HBSS with 0.5% BSA and the islets were further enriched by aspiration under a stereomicroscope. Islets were cultured for 2 d in 11 mM glucose RPMI with 10% foetal calf serum and penicillin-streptomycin (1:100; Invitrogen) at 37 °C and under 5% CO₂.
sgRNAs (20 nucleotides) for CRISPRi roadblock were designed within 100 bp downstream of the islet CAGE transcriptional start site using Cas-Designer (http://www.rgenome.net/cas-designer/) and cloned as described 55. Briefly, oligonucleotides (Thermo Fisher Scientific) containing sgRNAs flanked by compatible overhangs were phosphorylated with T4 polynucleotide kinase (NEB) and annealed. Oligonucleotide duplexes were ligated into BbsI- or BsmBI-digested destination vectors. Ligated constructs were transformed into Stbl3 chemically competent Escherichia coli and clones were sequenced. For deletions, sgRNA pairs were cloned as described 56,57. Briefly, a fragment containing the scaffold of sgRNA1 and the H1 promoter of sgRNA2 was amplified from the pScaffold-H1 donor (118152; Addgene) with primers containing the protospacers of sgRNA1 and sgRNA2 and BbsI restriction sites. The PCR fragment was digested with BbsI and ligated into the destination vector.

Reverse transcription quantitative PCR
RNA was prepared using an RNeasy Mini Kit (Qiagen) and DNase I (Qiagen) and reverse transcribed with SuperScript III (Thermo Fisher Scientific) and random hexamers (Thermo Fisher Scientific). Quantitative PCR was performed with Universal Probe Library assays (Roche). Reactions were carried out in duplicate in a QuantStudio 12K Flex (Applied Biosystems) with 1× TaqMan Fast Advanced Master Mix (Thermo Fisher Scientific), 1 µM forward and reverse primers and 250 nM Universal Probe Library probe, or 1× TaqMan assay. Quantification was performed using standard curves, with duplicate means reported, normalized by TBP or RPLP0, as indicated. Oligonucleotides are listed in Supplementary Table 5.

Single-molecule fluorescence in situ hybridization
Single-molecule fluorescence in situ hybridization was performed as described 62. A set of 48 probes was used (Supplementary Table 8; LGC Biosearch Technologies). EndoC-βH3 cells were grown on coated (2 µg ml⁻¹ fibronectin and 1% extracellular matrix; Merck) coverslips. Cells were fixed in 4% formaldehyde for 2 min, washed with 1× PBS and permeabilized with 70% ethanol at 4 °C for >1 h. Probes were hybridized overnight at 37 °C in the dark with 10% formamide, 100 mg ml⁻¹ dextran sulfate, 2× SSC and 12.5 µM probes. The following day, cells were washed for 30 min at 37 °C with 10% formamide and 2× SSC, followed by 30 min with 5 ng ml⁻¹ 4′,6-diamidino-2-phenylindole (DAPI). Coverslips were mounted using VECTASHIELD HardSet mounting media. Acquisitions were performed on a Zeiss Axio Observer inverted widefield microscope with light-emitting diode illumination. Z-stack acquisitions were taken with a 63× objective every 0.5 µm from a total depth of 40 µm and deconvoluted (Huygens Software), and maximal projections of whole stacks were used for counting (8-12 fields per sample).

Immunofluorescence
Embryos and adult tissues were processed for immunofluorescence as described 63 . Briefly, tissues were fixed in 4% paraformaldehyde overnight at 4 °C, then washed in PBS before paraffin embedding. Deparaffinized sections (4 µm) were incubated for 30 min in antibody diluent (Dako) with 3% normal serum from the same species as the secondary antibody, incubated overnight at 4 °C with primary antibody and then overnight at 4 °C with secondary antibody, then

3′ RACE
3′ RACE was performed as described 65. Human islet RNA (240 ng) was reverse transcribed with Q T primers using SuperScript III. Nested PCRs were performed with Q5 polymerase (NEB). The first PCR used one-twentieth of the complementary DNA with gene-specific forward primer 1 and a Q O reverse primer, while the second PCR used 1 µl of a 1:5 dilution of the first PCR with gene-specific forward primer 2 and a Q I reverse primer. The resulting fragments were cloned and Sanger sequenced. Oligonucleotides are provided in Supplementary Table 5.

Chromatin immunoprecipitation
Liver was collected after perfusion of ice-cold PBS and minced with a razor blade. Minced liver (100 mg) or 100-500 mouse islets were incubated with 1% formaldehyde (Agar Scientific) for 10 min at room temperature, then one-tenth of 1.25 M glycine was added for 5 min at room temperature, pelleted at 800g and 4 °C for 3 min and washed twice with PBS. Aliquots containing 20 mg initial liver or all processed islets were snap-frozen and stored at −80 °C until use. Crosslinked samples were lysed using ice-cold 2% Triton X-100, 1% sodium dodecyl sulfate (SDS), 100 mM NaCl, 10 mM Tris-HCl pH 8, 1 mM EDTA pH 8 and 1× protease inhibitor cocktail for 15-20 min on ice. Chromatin was sonicated with a Covaris S220 Focused-ultrasonicator (2% duty factor; 105 W peak incident power; 200 cycles per burst; 16 min). Sheared chromatin was centrifuged at full speed for 10 min at 4 °C to remove debris and insoluble chromatin and the supernatant was transferred to a fresh low-binding tube. For liver, the chromatin equivalent of 5 µg DNA was used for one-histone-mark chromatin immunoprecipitation (ChIP) and 10 µg was used for transcription factor ChIP. Chromatin was diluted four times with ChIP Dilution Buffer (0.75% Triton X-100, 0.1% sodium deoxycholate, 140 mM NaCl, 50 mM HEPES pH 8, 1 mM EDTA and 1× protease inhibitor cocktail) and 5% was used as input. Dynabeads Protein G (30 µl; Thermo Fisher Scientific) were blocked with BSA overnight at 4 °C. HNF1A antibody (10 µl; D7Z2Q; Cell Signaling Technology), 2 µg H3K27ac antibody (ab4729; Abcam) and 2 µg H3K4me3 antibody (15-10C-E4; Merck) or 2 µg H3K4me1 antibody (ab8895; Abcam) were added to 500 µl samples and incubated overnight with rotation at 4 °C. Magnetic beads (30 µl) were added to the samples and rotated at 4 °C for 2 h.
For ChIP-quantitative PCR (ChIP-qPCR), antibody-incubated samples were washed with low-salt wash buffer (1% Triton X-100, 0.1% SDS, 150 mM NaCl, 20 mM Tris-HCl pH 8 and 2 mM EDTA pH 8), high-salt wash buffer (1% Triton X-100, 0.1% SDS, 500 mM NaCl, 20 mM Tris-HCl pH 8 and 2 mM EDTA pH 8), LiCl wash buffer (0.25 M LiCl, 1% IGEPAL, 1% sodium deoxycholate, 10 mM Tris-HCl pH 8 and 1 mM EDTA pH 8) and three times with TE buffer. Elution was performed with 200 µl 1% SDS and 0.1 M NaHCO₃ for 30 min at room temperature. Samples were placed on a magnet and the supernatant was transferred to a new tube. RNase A (1 µl; Thermo Fisher Scientific) was added to the eluate and incubated for 30 min at 37 °C. Reverse crosslinking was performed by adding 8 µl 5 M NaCl and 3 µl proteinase K (Thermo Fisher Scientific) and incubating for 1 h at 55 °C and 1,200 r.p.m., then overnight at 65 °C and 1,200 r.p.m. DNA was purified using a MinElute PCR Purification Kit (Qiagen). Quantitative PCR was carried out in duplicate as described for reverse transcription qPCR. Allele-specific qPCR was performed using Custom TaqMan SNP Genotyping Assays. Enrichment was subsequently normalized by the input.
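Input normalization can be sketched as follows. This is an illustrative calculation rather than the authors' script: it assumes that immunoprecipitated and input DNA are both quantified against the same standard curve, that the input aliquot is 5% of the chromatin used per immunoprecipitation, and that enrichment is expressed as a percentage of input. The function name is hypothetical.

```python
def percent_input(ip_quantity, input_quantity, input_fraction=0.05):
    """Express ChIP-qPCR signal as a percentage of the starting chromatin.

    ip_quantity and input_quantity are standard-curve quantities in the
    same arbitrary units; input_fraction is the share of chromatin that
    was set aside as input (assumed 5% here).
    """
    if input_quantity <= 0:
        raise ValueError("input_quantity must be positive")
    if not 0 < input_fraction <= 1:
        raise ValueError("input_fraction must be in (0, 1]")
    # Scale the input aliquot up to the full amount of chromatin per IP.
    total_input = input_quantity / input_fraction
    return 100.0 * ip_quantity / total_input
```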
For ChIPmentation, washes and tagmentation were performed as reported 66. Antibody-incubated samples were washed twice with RIPA-LS (10 mM Tris-HCl pH 8, 140 mM NaCl, 1 mM EDTA pH 8, 0.1% SDS, 0.1% sodium deoxycholate and 1% Triton X-100), twice with RIPA-HS (10 mM Tris-HCl pH 8, 500 mM NaCl, 1 mM EDTA pH 8, 0.1% SDS, 0.1% sodium deoxycholate and 1% Triton X-100), twice with RIPA-LiCl (10 mM Tris-HCl pH 8, 250 mM LiCl, 1 mM EDTA pH 8, 0.5% IGEPAL and 0.5% sodium deoxycholate) and once with 10 mM Tris-HCl pH 8. Beads were resuspended in 20 µl tagmentation solution (10 mM Tris-HCl pH 8, 5 mM MgCl₂ and 10% vol/vol dimethylformamide) containing 1 µl Tn5 (Illumina) and incubated at 37 °C for 10 min. The reaction was stopped with 1 ml ice-cold RIPA-LS for 5 min on ice. Beads were washed twice with RIPA-LS and twice with TE buffer and resuspended in elution buffer (10 mM Tris-HCl pH 8, 5 mM EDTA pH 8, 300 mM NaCl and 0.4% SDS). Proteinase K was added to the elution and incubated for 1 h at 55 °C and 1,200 r.p.m., then overnight at 65 °C and 1,200 r.p.m. DNA was purified using a MinElute PCR Purification Kit (Qiagen). To estimate the number of cycles required for library amplification, 2 µl of the elution was used for SYBR Green qPCR, using KAPA HiFi polymerase (Kapa Biosystems). The resulting Ct value plus 1 cycle was used as the number of cycles to amplify the library. Libraries were amplified from 20 µl of elution with KAPA HiFi polymerase and Nextera custom primers (Supplementary Table 5). DNA clean-up was performed with 1.8× volume and size selection with a 0.65× volume of AMPure XP beads (Beckman Coulter). Libraries were sequenced on a HiSeq 2500 using 1 × 50 bp reads.

ChIP sequencing
ChIP sequencing (ChIP-seq) reads were aligned with Bowtie 2 (version 2.3.5) to the GRCm38 genome and sorted using SAMtools (version 1.7). Alignment statistics are listed in Supplementary Table 2. Multi-mapped reads were discarded. Reads mapping to ENCODE blacklisted regions were removed using BEDTools (version 2.27.1) and duplicate reads were removed with Picard (version 2.6.0). Peak calling was performed using MACS2 (version 2.1.1) with an FDR (q value) threshold of 0.05. The --broad flag was used for histone modifications. The MACS2 bdgcmp function was used to generate the local Poisson test −log₁₀[P values]. P value bedGraphs were converted to bigWig using bedGraphToBigWig. Differential binding analysis was performed using DiffBind (version 2.8.0) on peaks called in at least two samples from any genotype, using normalized read coverage from triplicates. Binding differences were determined at q ≤ 0.05. HNF1A neo-binding sites were defined as peaks observed in at least two Haster knockout samples (q ≤ 0.05), without significant peaks in any control sample and with average log₂-normalized ChIP read counts of ≤2 in control samples. Activated promoters were similarly defined as H3K4me3 peaks (q ≤ 0.05) in two Haster knockout and no control samples, with log₂-normalized counts of <2 in controls and significant differential H3K4me3 enrichment (q ≤ 0.05) in Haster knockout versus controls. Coverage was calculated using deepTools (version 3.0.2) computeMatrix and the average of the three replicates was calculated for each bin. Peak intersections were performed with pybedtools (version 0.8.0).
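The neo-binding site criteria can be expressed as a simple per-region filter. The sketch below is illustrative only: the function name and the input representation (per-sample booleans for a significant peak call, plus per-control log₂-normalized counts) are assumptions, and the peak calls and normalized counts would come from MACS2 and DiffBind as described.

```python
from statistics import mean


def is_neo_binding_site(ko_peak_calls, control_peak_calls, control_log2_counts,
                        min_ko_samples=2, max_control_log2=2.0):
    """Apply the three neo-binding criteria to one consensus region.

    ko_peak_calls / control_peak_calls: one boolean per sample, True when a
    significant peak (q <= 0.05) overlaps the region in that sample.
    control_log2_counts: log2-normalized ChIP read counts in control samples.
    """
    enough_ko_peaks = sum(ko_peak_calls) >= min_ko_samples
    no_control_peak = not any(control_peak_calls)
    low_control_signal = mean(control_log2_counts) <= max_control_log2
    return enough_ko_peaks and no_control_peak and low_control_signal
```

The activated-promoter definition follows the same pattern, with a strict threshold (<2) on control counts and an additional differential-enrichment requirement.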

Motif analysis
Analysis of known and de novo transcription factor binding site motifs was performed using HOMER (version 3.12). Analyses were performed on the merge between overlapping consensus peaks defined by DiffBind, using a minimum overlap of 1 bp. Enrichment analysis of de novo transcription factor motifs was also performed with the findMotifsGenome.pl command on consensus peaks defined by DiffBind, for lengths of 8, 10 and 12 bp on the masked mm10 genome.

Assay for transposase-accessible chromatin with high-throughput sequencing
Reads were trimmed to remove adaptors using Trim Galore and aligned with Bowtie 2 (version 2.3.5) to the GRCm38 genome. Multi-mapped and duplicated reads were removed using Picard (version 2.6.0). Mitochondrial and ENCODE blacklisted region reads were discarded. For visualization, MACS2 bdgcmp was used to generate the local Poisson test −log₁₀[P value] bedGraphs. BedGraphs were converted to bigWig using bedGraphToBigWig. The coverage was calculated using deepTools (version 3.0.2) computeMatrix for 1-kb windows with 10-bp bins.

RNA-seq
RNA from islets or liver was quantified with Qubit (Thermo Fisher Scientific) and verified with Bioanalyzer (Agilent). Libraries were prepared with a TruSeq Stranded mRNA Library Kit and sequenced on a HiSeq 4000 (2 × 75 bp reads). Reads were aligned to the GRCm38 genome with STAR (version 2.3.0). Transcript-level quantification was performed with Salmon (version 0.11) using GENCODE GRCm38 VM18 annotations (Supplementary Table 2). Gene-level normalization and differential expression were performed using the Bioconductor R (version 3.6.1) package DESeq2 (version 1.24.0), using adjusted P ≤ 0.05 as a cut-off for differentially expressed genes. Fold changes were adjusted with lfcShrink using the apeglm option 67.
For differential expression, de novo transcripts from Haster LKO and control liver were assembled from RNA-seq using StringTie (version 2.0). Transcripts from Haster LKO and control replicates were merged in a single GTF file using gffcompare (version 0.10.1). Transcript quantification and differential transcript expression were performed using Salmon and DESeq2 as described above, using the merged Haster LKO and control liver transcriptome as a reference. Transcripts with low abundance (mean normalized transcripts per million < 3) were discarded. To define transcripts with an HNF1A-bound promoter, a minimum overlap of 1 bp between the transcription start site and an HNF1A peak was required.
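The abundance filter and the promoter-binding rule can be sketched as below. This is a minimal illustration with hypothetical function names; it assumes 0-based, half-open (BED-style) peak coordinates and treats the transcription start site as a single base, so a 1-bp overlap means the start site lies inside a peak.

```python
from statistics import mean


def passes_abundance_filter(normalized_tpms, min_mean_tpm=3.0):
    """Keep a transcript unless its mean normalized TPM is below the cut-off."""
    return mean(normalized_tpms) >= min_mean_tpm


def has_bound_promoter(chrom, tss, peaks_by_chrom):
    """True when the TSS base overlaps any peak on the same chromosome.

    peaks_by_chrom maps a chromosome name to a list of (start, end) tuples
    in 0-based, half-open coordinates (an assumption of this sketch).
    """
    return any(start <= tss < end for start, end in peaks_by_chrom.get(chrom, []))
```

For genome-scale data, an interval index or a tool such as BEDTools would replace the linear scan used here.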

Gene set enrichment analysis and Enrichr
Gene set enrichment analysis (GSEA) was performed with GSEAPreranked (version 6.0; GenePattern) 70 on genes ranked by fold change, using default parameters over 10,000. Enrichments of functional annotations were performed with Enrichr 71 .

Tissue-specificity z score
Tissue-specificity z scores were calculated for each gene by taking the average normalized gene expression in a given tissue minus the mean across all Hnf1a-expressing tissues, divided by the standard deviation across all Hnf1a-expressing tissues 72.
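A minimal sketch of this calculation for a single gene is given below. The function name is hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def tissue_z_scores(expression_by_tissue):
    """Return a z score per tissue for one gene.

    expression_by_tissue maps tissue name to the average normalized
    expression of the gene in that tissue, restricted to the tissues
    under comparison (here, Hnf1a-expressing tissues).
    """
    values = list(expression_by_tissue.values())
    mu = mean(values)
    sd = stdev(values)  # sample standard deviation (assumption)
    if sd == 0:
        raise ValueError("expression is identical in all tissues")
    return {tissue: (value - mu) / sd
            for tissue, value in expression_by_tissue.items()}
```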

Single-cell RNA-seq
Cultured mouse islets were dissociated with Accutase (Merck) for 15 min at 37 °C. Islet cell suspensions were centrifuged at 600g for 3 min and resuspended in culture medium with DAPI before FACS sorting to remove dead cells and doublets. After sorting, cells were centrifuged at 600g for 3 min and resuspended in PBS/0.04% BSA. Single-cell libraries were generated with a 10X Genomics Chromium Single Cell 3′ Reagent Kit v3 following the manufacturer's instructions. Libraries were sequenced on a HiSeq 4000 using 2 × 75 bp reads.

Single-cell RNA-seq analysis
Read alignments and UMI counts were performed with CellRanger (version 3.0.2) using the mm10 reference genome. Subsequent analyses were carried out with Seurat (version 3.0.1) 73 or scVI-tools (version 0.11.0) 74 .
For Seurat, cells with <500 genes or >5% mitochondrial genes were filtered out (Supplementary Table 2). UMI counts were normalized using SCTransform 75 . To define shared populations between controls and knockouts, we performed an integrated analysis on the three control and three knockout datasets 76 . Briefly, the 3,000 most variable genes were used to find anchors (SelectIntegrationFeatures function using 50 dimensions). The first 50 principal components were used for t-distributed stochastic neighbour embedding projection (RunTSNE function) and clusters were defined by graph-based unsupervised clustering (FindClusters function) with a resolution of 0.5.
For scVI analysis, cells with <1,000 or >6,000 genes or >5% mitochondrial genes were filtered out. UMI counts were normalized for library size and log transformed. Integration of control and knockout samples was performed using the top 2,000 variable genes. The scVI model was trained using ten dimensions of latent space and two hidden layers for the encoder and decoder neural network. The identification of HNF1A-deficient β cell clusters was robust to using Seurat or scVI (Extended Data Fig. 7).
Differential expression was performed with Seurat FindMarkers (min.pct = 0.1) for all combinations of controls versus knockouts. Wilcoxon rank-sum P values from the different combinations were combined using Fisher's method. Only genes with a consistent positive or negative fold change across all control or knockout combinations and with a combined P ≤ 0.05 were considered differentially expressed. All genes differentially expressed in endothelial cells were discarded. For differential expression, all β cell clusters with >250 cells were grouped in a single β cluster.
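The combination step can be sketched as follows. This is an illustrative implementation of Fisher's method rather than the authors' code; the function names are hypothetical, and the closed-form chi-squared survival function used here is valid only for even degrees of freedom, which is always the case for Fisher's statistic (2k for k P values).

```python
import math


def chi2_sf_even_df(x, df):
    """Survival function of the chi-squared distribution for even df."""
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def fisher_combine(p_values):
    """Combine independent P values; each must be greater than zero."""
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    return chi2_sf_even_df(statistic, 2 * len(p_values))


def is_differential(log_fold_changes, p_values, alpha=0.05):
    """Require a consistent sign in every comparison and combined P <= alpha."""
    consistent = (all(lfc > 0 for lfc in log_fold_changes)
                  or all(lfc < 0 for lfc in log_fold_changes))
    return consistent and fisher_combine(p_values) <= alpha
```

Fisher's method assumes independent tests. Pairwise comparisons that reuse the same control or knockout samples are not strictly independent, so the combined value is best read as a ranking statistic.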
Seurat objects were exported as loom using as.loom of the loomR (version 0.

UMI-4C
UMI-4C was performed as described 43 with modifications. Liver from three samples per genotype was crosslinked with 2% formaldehyde for 10 min, as described for ChIP. EndoC-βH3 cells were fixed with 1% formaldehyde for 10 min. Frozen pellets of ~10⁷ cells were thawed on ice and resuspended in 5 ml cold lysis buffer (50 mM Tris-HCl pH 7.5, 150 mM NaCl, 5 mM EDTA, 1% Triton X-100, 0.5% IGEPAL and 1× protease inhibitor cocktail). After isolation, nuclei were resuspended in 650 µl nuclease-free water, 60 µl DpnII buffer and 15 µl 10% SDS and incubated at 37 °C and 900 r.p.m. for 1 h, with an additional hour after the addition of 75 µl 20% Triton X-100. The chromatin was digested at 37 °C and 900 r.p.m. for 24 h using 600 U DpnII (R0543L; NEB) and the enzyme was inactivated by incubating at 65 °C for 20 min. Ligation was performed in a final volume of 7 ml with 60 U T4 DNA ligase (Promega) and incubated at 16 °C overnight. The efficiency of the digestion and ligation was assessed by gel electrophoresis. Chromatin was reverse crosslinked with 30 µl proteinase K (10 mg ml⁻¹) overnight at 65 °C, followed by 45 min of incubation with 30 µl RNase A (10 mg ml⁻¹) at 37 °C. The DNA was purified by phenol-chloroform extraction followed by ethanol precipitation and resuspended in 10 mM Tris-HCl pH 8. Then, 10 µg DNA was sonicated using an S220 Focused-ultrasonicator (Covaris) to obtain 400- to 600-bp fragments. The DNA was end-repaired with 10 µl NEBNext End Repair Mix (E6050L; NEB) in a final volume of 200 µl, incubated for 30 min at 20 °C, purified with 2.2× AMPure XP beads (Beckman Coulter) and eluted in 10 mM Tris-HCl pH 8. A-tailing was performed with 200 U Klenow Fragment (M0212M; NEB) in 100 µl 1× NEBuffer 2 with 1 nM dATP. 5′ ends were dephosphorylated at 50 °C for 60 min with 20 U calf intestinal alkaline phosphatase (M0290S; NEB). The DNA was then cleaned with 2× AMPure XP beads.
Adaptors were ligated with 0.4 µM Illumina-compatible forked indexed adaptors (Supplementary Table 5) and 10 µM quick ligase (M2200; NEB) in 160 µl 1× quick ligation buffer (M2200; NEB) for 15 min at 25 °C. DNA was denatured at 95 °C for 2 min and cleaned with 1× AMPure XP beads. To generate UMI-4C libraries, two nested PCRs were performed using GoTaq polymerase (Promega) with a final primer concentration of 0.4 mM. The first PCR used the upstream bait primer (Supplementary Table 5) and Illumina universal primer 2 and amplification was performed for 20 cycles. The DNA was cleaned with 1× AMPure XP and used for the second PCR with the downstream bait primer (Supplementary Table 5) and Illumina universal primer 2 for 16 cycles. After the second PCR, the DNA was cleaned and size selected with 0.7× AMPure XP beads. The size distribution of the libraries was controlled by Bioanalyzer and libraries were quantified with a KAPA Quantification Kit (07960166001; Roche). Libraries were sequenced on a HiSeq 2500 using 2 × 125 bp reads or a NovaSeq S4 using 2 × 150 bp reads.

UMI-4C-seq analysis
Umi4cPackage (version 0.0.0.9000) was used as described 43. FASTQ files from sequenced libraries were initially pooled by genotype. Paired-end reads were demultiplexed using fastq-multx from ea-utils (version 1.3.1). Reads were aligned and the number of UMIs extracted using p4cCreate4CseqTrack. A window of 1 kb around the viewpoint was removed from the analysis. 4C contact profiles from knockouts and controls were normalized for UMI coverage using the plotCompProf function and an adaptive smoothing method that controls window size so that no fewer than five molecules are included in each window. Assessment of differential contacts between knockout and wild-type 4C profiles in genomic regions of interest within a 0.5-megabase window surrounding the viewpoint was carried out using p4cIntervalsMean through a chi-squared test of normalized molecule counts.
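The logic of a chi-squared comparison of contact counts can be illustrated with a generic Pearson test on a 2 × 2 table. This sketch does not reproduce the internals of p4cIntervalsMean or the package's normalization. The function names are hypothetical, and the table layout (molecules inside versus outside the region of interest, for knockout and wild type) is an assumption.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared test (1 d.f., no continuity correction).

    Table layout: [[a, b], [c, d]]. Returns (statistic, P value).
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("every row and column total must be non-zero")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def differential_contact(region_ko, total_ko, region_wt, total_wt):
    """Compare the share of molecules in a region between two 4C profiles."""
    return chi2_2x2(region_ko, total_ko - region_ko,
                    region_wt, total_wt - region_wt)
```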

Statistics and reproducibility
No statistical method was used to predetermine sample size. No data were excluded from the analyses. The investigators were not blinded to allocation during the experiments and outcome assessment.
The results are shown as mean or median values, with error bars representing the s.e.m. or s.d., as stated in the figure captions. The numbers of biological replicates for each experiment are stated in the figure captions. P values were calculated by χ² test, two-sided Fisher's exact test, unpaired two-tailed Student's t-test or Wald or Wilcoxon rank-sum tests, as reported in the figure captions. The Brown-Forsythe test was used to test the equality of variances. For UMI-4C comparisons, P values were calculated by χ² test using umi4c. Statistical analysis of other epigenomic data is described in the appropriate Methods sections.

Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Code availability
All of the custom code used in this study is available upon reasonable request.
Extended Data Fig. 2 | HASTER transcripts localize to the nucleus. a, Relative subcellular expression of HASTER lncRNA in EndoC-βH3 cells, compared to control mRNAs (TBP and HNF1A) and the nuclear lncRNA MALAT1. Mean ± s.d., n = 3 biological replicates. b, Single molecule fluorescence in situ hybridization signals for HASTER. HASTER transcripts are almost exclusively observed in the nucleus (deconvoluted images). The inset shows a rare non-nuclear signal. (n = 5 independent experiments). c, Colocalization of single molecule fluorescent in situ hybridization signals for HASTER (exonic probes) and HNF1A (intronic probes for HNF1A) in human EndoC-βH3 cells. n = 496 cells. The degree to which HASTER and intronic HNF1A RNA molecules are located at the HNF1A locus can only be assessed when two HNF1A or two HASTER molecules are seen in the same nucleus. In all such instances HNF1A and HASTER were found to colocalize. Scale bar, 20 µm.