The NALCN channel regulates metastasis and nonmalignant cell dissemination

We identify the sodium leak channel non-selective protein (NALCN) as a key regulator of cancer metastasis and nonmalignant cell dissemination. Among 10,022 human cancers, NALCN loss-of-function mutations were enriched in gastric and colorectal cancers. Deletion of Nalcn from gastric, intestinal or pancreatic adenocarcinomas in mice did not alter tumor incidence, but markedly increased the number of circulating tumor cells (CTCs) and metastases. Treatment of these mice with gadolinium—a NALCN channel blocker—similarly increased CTCs and metastases. Deletion of Nalcn from mice that lacked oncogenic mutations and never developed cancer caused shedding of epithelial cells into the blood at levels equivalent to those seen in tumor-bearing animals. These cells trafficked to distant organs to form normal structures including lung epithelium, and kidney glomeruli and tubules. Thus, NALCN regulates cell shedding from solid tissues independent of cancer, divorcing this process from tumorigenesis and unmasking a potential new target for antimetastatic therapies.

Most patients with cancer die as a result of metastasis 1 , the process by which cancer cells spread from the primary tumor to other organs in the body 2 . Blocking metastasis could markedly improve the survival of patients with cancer, but how this process is triggered within the complex cascade of tumorigenesis remains unclear 3 .
Because metastasis is thought to be a wholly abnormal process, restricted to malignant tissues, attention has focused on identifying genetic mutations as drivers of cancer metastasis. Although this research has unmasked genes that promote metastasis in mouse models and humans, including a variety of ion channels that induce a metastasis-like phenotype by altering the transmembrane voltage to induce changes in gene transcription [4][5][6] , so far no recurrent metastasis-specific mutations have been identified 2,3,7 .
Other cell functions implicated in the metastatic cascade include 'stem cell-like' multipotency and plasticity. Stem cell capacity has been ascribed to metastatic cancer cells because of their ability to reconstitute heterogeneous malignant cell populations as metastatic tumors 8,9 . Epithelial-mesenchymal transition (EMT) 2 , a type of cellular plasticity displayed during normal gastrulation and tissue healing, is also an established feature of the metastatic cascade 2,10 . What remains unclear is how cancers 'hijack' these normal cell functions to enable metastasis.
Here, we identify a single ion channel, NALCN, as a key regulator of epithelial cell trafficking to distant tissues. NALCN is responsible for the background sodium leak conductance that maintains the resting membrane potential. It regulates key functions in excitable tissues, for example, respiration and circadian rhythms [11][12][13] , and gain-of-function mutations in the gene are associated with neurological disorders 14 . However, little is known about the role of NALCN in nonexcitable tissues. We show that NALCN regulates the release of malignant and normal epithelial cells into the blood, and their trafficking to distant sites where they form metastatic cancers, or apparently normal tissues, respectively. We thereby demonstrate that the metastatic cascade can be triggered and operate independent of tumorigenesis. These observations have profound implications for understanding epithelial cell trafficking in health and disease and identify a novel target for antimetastatic therapies.

... and solute carriers were selectively downregulated in PROM1 + P1 KP -GAC cells (Fig. 1a and Supplementary Table 1). Among these, NALCN, a leak sodium channel that contributes to the resting cell membrane potential and cell excitability 13,16,17 , was restricted in its expression to PROM1 + antral gland basal stem cells and downregulated in PROM1 + P1 KP -GAC cells (Fig. 1b). Among 10,022 human cancers within The Cancer Genome Atlas, nonsynonymous mutations in NALCN were enriched in gastric, colorectal, lung, prostate and head and neck cancers (Fig. 1c) 18,19 . These cancers also contained deletions, and nonsense and frameshift mutations, at a frequency very similar to that observed in TP53 in human cancer 20 , suggesting NALCN might be a tumor suppressor (Supplementary Table 2).
To determine how nonsynonymous mutations might affect NALCN function in cancer, we used HOLE analysis 21 to predict their impact on the ion channel pore radius of NALCN embedded and relaxed within a 575-lipid 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine bilayer in silico 12,22,23 . This model correctly predicted opening of the NALCN channel by 22 mutations known to confer gain-of-function 12 , and closure of the channel by two mutations that cause loss-of-function 11 (Supplementary Table 3). Nonsynonymous, cancer-associated mutations were clustered within the pore turret and voltage-sensing domains that regulate NALCN channel opening 11,12 : 75% (n = 147/196) of these mutations were predicted to close the channel (Fig. 1d,e and Supplementary Table 4). Mutations predicted to cause the greatest pore closure were enriched in the most advanced cancers (Fig. 1f). Furthermore, human GACs in which NALCN was mutated upregulated genes associated with EMT 24 , metastasis and cell migration (Supplementary Tables 5 and 6).
Whole-cell voltage-clamp analysis of P1 KP -GAC cells showed a linear GdCl3-sensitive current in response to voltage steps in the ±80 mV range, as previously reported (Fig. 2a,b) 13 . Decreasing Nalcn expression in P1 KP -GAC cells eliminated the NALCN current, increased cell proliferation and conferred an EMT morphology and transcriptome on P1 KP -GAC orthotopic allografts (Fig. 2 and Supplementary Tables 7 and 8). Conversely, increased Nalcn expression increased the GdCl3-sensitive current in P1 KP -GAC cells, decreased cell proliferation and conferred a hyperepithelialized morphology on allografts.
To test directly whether CZCs possess metastatic potential, we injected separate aliquots of 25,000 CZCs isolated from mice with P1 KP -PAC, P1 KP -GAC or V1 KP -IAC into the tail veins of immunocompromised mice. Within 75 d, all mice developed numerous ZSG + metastases in the lungs, liver, kidneys and/or peritoneum (Fig. 6d,e and Supplementary Table 19). Similar studies with increasing cell dilutions showed that as few as ten CZCs were required to generate metastasis ( Fig. 6f and Supplementary  Table 19). Thus, CZCs are highly enriched for CTCs that recapitulate the transcriptome of human CTCs and are shed into the peripheral blood through a process regulated by Nalcn.

NALCN-blockade causes systemic fibrosis.
Although P1 R Nalcn +/Flx (n = 118) and P1 R Nalcn Flx/Flx (n = 112) mice did not develop cancer, whole-body autopsy of these mice revealed severe kidney and skin fibrosis relative to P1 R Nalcn +/+ (n = 65) mice (Supplementary Table  25 and Extended Data Fig. 7). This pathology arose after ≥400 d and replicated that of gadolinium-induced systemic fibrosis (GISF), a debilitating condition manifested by severe organ fibrosis following administration of gadolinium-based contrast agents 37 . How gadolinium-based contrast agents cause GISF is unknown, but suggested mechanisms include tissue retention of gadolinium-based contrast agents and the mobilization and recruitment of bone marrow-derived fibrocytes 38 . Our data suggest strongly that blockade of the NALCN channel by gadolinium mobilizes epithelial cells in a variety of epithelial tissues that traffic to the kidney and other organs, eventually eliciting a fibrotic response, causing GISF.

Discussion
Developing antimetastatic therapies has proven difficult because targets in primary tumors that drive metastasis have proved hard to find 2 . By divorcing the process of CTC shedding from 'upstream' tumorigenesis, our data unmask manipulation of NALCN function as a promising new approach to block metastasis. In particular, drugs capable of reopening the NALCN channel might be effective antimetastatic therapies. Precedent for this approach is provided by drugs that open the chloride channel mutated in cystic fibrosis 39 . If successful, such agents may also be useful for treating GISF. It is important to note that our observations are based on deleting Nalcn from mouse tissues, whereas NALCN in human cancers is affected predominantly by nonsynonymous mutations. Although our in silico modeling suggests strongly that these cancer-associated mutations close the NALCN channel, it will be important to demonstrate this functionally by modeling nonsynonymous Nalcn mutations in vivo. These studies should also include testing in patient-derived xenografts of gastric, colon and other cancers to confirm that NALCN regulates trafficking of human as well as mouse cells.
Loss-of-function mutations in NALCN may also help explain various enigmatic features of human cancer. Metastases can emerge many years after removal of a localized cancer 40 , or in the absence of a primary tumor 41 . Loss of NALCN function in our mice caused an abundant and persistent shedding of cells that embed in distant organs, even in the absence of a primary tumor. Because human epithelial tissues contain fields of phenotypically normal cells that harbor oncogenic mutations 42,43 , loss of NALCN function in these cells could provide a source of CTCs that form metastases in the absence of a primary tumor, or long after a primary tumor has been removed. It is likely that such cells would need to acquire additional mutations to form tumors at the metastatic site, compatible with the relative rarity of these phenomena. Our data may also explain why CTCs have been found in the bone marrow of patients who lack metastases. Although these cells could represent 'dormant' CTCs, as previously suggested 3 , equivalent to ntCZCs in our mice, they may be shed from nontransformed epithelia that have lost NALCN function, but not gained the ability to form metastatic tumors. Our serial analysis of CZCs in mice suggests that cell shedding following NALCN loss-of-function is a late, rather than early, event, although NALCN mutations could promote both linear and parallel progression models of cancer 44 .
Our data also provide clues as to how NALCN might regulate epithelial cell shedding. We observed upregulation of genes associated with EMT and invasion within 72 h of deleting Nalcn from normal gastric stem cells, suggesting that this channel might regulate gene transcription in a similar manner to that reported for calcium ion channels 6,45 . Our electrophysiology studies demonstrate that GAC cells possess a NALCN-mediated current. However, more detailed electrophysiology studies are required to determine the precise mechanism by which NALCN regulates gene expression and cell shedding and whether this involves maintenance of the resting membrane potential.
The development of renal and skin fibrosis reminiscent of GISF in aged Nalcn-deleted mice pinpoints NALCN channel blockade as the likely cause of this debilitating condition. P1 KP mice succumbed to cancer well before the onset of organ fibrosis in P1 R mice, and Nalcn deletion in P1 R mice did not induce fibrosis of the stomach, intestine, lung, pancreas or liver, the principal sites of primary and metastatic tumors in P1 KP mice. Thus, fibrosis is unlikely to have contributed to metastasis in Nalcn-deleted mice. However, because limited exposure to gadolinium can induce GISF in humans, it is of concern that gadolinium-contrast imaging of cancer patients could accelerate metastasis.

Online content
Any methods, additional references, Nature Research reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41588-022-01182-0.

Lentiviral production and transduction. Nalcn-shRNA lentivirus was produced as described previously 47 . Three shRNAs per target (two in the open reading frame and one in the 3′-untranslated region) were cloned into pFUGWH1-RFPTurbo and cotransfected with plasmids pVSV-G and pCMVd8.9 into 293FT (Thermo Fisher Scientific, catalog number R70007) cells. NALCN cDNA (NM_052867) was from OriGene (catalog number RC217074). In total, 2 × 10^4 gastric cells were mixed with lentiviruses (20 particles per cell) and plated in Matrigel. Transduced red fluorescence + (shRNA) or green fluorescence + (cDNA) cells were sorted using a Becton Dickinson Aria II Cell Sorter (Supplementary Tables 26 and 28).

Culture of stomach stem cells. Gastric glands were isolated
Whole-cell electrophysiology. The NALCN channel current was measured as reported 48 . Whole-cell recordings were obtained from stomach tumor cells on 12-mm cover slips coated with Matrigel at a density of 25,000 cells per ml and superfused (2-3 ml min−1) with warm (30-32 °C) recording solution containing 120 mM NaCl, 5 mM CsCl, 2.5 mM KCl, 2 mM CaCl2, 2 mM MgCl2, 1.25 mM NaH2PO4, 26 mM NaHCO3, 20 mM glucose and 1 μM tetrodotoxin (300-310 mOsm), with 95% O2/5% CO2. Patch pipettes (open pipette resistance, 3-4 MΩ) were filled with an internal solution containing 125 mM CsMeSO3, 2 mM CsCl, 10 mM HEPES, 0.1 mM EGTA, 4 mM MgATP, 0.3 mM NaGTP, 10 mM Na2 creatine phosphate, 5 mM QX-314 and 5 mM tetraethylammonium Cl (pH 7.4, adjusted with CsOH, 290-295 mOsm). Tetrodotoxin and QX-314 were included to block voltage-sensitive sodium channels in recorded cells, whereas cesium and tetraethylammonium Cl blocked voltage-sensitive potassium channels. Voltage-clamp recordings were made using a Multiclamp 700B (Molecular Devices), digitized (10 kHz; DigiData 1322A, Molecular Devices) and recorded using pCLAMP v.10.0 software (Molecular Devices). In all experiments, membrane potentials were corrected for a liquid junction potential of −10 mV. After forming a gigaseal onto a cell and rupturing the cell membrane, tumor cell membrane potential was held at −70 mV. Cell membrane capacitance, membrane resistance and pipette access resistance were then measured with the pCLAMP cell membrane test function. Recordings were excluded if pipette access resistance was higher than 20 MΩ or if access resistance changed by more than 20% during the experiment. After cell membrane resistance had stabilized, membrane potential was stepped to 0 mV for 100 ms, followed by a series of 250 ms voltage steps from −80 mV to +80 mV in 20-mV increments, and the current response to these voltage steps was recorded.
GdCl3 (100 μM) was then applied to the bath solution to eliminate the voltage-independent 'leak' current associated with Nalcn. Calculation of the Nalcn current was performed offline by subtracting the current response in GdCl3 from the previous GdCl3-free current recording. Tumor cell Nalcn current density was determined by dividing the Nalcn current by cell membrane capacitance. To verify successful expression of the RFP + (Nalcn shRNA ) or GFP + (NALCN cDNA ) construct, cells were imaged with two-photon laser scanning microscopy (Prairie Technologies) using a Ti:sapphire Chameleon Ultra femtosecond-pulsed laser (Coherent), and ×60 (0.9 NA) water-immersion infrared objective (Olympus). Red fluorescent protein was visualized using an excitation wavelength of 1030 nm, whereas green fluorescent protein (GFP) was visualized using an excitation wavelength of 820 nm (Supplementary Tables 26 and 28).

Generation of Nalcn Flx allele. Mice were derived from targeted embryonic stem cells (ESCs) (UCDAVIS KOMP Repository Knockout Mouse Project clone EPD0383_5_C01). ESCs were screened using KOMP PCR strategies for Nalcntm1a(KOMP)Wstsi. ESCs were implanted into recipient C57/Bl6 mice in accordance with protocols approved by IACUC-SJ. Wild-type Nalcn and Nalcn Flx alleles were detected using standard PCR and primers (UCDAVIS KOMP Repository Knockout Mouse Project clone EPD0383_5_C01). Nalcn RNA expression was quantified by quantitative PCR (qPCR) with reverse transcription and a Bio-Rad CFX96 Touch Real-Time PCR Detection System with primers (see Supplementary Tables 26 and 29-31 for details on animals and oligonucleotide sequences).
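The offline Nalcn current calculation described in the whole-cell electrophysiology methods (subtracting the current recorded in GdCl3 from the GdCl3-free recording, then dividing by membrane capacitance) can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: the function names, the units (pA, pF) and all numerical values are assumptions introduced here.

```python
def voltage_steps(start_mv=-80, stop_mv=80, increment_mv=20):
    """Command voltages for the step protocol (-80 mV to +80 mV, 20-mV steps)."""
    return list(range(start_mv, stop_mv + 1, increment_mv))


def nalcn_current_density(control_pa, gdcl3_pa, capacitance_pf):
    """Return the GdCl3-sensitive current density (pA/pF) at each voltage step.

    control_pa     -- currents recorded before GdCl3, one per voltage step
    gdcl3_pa       -- currents recorded in GdCl3, in the same step order
    capacitance_pf -- cell membrane capacitance from the membrane test
    """
    if len(control_pa) != len(gdcl3_pa):
        raise ValueError("recordings must cover the same voltage steps")
    if capacitance_pf <= 0:
        raise ValueError("capacitance must be positive")
    # Nalcn current = GdCl3-free current minus the current remaining in GdCl3.
    nalcn_pa = [c - g for c, g in zip(control_pa, gdcl3_pa)]
    # Normalize for cell size by dividing by membrane capacitance.
    return [i / capacitance_pf for i in nalcn_pa]


steps = voltage_steps()
# Hypothetical currents for illustration only (not data from the study).
control = [-80.0, -60.0, -40.0, -20.0, 0.0, 20.0, 40.0, 60.0, 80.0]
in_gdcl3 = [-40.0, -30.0, -20.0, -10.0, 0.0, 10.0, 20.0, 30.0, 40.0]
density = nalcn_current_density(control, in_gdcl3, capacitance_pf=10.0)
for v, d in zip(steps, density):
    print(f"{v:+4d} mV  {d:+.1f} pA/pF")
```

Dividing by capacitance makes the measurement comparable between cells of different membrane area, which is why current density rather than raw current is reported.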

Tumorigenesis and surveillance.
All animal studies within the United Kingdom (UK) were performed under the Animals (Scientific Procedures) Act 1986 in accordance with UK Home Office licenses (Project License 70-8823, P47AE7E47, PP7834816) and approved by the Cancer Research UK (CRUK) Cambridge Institute Animal Welfare and Ethical Review Board. Mice were housed in individually ventilated cages with wood chip bedding and nestlets with environmental enrichment (cardboard fun tunnels and chew blocks) under a 12 h light/dark cycle at 21 ± 2 °C and 55% ± 10% humidity. Diet was irradiated LabDiet 5R58 with ad libitum water. Animals carrying the modified Nalcn allele were bred to RosaFLPe-expressing mice to remove the LacZ and Neo cassettes. Animals with complete recombination were bred with: Prom1C-L 29 ; Nestin-cre 49 ; Rosa-CreERT 50 ; villin-CreER 25 ; Pdx1-cre 28 ; RosaZSG 51 ; and KrasG12D/+ 52 , Trp53flx 53 . Cre-recombination was activated by dosing with 1 mg of tamoxifen per 40 g (body weight) at P3 or 8 mg tamoxifen per 40 g (body weight) at P60. Mice were maintained for up to 2 years and full-body autopsy was performed as described 4 at humane end points or the indicated time point, whichever was first. All tissues were inspected for macroscopic tumors with direct green fluorescence detection. Tissues were formalin fixed and paraffin embedded, with portions also snap frozen or used for tissue dissociation for sequencing (Supplementary Tables 26 and 29).

... Tables 28 and 33). Single-channel images are shown in Supplementary Fig. 1.

Nalcn RNA expression was detected in formalin-fixed, paraffin-embedded sections using the Advanced Cell Diagnostics (ACD) RNAscope 2.5 LS Reagent Kit-RED (ACD, catalog number 322150) and RNAscope 2.5 LS Mm Nalcn (ACD, catalog number 415168). Probe hybridization and signal amplification were performed according to the manufacturer's instructions.
Fast Red detection of mouse Nalcn was performed on the Bond Rx using the Bond Polymer Refine Red Detection Kit (Leica Biosystems, catalog number DS9390) according to the manufacturer's protocol. Whole-tissue sections were imaged on the Aperio AT2 (Leica Biosystems) and analyzed as for immunohistochemistry using HALO (Indica Labs) imaging analysis software. β-Galactosidase staining was performed exactly as described 4 (Supplementary Tables 26, 28 and 30).
Histological review and primary and metastatic tumor classification were performed by expert pathologists (P. Vogel and B. Mahler-Araujo) blinded to mouse genotype and clinical history. The numbers of ZSG + cell clusters or metastases were counted in each organ in each mouse. Tissue fibrosis was assessed by expert pathologist R. Nazarian using sections stained with H&E, Masson's trichrome and Picrosirius Red.
Serial two-photon tomography imaging was performed on a TissueCyte 1000 instrument (TissueVision) in which a series of mosaic two-dimensional images are taken of the tissue, followed by physical sectioning with a vibratome and a subsequent round of imaging. This continues in an automated fashion, generating 15 μm serial two-photon tomography sections that can be mounted on standard microscopy slides, imaged by Axioscan fluorescence scanning (Zeiss) for section identification and realignment. Fiducial agarose marker beads labeled with GFP are distributed throughout the embedding medium to help in the realignment of the samples for consequent use (Supplementary Tables 26, 28 and 30).

Harvesting and injection of circulating ZSG cells.
Peripheral blood (500 µl to 1 ml) was harvested from mice at autopsy into 10 µl of 0.5 M EDTA, diluted in PBS and assessed by MACSQuant Analyzer (Miltenyi Biotech Inc.) for ZSG expression (525/50 nm (FITC) versus 614/50 nm (propidium iodide)). Cells for SCS and tail-vein injection were sorted using a BD FACSAria II Cell Sorter (BD Biosciences), with excitation at 525/50 nm (FITC) versus 614/50 nm (propidium iodide). Nontamoxifen-induced mouse peripheral blood served as a negative control to set gate parameters (Supplementary Figs. 2 and 3). Some 25,000 ZSG + cells were sorted and injected into recipient NOD SCID gamma mice (Charles River) and aged. For serial dilution assessment of tCZC metastasis initiation, tCZCs were isolated from donor tumor-bearing animals via FACS based on ZSG expression and placed into culture medium. Culture medium was as follows: Advanced DMEM/F12 (catalog number 31330038, Thermo Fisher Scientific), 2 mM L-glutamine (catalog number 25030024, Thermo Fisher Scientific), B27 (catalog number 12587010, Thermo Fisher Scientific) and N2 (catalog number A1370701, Thermo Fisher Scientific), containing growth factors (50 ng ml−1 epidermal growth factor (PeproTech) and 100 ng ml−1 basic fibroblast growth factor (catalog number 100-18c, PeproTech)) and 1% FBS (catalog number 10500064, Thermo Fisher Scientific). Cells were grown at 37 °C in 5% CO2. Recipient NOD SCID gamma mice (Charles River) were injected with either 10, 100, 1,000 or 10,000 tCZCs via tail-vein injection and aged. Full autopsy and tissue harvesting were performed as described above (Supplementary Tables 26, 28 and 29).
Bulk RNA sequencing. Total RNA was extracted from tissues using Maxwell RSC miRNA Tissue Kit (catalog number AS1460, Promega). RNA quality was assessed using TapeStation System (catalog number 5067-5579, Agilent). RNA libraries and downstream sequencing were carried out as previously described 54 . The Illumina TruSeq stranded messenger RNA kit (catalog number 20020595, Illumina) was used to prepare RNA libraries and RNA quality confirmed using TapeStation (Agilent) and quantified using a KAPA qPCR library quantification kit for Illumina platforms (catalog number KK4873, KAPA Biosystems). Samples were normalized using the Agilent Bravo, pooled and sequenced on Illumina NovaSeq SP flowcell to generate single-end 50 bp reads at 20 million reads per sample.
Single-end 50 bp RNA reads were aligned to GRCm38 with HISAT2 (with default parameters). Each sample was sequenced across several lanes; per-lane BAM files were merged into per-sample BAM files. Quality control metrics were collected for each file, including duplication statistics and number of reads assigned to genes. Reads were counted on annotated features with Subread featureCounts, providing 'total', 'aligned to the genome' and 'assigned to a gene' (that is, included in the analysis) counts. Percentages of aligned bases were computed for several categories: coding, untranslated region, intronic and intergenic. Other quality control metrics were the percentage of reads on the correct strand, median coefficient of variation of coverage, median 5′ bias, median 3′ bias and the ratio of 5′ to 3′ coverage. Quality control also included an expression heatmap drawn using log2-transformed counts. The log2-transformed counts were generated from normalized counts using the log2 function in R and the counts function from DESeq2. Genes were regarded as differentially expressed between sample cohorts if they displayed a log(fold difference) in expression of ≥1 or ≤−1 with an adjusted P ≤ 0.05 (Supplementary Tables 26, 28 and 30).
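The differential expression criterion stated above (a log(fold difference) of ≥1 or ≤−1 with an adjusted P ≤ 0.05) can be expressed as a simple filter. The sketch below is illustrative only: the record layout, the field names `lfd` and `padj`, and the gene entries are hypothetical, and it assumes results have already been exported from the upstream R analysis.

```python
def is_differentially_expressed(log_fold_difference, adjusted_p,
                                min_abs_lfd=1.0, max_adjusted_p=0.05):
    """Apply both thresholds: |log fold difference| >= 1 and adjusted P <= 0.05."""
    if adjusted_p is None:
        # Genes without an adjusted P value (for example, filtered upstream)
        # cannot satisfy the significance threshold.
        return False
    return abs(log_fold_difference) >= min_abs_lfd and adjusted_p <= max_adjusted_p


def filter_de_genes(results):
    """Return the names of genes passing both thresholds, in input order."""
    return [r["gene"] for r in results
            if is_differentially_expressed(r["lfd"], r["padj"])]


# Hypothetical records for illustration only (not results from the study).
results = [
    {"gene": "GeneA", "lfd": 2.3, "padj": 0.001},   # up, significant
    {"gene": "GeneB", "lfd": -1.0, "padj": 0.05},   # down, exactly at both cutoffs
    {"gene": "GeneC", "lfd": 0.4, "padj": 0.0001},  # significant but small change
    {"gene": "GeneD", "lfd": 3.1, "padj": 0.2},     # large change, not significant
    {"gene": "GeneE", "lfd": -2.0, "padj": None},   # no adjusted P value
]
print(filter_de_genes(results))
```

Both thresholds are inclusive, matching the "≥", "≤" wording of the criterion, so a gene sitting exactly on a cutoff is retained.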
Single-cell RNA sequencing. Animals were perfused with PBS followed by 100 U ml−1 of collagenase type IV in HBSS with Ca2+ and Mg2+ (Life Technologies) media containing 3 mM CaCl2. Whole organs were dissected, dissociated and placed into 2 ml of the appropriate dissociation buffer: lung and stomach were dissociated with 200 U ml−1 of collagenase type IV (Sigma) and 100 μg μl−1 of DNAse I (Roche) in HBSS with Ca2+ and Mg2+ (Life Technologies) media containing 3 mM CaCl2; liver was dissociated with collagenase type I (100 U ml−1), dispase (2.4 U ml−1) and DNAse I (100 μg ml−1) in HBSS with Ca2+ and Mg2+ (Life Technologies) media containing 3 mM CaCl2; kidney was dissociated with papain (20 U ml−1) and DNAse I (100 mg ml−1) in DMEM high glucose, 2 mM L-glutamine (Life Technologies) with 1× Pen-Strep and 10% FBS; uterus and epididymis were dissociated with collagenase type I (100 U ml−1) and DNAse I (100 mg ml−1) in HBSS with Ca2+ and Mg2+ (Life Technologies) media containing 3 mM CaCl2. Cell suspensions were filtered, washed with HBSS without calcium and magnesium and centrifuged for 5 min at 300g at 4 °C.
Single-cell suspensions of solid tissues were multiplexed and labeled with Cell Hashing conjugates: antimouse hashtags from 0301 to 0315 (BioLegend) before sequencing. All nucleated cells and ZSG + cells isolated from peripheral blood were not multiplexed but placed into a 10x Genomics pipeline. SCS libraries were prepared using Chromium Single Cell 3′ Library & Gel Bead Kit v.3, Chromium Chip B Kit and Chromium Single Cell 3′ Reagent Kits v.3 User Guide (manual CG000183 Rev A; 10x Genomics). Cell suspensions were loaded on the Chromium instrument with the expectation of collecting gel-bead emulsions containing single cells. RNA from the barcoded cells for each sample was subsequently reverse-transcribed in a C1000 Touch thermal cycler (Bio-Rad) and all subsequent steps to generate single-cell libraries were performed according to the manufacturer's protocol with no modifications (for most of the samples 12 cycles was used for cDNA amplification, 16 for samples with very low cell concentration). cDNA quality and quantity were measured with Agilent TapeStation 4200 (High Sensitivity D5000 ScreenTape) after which 25% of material was used for preparation of the gene expression library. Library quality was confirmed with Agilent TapeStation 4200 (High Sensitivity D1000 ScreenTape to evaluate library sizes) and Qubit 4.0 Fluorometer (Qubit dsDNA HS Assay Kit (Thermo Fisher Scientific) to evaluate double-stranded DNA quantity). Each sample was normalized and pooled in equal molar concentrations. To confirm concentration pools underwent qPCR using KAPA Library Quantification Kit on QuantStudio 6 Flex before sequencing. Pools were sequenced on an Illumina NovaSeq6000 sequencer with the following parameters: 28 bp, read 1; 8 bp, i7 index; and 91 bp, read 2.
Raw RNA reads were processed with cellranger using mm10 from 10x as the reference genome to create filtered gene expression matrixes. Cell barcodes detected by cellranger were used as input to CITESeq for hashtagged sequence data (solid organs) generating a counts matrix with cell barcodes and hashtag oligo sequences per cell. The HTODemux function from Seurat was then used to identify clusters and classify cells according to their barcodes, including negative and doublet cells. Quality control metrics were generated using Scater followed by single-cell object conversion to Seurat objects, merging of objects and then analyses run using the standard Seurat pipeline (Supplementary Tables 26, 28 and 30). SCS profiles of human CTCs (GSE75367; GSE74639; GSE60407; GSE67980; GSE114704; GSE144494) and 500 cells from Illumina 10x for human PBMC raw counts were merged in python v.3.7.3 using the pandas library. Only common genes between datasets were analyzed. Seurat objects were created from PBMCs and CTCs. Following this step, data were analyzed using the standard Seurat pipeline (Supplementary Table 33).
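The step of merging CTC and PBMC count tables while keeping only genes common to all datasets amounts to an inner join on gene names. The study used the pandas library for this; the sketch below shows the same idea with the Python standard library only, so that it is self-contained. The data layout (gene name mapped to a list of per-cell counts), the function name and the toy counts are assumptions made for illustration.

```python
def merge_on_common_genes(datasets):
    """Merge per-dataset count tables, keeping only genes present in all of them.

    datasets -- dict mapping dataset name to {gene: [count for each cell]}
    Returns (merged, labels): merged maps each shared gene to the concatenated
    per-cell counts; labels gives the source dataset of each merged cell.
    """
    if not datasets:
        raise ValueError("at least one dataset is required")
    # Inner join: intersect the gene sets of every dataset.
    common = set.intersection(*(set(table) for table in datasets.values()))
    merged = {gene: [] for gene in sorted(common)}
    labels = []
    for name, table in datasets.items():
        n_cells = len(next(iter(table.values()))) if table else 0
        labels.extend([name] * n_cells)
        for gene in merged:
            merged[gene].extend(table[gene])
    return merged, labels


# Toy counts for illustration only (not data from the cited GEO series).
datasets = {
    "ctc": {"EPCAM": [5, 3], "KRT8": [2, 1], "HBB": [0, 0]},
    "pbmc": {"EPCAM": [0], "HBB": [9], "PTPRC": [7]},
}
merged, labels = merge_on_common_genes(datasets)
print(merged)
print(labels)
```

With pandas, an equivalent result would typically be obtained by concatenating gene-by-cell data frames column-wise with an inner join on the gene index.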
For direct comparison of human CTCs and mouse tCZCs, 15,328 orthologs were identified and profiles processed through the standard Seurat workflow that includes a per-cell normalization of each gene expression count. Enrichment of hemoglobin gene expression was assessed in UCell, with enrichment scores generated using a two-tailed Mann-Whitney U statistic.
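To illustrate how a Mann-Whitney U statistic can serve as a per-cell signature enrichment score, the following sketch ranks the genes within one cell and converts the rank sum of the signature genes into a score between 0 and 1. It is a simplified illustration of the principle, not the UCell implementation: the normalization by n(N − n), the absence of any rank cap, the function names and the toy expression values are all assumptions introduced here.

```python
def average_ranks_descending(expression):
    """Rank genes from highest (rank 1) to lowest; tied genes share the mean rank."""
    ordered = sorted(expression, key=lambda gene: -expression[gene])
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        # Extend j to cover the run of genes tied with ordered[i].
        while j + 1 < len(ordered) and expression[ordered[j + 1]] == expression[ordered[i]]:
            j += 1
        mean_rank = ((i + 1) + (j + 1)) / 2
        for k in range(i, j + 1):
            ranks[ordered[k]] = mean_rank
        i = j + 1
    return ranks


def signature_u_score(expression, signature):
    """Score one cell: 1 if signature genes are top ranked, 0 if bottom ranked."""
    genes = [g for g in signature if g in expression]
    n, total = len(genes), len(expression)
    if n == 0 or n == total:
        raise ValueError("signature must be a non-empty proper subset of the genes")
    ranks = average_ranks_descending(expression)
    # Mann-Whitney U of the signature genes against all other genes.
    u = sum(ranks[g] for g in genes) - n * (n + 1) / 2
    # U ranges from 0 to n * (total - n); rescale so that higher means enriched.
    return 1 - u / (n * (total - n))


# Toy expression values for illustration only (not data from the study).
hemoglobin = ["Hba-a1", "Hbb-bs"]
erythroid_like = {"Hba-a1": 10.0, "Hbb-bs": 9.0, "Actb": 1.0, "Gapdh": 0.5}
epithelial_like = {"Hba-a1": 0.0, "Hbb-bs": 0.1, "Actb": 4.0, "Gapdh": 3.0}
print(signature_u_score(erythroid_like, hemoglobin))
print(signature_u_score(epithelial_like, hemoglobin))
```

Because the score depends only on within-cell ranks, it is insensitive to differences in sequencing depth between cells or datasets.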

Statistics and reproducibility.
Clinical and mutation/CNA data were from the Cancer Genome Atlas via cBioportal, selecting for studies included as part of the Pan-Cancer Atlas 55 . Gene expression data was from Xenabrowser 56 . dN/dS was calculated using the dNdScv R package 18 . t-distributed stochastic neighbor embedding (t-SNE) was performed with the sklearn library in python using the 'BarnesHut' method, with a perplexity of 15, learning rate of 1,000 and 1,000 iterations. Contours were drawn as kernel density estimates of the density of nonsynonymous NALCN mutations for each cancer subtype with a significant (P ≤ 0.05) dN/dS score individually.
NALCN cryo-electron microscopy 12 structure 6XIW was downloaded from pdb, simulated in MemprotMD 22,23 , energy minimized using the steepest descents for 5,000 steps and converted to a MARTINI coarse-grained 57 representation embedded in a 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine bilayer with 575 lipids. The membrane was self-assembled by position restraining the protein and simulating for 200 ns to allow the membrane to form. The CG system was simulated for 1,000 ns before converting back to atomistic detail using CG2AT. The resultant atomistic membrane system was simulated in fully atomistic detail using the gromos53a6 forcefield for 400 ns. Mutational impact on pore size was calculated using HOLE 21 . Mutations were introduced into the simulated NALCN structure using the modeler mutation optimization protocol, and the resultant HOLE pore profiles aligned on their selectivity filters (Supplementary Table 28).
Spatial clustering of mutations was performed by calculating the distance between the center of mass of each pair of mutated residues (in the wild-type structure), and grouping residues into clusters with distances between any one pair of residues <12 Å. We calculated an expected distribution by randomly sampling the structure 100,000 times for the same number of mutations as observed overall. Comparison of the observed clusters with the distribution of random samples was used to calculate a P value. All code generated for spatial clustering and analysis, with a workable example, is available at: https://github.com/shorthouse-mrc/NALCN. Unsupervised hierarchical clustering was performed using Morpheus (https://software.broadinstitute.org/morpheus) and gene set enrichment using g:Profiler (version e104_eg51_p15_3922dba). Tissue type deconvolution was performed using xCell (http://xCell.ucsf.edu/) (Supplementary Table 28).
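The spatial clustering test described above can be sketched as a single-linkage grouping followed by a permutation test. This is a minimal illustration under stated assumptions, not the authors' code (which is available in the repository named above): it assumes that residues are joined into one cluster whenever any pair lies closer than 12 Å, that the largest cluster size is the test statistic, and that the P value is the add-one-corrected fraction of random samples at least as clustered as the observation. The toy coordinates and the reduced number of samples are for illustration only.

```python
import math
import random


def cluster_residues(coords, cutoff=12.0):
    """Single-linkage clusters: residues are joined if any pair is < cutoff apart."""
    ids = list(coords)
    parent = {r: r for r in ids}

    def find(r):
        # Union-find lookup with path compression.
        while parent[r] != r:
            parent[r] = parent[parent[r]]
            r = parent[r]
        return r

    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if math.dist(coords[a], coords[b]) < cutoff:
                parent[find(a)] = find(b)
    clusters = {}
    for r in ids:
        clusters.setdefault(find(r), []).append(r)
    return list(clusters.values())


def largest_cluster(coords, cutoff=12.0):
    """Size of the largest cluster (the test statistic assumed in this sketch)."""
    return max(len(c) for c in cluster_residues(coords, cutoff))


def clustering_p_value(structure, mutated, n_samples=100_000, cutoff=12.0, seed=0):
    """Compare observed clustering with random samples of equally many residues."""
    rng = random.Random(seed)
    observed = largest_cluster({r: structure[r] for r in mutated}, cutoff)
    residues = list(structure)
    hits = 0
    for _ in range(n_samples):
        sample = rng.sample(residues, len(mutated))
        if largest_cluster({r: structure[r] for r in sample}, cutoff) >= observed:
            hits += 1
    # Add-one correction avoids reporting a P value of exactly zero.
    return observed, (hits + 1) / (n_samples + 1)


# Toy structure: residue centers of mass in angstroms (hypothetical coordinates).
structure = {i: (100.0 * i, 0.0, 0.0) for i in range(10)}
structure[1] = (5.0, 0.0, 0.0)
structure[2] = (10.0, 0.0, 0.0)
observed, p_value = clustering_p_value(structure, mutated=[0, 1, 2], n_samples=2000)
print(observed, round(p_value, 4))
```

In this toy case only one of the 120 possible three-residue samples reproduces the observed cluster, so the estimated P value is small; a real analysis would use the full set of residue coordinates and the 100,000 samples stated in the text.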
Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
All the raw sequencing data have been deposited in the Gene Expression Omnibus with the following accession numbers: mouse RNA-seq of tumors and metastases (GSE210134) and mouse single cell RNA-seq of CZCs, PBMCs, tumors, metastases and solid tissues (GSE210134). Murine Prom1 + gastric mucosa and adenocarcinoma data GEO accession number: GSE78076. NALCN mutation and t-SNE plots were generated with Pan-Cancer Atlas data from TCGA via cBioPortal and Xenabrowser. Cancer staging data were generated with Pan-Cancer Atlas data from TCGA and COSMIC data. NALCN structure 6XIW was from pdb. Human CTC and gene signature datasets are from the following GEO accession numbers: GSE75367, GSE74639, GSE60407, GSE67980, GSE114704, GSE144494. Human PBMC data are from Illumina 10x (10k Human PBMCs, 3' v3.1, Chromium X). Source data are provided with this paper.

Code availability
All bulk and single-cell analyses and visualizations were performed using R software (v.3.6.1), R studio (v.1.3.1093) and Python (v.3.9). Details on specific packages are included in the Methods section and also at https://github.com/shorthouse-mrc/NALCN. No custom code was used for any part of the data processing or analysis.