Genetic evidence supports the development of SLC26A9 targeting therapies for the treatment of lung disease

Over 400 variants in the cystic fibrosis (CF) transmembrane conductance regulator (CFTR) are CF-causing. CFTR modulators target variants to improve lung function, but marked variability in response exists and current therapies do not address all CF-causing variants highlighting unmet needs. Alternative epithelial ion channel/transporters such as SLC26A9 could compensate for CFTR dysfunction, providing therapeutic targets that may benefit all individuals with CF. We investigate the relationship between rs7512462, a marker of SLC26A9 activity, and lung function pre- and post-treatment with CFTR modulators in Canadian and US CF cohorts, in the general population, and in those with chronic obstructive pulmonary disease (COPD). Rs7512462 CC genotype is associated with greater lung function in CF individuals with minimal function variants (for which there are currently no approved therapies; p = 0.008); and for gating (p = 0.033) and p.Phe508del/ p.Phe508del (p = 0.006) genotypes upon treatment with CFTR modulators. In parallel, human nasal epithelia with CC and p.Phe508del/p.Phe508del after Ussing chamber analysis of a combination of approved and experimental modulator treatments show greater CFTR function (p = 0.0022). Beyond CF, rs7512462 is associated with peak expiratory flow in a meta-analysis of the UK Biobank and Spirometa Consortium (p = 2.74 × 10−44) and provides p = 0.0891 in an analysis of COPD case-control status in the UK Biobank defined by spirometry. These findings support SLC26A9 as a therapeutic target to improve lung function for all people with CF and in individuals with other obstructive lung diseases.


INTRODUCTION
Cystic Fibrosis [CF (MIM:219700)] is a common life-limiting autosomal recessive genetic disease caused by pathogenic variants in the cystic fibrosis transmembrane conductance regulator (CFTR; MIM:602421). CF affects multiple organs; morbidity in the pancreas and intestine are seen at birth 1-3 , while progressive lung disease is experienced throughout the course of disease and is the major cause of morbidity and mortality. Variability in disease severity across the affected organs is due, in part, to variation in other genes, referred to as modifier genes. Modifier genes impact phenotypic expressivity in the presence of a dysfunctional major causal gene, for example, CFTR 1,4-7 .
CFTR is localized to the cell membrane of epithelial cells and functions as an ion channel. Over 400 variants have been reported to cause CF through variable effects on protein function 8 (http://cftr2.org). CF-causing mutations impact apical chloride and bicarbonate transport, altering hydration and pH of airway surface liquid resulting in viscous mucus. Accumulation of this viscous mucus leads to cycles of infection and inflammation, obstructing and damaging the airways, resulting in progressive lung damage and end-stage lung disease 9 .
Developments in CF therapeutics over the past decade have been transformative, altering the management paradigm from treating the downstream consequences of dysfunctional CFTR and delaying the progression of the disease, to treating the basic defect in an individual's CFTR itself: precision medicine. New drugs enhance CFTR function by directly targeting the different defects in the protein. For example, individuals with gating mutations have access to a highly effective modulator, ivacaftor (IVA), which is a potentiator that increases the opening probability of CFTR to 1 Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada. 2 Biostatistics Division, Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada. 3 Program in Translational Medicine, The Hospital for Sick Children, Toronto, ON, Canada. 4 Centre de Recherche du Centre Hospitalier de l'Université de Montréal (CRCHUM), Montréal, QC, Canada. 5 Department of Medicine, Faculty of Medicine, Université de Montréal, Montréal, QC, Canada. 6 Alberta Children's Hospital, Calgary, AB, Canada. 7 British Columbia Children's Hospital, Vancouver, BC, Canada. 8 The Children's Hospital of Eastern Ontario, Ottawa, ON, Canada. 9 The Children's Hospital, London Health Science Centre, London, ON, Canada. 10 Foothills Medical Centre, Calgary, AB, Canada. 11 Kingston Health Sciences Centre, Kingston, ON, Canada. 12 Centre de recherche de l'Institut universitaire de cardiologie et de pneumologie de Québec-Université Laval, Québec City, QC, Canada. 13 IWK Health Centre, Halifax, NS, Canada. aid chloride and bicarbonate ion transport in CF epithelia 10 . The most common CF causing allele Phe508del (c.1521_1523delCTT; p.Phe508del) 11 displays minimal CFTR at the apical membrane due to processing defects 5,6 , and once at the cell surface the Phe508del protein exhibits reduced opening probability and stability. Pharmaceuticals targeting this defect include a combination therapy of IVA and a CFTR corrector lumacaftor (LUM) or tezacaftor (TEZ) that improves the Phe508del processing to increase cell surface localized protein as well as channel gating. More recently, triple combination therapy of another corrector, elexacaftor, combined with tezacaftor and ivacaftor (ETI) has been approved in the United States (US) for individuals with at least one Phe508del aged 6 and above or other mutations responsive to Trikafta (https://www.cff.org/Trials/Pipeline/details/10150/Elexacaf tor-tezacaftor-ivacaftor-Trikafta). Elexacaftor stabilizes the protein within the cell membrane, resulting in greater improvements in lung function over LUM/IVA or TEZ/IVA alone 12,13 . ETI is now indicated by the FDA for 90% of individuals with CF (https://www. fda.gov/news-events/press-announcements/fda-approves-new-br eakthrough-therapy-cystic-fibrosis).
Although significant progress has been made in the development of pharmaceuticals for precision medicine in CF, several challenges remain. First, not all individuals with CFTR genotypes for which eligible pharmaceuticals are available respond to those treatments. Second, for those who do show a positive improvement in lung function, there is significant variability in the response [12][13][14][15] , which could be augmented with additional therapies. Third, there are loss of function alleles that cannot be addressed using the current paradigm, and therefore an alternative approach to therapy beyond potentiators and correctors of CFTR is necessary.
Several alternative approaches are being actively pursued, including CFTR gene restoration and/or correction or alternative targets 9 . Conceptually, alternative targets to CFTR aim to compensate for the abnormal dehydrated airway surface liquid that results from dysfunctional CFTR by modulating other ion channels, transporters and pumps [16][17][18][19][20] . This would provide therapeutic options for those individuals with genotypes that do not produce CFTR protein and could assuage limited responses to existing CFTR modulators.
The clinical success rate of drugs in development is appreciably higher when there is human genetic evidence that supports a drug target 21 . Genome-wide modifier gene studies in CF have aimed to identify genetic loci that impact disease severity in the presence of CFTR dysfunction in a hypothesis-free approach. Identified in a genome-wide association study (GWAS) of intestinal obstruction (meconium ileus) in CF 7 , the C allele of rs7512462 in intron 5 of Solute Carrier Family 26 member 9 (SLC26A9) (chr1:205899595, GRCh37) has since consistently demonstrated a beneficial effect for several CF co-morbidities, including in the exocrine 2,3 and endocrine pancreas 4,22 . These co-morbidities appear to originate from pre-natal dysfunction in the CF pancreas 1,3,23 . According to the Genotype Tissue Expression project 24 (GTEx; http://www.gtexportal.org/home/), rs7512462 is an expression quantitative locus (eQTL) for SLC26A9 where the C allele is associated with greater expression in the adult pancreas (p = 1.2 × 10 −5 , normalized expression effect size = 0.27). Rs7512462 and additional SLC26A9 eQTLs in the region of chr1: 205,806,897-206,006,897(GRCh37) colocalize with the meconium ileus CF GWAS summary statistics 1 , supporting gene expression variation is responsible for the observed CF GWAS finding.
Given the importance of attenuating progressive lung disease in CF, there was significant interest in investigating the contribution of SLC26A9 to lung function despite a lack of association evidence with rs7512462 across a broad CF population, including the largest CF GWAS of lung function to date 5 . SLC26A9 is an anion chloride channel/transporter in epithelial cells that contributes to constitutive apical chloride conductance and enhances cAMP-regulated CFTR currents [25][26][27][28] and Slc26a9 −/− /Cftr −/− mice show worse postweaning survival over Cftr −/− models 29 . Rs7512462 appears to associate with both the residual and activated current in lung cells 30,31 . In individuals with CFTR variants that maintain apical membrane localization, rs7512462 showed evidence of association with CF lung function 31 and the relationship was greatly enhanced upon treatment with the CFTR modulator IVA where the C allele was associated with greater lung function improvement 31,32 . That rs7512462 was not shown to be a strong modifier of CF lung disease 5,31 in the absence of treatment with CFTR correctors but is associated with lung function in a non-CF population study 9,33 is consistent with functional studies, where differential SLC26A9 interaction with wild-type CFTR and with Phe508del-CFTR in human bronchial epithelia (HBE) has been reported 25,31 .
The functional and CF population studies highlight the potential of SLC26A9 as a target to improve CF outcomes across the affected organs and to improve response to approved CFTR modulators. However, several questions remain including whether population studies support SLC26A9 as providing alternative chloride transport in individuals with genotypes for which there are no approved therapies; whether enhancing SLC26A9 will improve lung function response to CFTR modulators targeting the most common Phe508del variant; which patient-specific cellbased models of the airway demonstrate SLC26A9 expression; and whether enhancing SLC26A9 could benefit other obstructive lung diseases such as chronic obstructive pulmonary disease (COPD) in light of the rs7512462 association with lung function in non-CF population studies 33 .
Through continued recruitment into the Canadian CF Gene Modifier Study (CGMS; a registry-based observational genetics study) and in collaboration with the US Part B Cystic Fibrosis Foundation Therapeutics PROSPECT observational study of modulator treatment 34 (https://clinicaltrials.gov/ct2/show/NCT02477319), we investigate the association between rs7512462 and (1) lung function in individuals with CF with different CFTR genotypes including variants that result in no CFTR protein product (i.e. minimal function alleles); (2) individual lung function responses to CFTR modulators IVA and LUM/IVA; (3) CFTR function in primary cultured human nasal epithelia (HNE) from individuals with CF in response to approved and experimental CFTR modulators; and (4) other phenotypes in non-CF populations to inform the clinical utility of SLC26A9 modulators beyond applications in CF.

RESULTS
The CGMS has enrolled 3257 participants from 9 provinces and 35 clinics across Canada with a range of CFTR genotypes that are reflective of the Canadian CF population. To investigate the association between rs7512462 and pre-treatment lung function, we specifically study the subgroups of the CGMS who are: (1) homozygous for Phe508del; (2) carrying at least one copy of the G551D (c.1652G > A;p.Gly551Asp) variant or another gating variant approved for IVA; and (3) individuals with two minimal function (MF) variants in trans (https://www.cff.org/research-clinical-trials/ types-cftr-mutations) for which there are no CFTR modulators approved (these include nonsense, splicing and small indel variants) (Fig. 1). A subset of these individuals homozygous for Phe508del or having at least one gating mutation have been prescribed a CFTR modulator (Table 1; Fig. 1) and we investigate the lung function response to LUM/IVA in this CGMS subset and in an independent sample of 91 participants from PROSPECT, an observational study in the US (Supplementary Table 1 for sample  exclusion). Participants from the PROSPECT study were younger and healthier than the Canadian cohort (as measured by forced expiratory volume in one second (FEV 1 ) percent predicted (FEV 1pp )) at treatment initiation (Table 1).
Rs7512462 is associated with CF lung function in the absence of treatment Using the International CF Gene Modifier consortium lung phenotype, Saknorm 35 (Methods), we compared pre-treatment lung function across the three CFTR genotype groups (Fig. 1). Saknorm measures differ significantly between individuals with a gating variant (n = 89) versus those who are homozygous for Phe508del (n = 1,266; effect size = 0.20, p = 0.030), while individuals with two MF variants (n = 63) do not differ in Saknorm from Phe508del homozygotes (effect size = −0.11, p = 0.27).
In CGMS participants with at least one G551D variant (n = 79), rs7512462 did not reach statistical significance at the 0.05 level with Saknorm (effect size = 0.185, p = 0.225, Supplementary Table  2). Meta-combining the CGMS association evidence with association results from four independent cohortsa cohort from the French gene modifier study (FGS) with n = 49 participants 32 and three cohorts with n = 272 participants reported in Eastman et al. 36 results in an effect size of 0.11 with p = 0.05 and an effect size = 0.12 with p = 0.036 from inverse variance weighted and from sample size weighted meta-analyses, respectively (Supplementary Fig. 1; Supplementary Table 2). Meta-analysis of the rs7512462 Saknorm association from the FGS 32 and CGMS cohorts with at least one gating variant approved for ivacaftor have similar but more variable effect size than that from the G551D subset (effect size = 0.12 with p = 0.18; Supplementary Table 2).
In individuals from the CGMS who are homozygous Phe508del (n = 1266), each C allele is associated with an increase in lung function, albeit with a smaller effect size (effect size = 0.069, p = 0.028, Fig. 2a; Supplementary Fig. 2). Meta-analysis of this CGMS association with an rs7512462 Saknorm association in n = 1804 individuals who are Phe508del/Phe508del from the FGS 32 also show association in both inverse variance weighted (effect size = 0.050, p = 0.021) and sample size weighted (effect size = 0.046, p = 0.037) meta-analyses (Supplementary Fig. 2; Supplementary Table 2). Constructing haplotypes in the CGMS in individuals homozygous for Phe508del as in Lam et al. 37 does not provide evidence for significant low risk (LR) or high risk (HR) haplotypes contributing to Saknorm variation (Supplementary Table 3) and does not provide greater power than using the single SNP rs7512462 to mark the locus for association with lung function (Supplementary Note; Supplementary Table 3).
Importantly, in individuals with two MF variants, the CC genotype is associated with improved lung function when compared to individuals with the TT and TC genotype (effect size = 0.45, p = 0.0083, Fig. 2a). This is consistent with the hypothesis that for rs7512462 marking SLC26A9 activity, SLC26A9 may be providing alternative chloride transport properties in individuals with CFTR variants for whom no currently approved therapies exist.
When combining CGMS participants from the three genotype groups in a joint multivariable regression model to assess the association with rs7512462, the C allele is associated with greater lung function (p = 0.017 and 0.022 for additive and recessive models, respectively, Table 2). Fitting an interaction term in this multivariable model, the C allele does not demonstrate a statistically significant difference in effect size depending on one's CFTR genotype (interaction p-values 0.90 and 0.46 for additive and recessive models, respectively).
Rs7512462 C allele is associated with improved response to CFTR modulators Newly recruited participants into the CGMS on IVA (n = 23) were, on average, older and had worse baseline lung function than those included in Strug et al. 31 (Table 1). Despite the difference in baseline characteristics and disease severity, we do observe additional supportive evidence for improved lung function response when combining the newly and previously recruited participants (difference in FEV 1pp pre and post treatment initiation) in individuals with the CC genotype upon treatment with IVA (effect size = 9.9, p = 0.03, Table 3; Fig. 2b). Results from a linear mixed-effect model, which accounts for the longitudinal nature of the data, provides consistent results (effect size = 12.76, p = 0.025; Supplementary Table 4).
Through ongoing recruitment in the CGMS there were 104 participants prescribed LUM/IVA clinically (Table 1). In collaboration with the US PROSPECT observational study, we investigated lung function response to LUM/IVA alone (n = 91) and in a combined sample of n = 195 individuals homozygous for Phe508del stratified by rs7512462. Despite minimal clinical response to LUM/IVA reported on average 14 , we do observe a significant improvement in lung function response in those with the CC genotype (p = 0.002 in US PROSPECT and p = 0.006 for combined, respectively; Table 3), akin to observations for IVA ( Fig. 2b and 31 ), and in studies of improved CFTR function with the rs7512462 C allele in HNE 30 and HBE 18,31 cells obtained from individuals homozygous for Phe508del. Thus, the rs7512462 genotype shows improved response to the LUM/IVA combination therapy in cohorts of CF patients who were monitored observationally during their real-world experience with the approved modulators.
We investigated CFTR-mediated current in 36 cultured HNE from CGMS participants homozygous for Phe508del upon treatment with VX-770+VX-809 (as in Kmit et al. 30 , corresponding to the IVA/LUM combination; VX-770 applied acutely) and upon treatment with VX-770+VX-809+ an amplifier under experimental investigation 38 (Fig. 3). We used the HNE from the earliest passage available to us (P2) to align with Kmit et al. 30 . Defining the treatment response as the difference in ΔIeq -forskolin from DMSO vehicle to VX-770 + VX809 (n = 36) or VX-770+ VX-809+ amplifier (n = 30), we see a significant improvement in CFTR function in the CC group (effect size = −1.934, p = 0.0022, Table 4) in HNE with VX-770+VX-809+ amplifier. The increase in CFTR function in the CC group in the cultured HNE with VX-809 does not reach statistical significance (effect size = −0.406, p = 0.338, Table 4) unlike previous reports in HNE and in HBE models, although the direction of effect is consistent. This difference in effect size made us question the SLC26A9 gene expression profile across different lung tissue models, especially given the low passage of the primary HNE cultured cells studied by Kmit et al. 30 .

SLC26A9 expression differs across airway models
While investigating eQTLs in various airway tissue models, we observed that there is no evidence supporting rs7512462 as an eQTL in the lungs from GTEx v8 (p = 0.71) or from RNA obtained from naïve HNE of individuals with CF (p = 0.64, n = 79) despite its association with residual and activated current in Ussing chamber studies 30,31 . SLC26A9 expression appears generally low across several different lung model systems we investigated, with an average of 1.34 transcripts per million (TPM) (Supplementary Fig.  3a). We generated a resource that contains the transcriptomes from paired cultured and fresh naïve HNE and HBE tissue on the same individuals (Methods; GEO ID: GSE172232)). Of the primary lung cell models we investigated, the greatest expression is in the naïve HBE cells (TPM = 1.94; Supplementary Fig. 3a), and this expression level is 2.1-fold greater than in the naïve HNE (p = 0.04, paired analysis in n = 17 HNE-HBE naïve pairs). Unfortunately, naïve HBE cell models are not generally accessible for personalized medicine approaches 39 , and cultured models are the norm. Here cultured HBE show mean expression with TPM = 1.71 that is 2.5-fold greater than the CF cultured HNE (p = 0.02, paired analysis for n = 16 HNE-HBE cultured pairs), although there is some indication that a reduction in culturing time results in greater SLC26A9 expression in the HNE ( Supplementary Fig. 3b).
Using single-cell RNA-seq data catalogued in the Human Lung Cell Atlas 40 , we investigated the expression of SLC26A9 and CFTR across cell types. Within lung cells, both genes are expressed in the epithelial cells of the alveoli and airway, particularly within alveolar type 2 (AT2) cells and club cells (Fig. 4). These cell types are also supported by single-cell Human Protein Atlas data 41 (http://www.proteinatlas.org/), which reports expression in AT2 cells (normalized TPM of 50.1 for SLC26A9 and 65.6 for CFTR) and club cells (normalized TPM of 2.7 for SLC26A9 and 17.8 for CFTR). Furthermore, single-cell GTEx data 42,43 (http://www.gtexportal. org/home) shows concordant evidence where, SLC26A9 and CFTR reads are observed in an appreciable fraction of AT2 cells (24.59% CFTR, 12.41% SLC26A9) and club cells (28.79% CFTR, 5.45% SLC26A9). We then assessed the evidence for CFTR-SLC26A9 coexpression among the club, basal and alveolar epithelial cell types identified in the Human Lung Atlas study 40 . Both the Spearman's correlation and the zero-inflated negative binomial model indicated significant association between the CFTR and the SLC26A9 read counts for AT2 cells ( Table 5). The model estimate suggests positive co-expression between the two genes (log ratio = 0.00035, p = 0.00974, Table 5).

Phenome-wide Association Study (PheWAS) of rs7512462 and colocalization analysis
We used the GWAS ATLAS database 44 that includes 4756 GWAS from 473 unique studies with 3302 unique traits, and the UK Biobank resource to investigate other non-CF traits associated with rs7512462. The 10 traits with the smallest reported p-values are listed in Table 6, four of which are respiratory phenotypes from the UK Biobank and Spirometa consortium 33 : peak expiratory flow (PEF), forced expiratory volume in one second and forced vital capacity (FEV 1 /FVC) ratio. Saknorm, the lung function measurement used in the CF GWAS, is also calculated from FEV 1 5,35 . The list of significant phenotypes also includes our own CF modifier gene study where we identified rs7512462 as a modifier of meconium ileus, with evidence of an exocrine pancreatic origin 1 (Table 6). Interestingly, an earlier age at menarche (which is associated with weight) and a higher male waist circumference and waist-hip ratio are also associated with the rs7512462 C allele. The association of rs7512462 with type 1 diabetes and the weight-related phenotypes suggest that the role rs7512462 is marking in these reproductive and metabolic phenotypes may likewise trace back to an exocrine pancreatic origin. Significant colocalization analysis (column 'Colocalization P-value', Table 6) calculated using the Simple Sum 1,45 implemented in LocusFocus 46 between the meconium ileus CF GWAS statistics 1 and the summary statistics from the PheWAS associated traits supports that the same genetic variation contributes to the different traits. Airflow limitation is a key diagnostic feature of COPD 47 . In the UK Biobank, we analyzed n = 22,071 of 263,461 participants with moderate to severe airflow limitation defined by the Global Initiative for Obstructive Lung Disease (GOLD) criteria 48 . Although rs7512462 is a modifier of PEF and FEV 1 /FVC (p = 2.74 × 10 −44 and p = 4.11 × 10 −5 , respectively) from Shrine et al. 33 (Table 6), the evidence is not as conclusive for FEV 1 pp and COPD case-control status in the UK Biobank (p = 0.56, 0.0891, respectively; Fig. 5).

DISCUSSION
The availability of CFTR modulators is altering care for many individuals with CF, although variation in response is apparent, partially due to individual genetic backgrounds. One such genetic factor is SLC26A9, which contributes to early onset CF comorbidities in the pancreas and intestine 1,4,22,37 . Unlike for these early-onset  31 , the treatment response for IVA is defined as the difference between the mean posttreatment FEV 1pp within 15 to 400 days and FEV 1pp baseline. Response for LUM/IVA is defined as the difference in FEV 1pp between the first post treatment within 5 to 7 months to the baseline 65 (Methods). The center line, upper and lower box limits of boxplots correspond to the median, the third quartile (Q3) and the first quartiles (Q1), respectively; whiskers reflect the 1.5x interquartile range (IQR), i.e., bottom whisker is Q1-1.5xIQR and upper whisker is Q3 + 1.5 x IQR. Each dot represents an individual measurement.
phenotypes where the rs71512462 SNP is a SLC26A9 eQTL in the pancreas and SLC26A9 and CFTR appear to contribute to these phenotypes independently 1,7,31 , in the lungs the mechanism that is being marked by rs7512462 is not immediately obvious from the genetic data. Meanwhile, some have also questioned whether rs7512462 is even associated with lung function 36 .
To address the connection between rs7512462 and SLC26A9, we integrate evidence from population and functional studies. The rs7512462 SNP was shown to associate with lung function or lung response to CFTR-modulators in individuals that carry CFTR variants that result in apical membrane localization for CFTR 31 and this is further supported here. Although rs7512462 does not show eQTL evidence based on lung tissue expression, SLC26A9 is present in HBEs 28,49,50 (Supplementary Fig. 3) and Larsen et al. 28 show that SLC26A9 is a major contributor to constitutive anion secretion across HBEs. In Strug et al. 31 Fig. 3B prior to CFTR correction in Phe508del/Phe508del HBE monolayers, rs7512462 shows evidence of association with residual forskolin-stimulated current (P = 0.09, −0.17 µA/cm 2 ΔI eq -forskolin per C protective allele) in Ussing chamber studies. A relation between rs7512462 genotype and transepithelial current in uncorrected Phe508del CFTR was also shown by others in HNE cultures with Phe508del CFTR genotypes 30 .
Experimental evidence for a relationship between the rs7512462 genotype and CFTR function in airway cell models and here in CF participant responses was also observed by rescuing the traffic-defective Phe508del with corrector modulators 30,31 . Published in vitro and cell studies have demonstrated a CFTR-SLC26A9 interaction in lung cells and together with recent work 28,49,50 and references within, evidence is emerging that SLC26A9 may also stabilize CFTR. Therefore, we interpret that the contribution of SLC26A9 that rs7512462 is marking for lung phenotypes may be from both anion channel activity and enhancement of apical membrane-bound CFTR.
To address the uncertainty in evidence for rs7512462 association with lung function 36 , here we extend several genetic association studies. We investigate the role of rs7512462 genotype in CF patients with different types of CFTR mutations and in treatment response to CFTR modulators both in patient populations and in airway models. We align the findings and previously published work with gene expression patterns including understanding of cell type-specific SLC26A9 and CFTR co-expression, and now also consider the role of rs751242 in the phenome and, in particular, in lung function measurements in non-CF populations. Significant human genetic evidence that supports a role for SLC26A9 in CF disease severity and CFTR modulator response is accumulating. Here we provide a comprehensive investigation of the role of rs7512462 as a marker of SLC26A9 activity in CF and non-CF populations.
The relationship between the SLC26A9 rs7512462 marker and lung function in three types of CFTR genotypes was revealing. Individuals with the CF-causing G551D gating variant show association with rs7512462. Gating-deficient CFTR protein exhibits epithelial apical cell membrane localization with reduced opening probability, resulting in reduced epithelial chloride and bicarbonate secretion characteristic of CF 51 . In contrast, individuals homozygous for Phe508del, comprising the majority of CF cases, show more modest evidence of association with rs7512462. Phe508del-CFTR is rapidly degraded intracellularly with minimal surface membrane localization. Meanwhile, we demonstrated that individuals with MF variants do show rs7512462 association but with the absence of CFTR protein cannot respond to CFTR modulator therapy. The studies of MF alleles support that at least some aspect of lung function contribution by SLC26A9 can be independent of CFTR. These investigations align with several previous studies highlighting the potential for SLC26A9 to provide (alternative) chloride transport that functions independently of CFTR 25,26 .
CF GWAS of early-onset CF pancreatic and intestinal phenotypes were CFTR genotype independent [1][2][3]7 ; that is, those homozygous for Phe508del still demonstrated genome-wide significance with benefit from the C allele of rs7512462, a marker  of SLC26A9 gene expression in CF phenotypes that show pre-natal onset (for example, meconium ileus). This may reflect that SLC26A9 and CFTR may act independently at this stage of development and is consistent with the observation that Slc26a9 mRNA was high in the murine pancreas in the embryonic stages of development when Cftr was low 31 . Given the small sample size, the rs7512462 association with lung function in individuals with two MF alleles requires further investigation and independent replication. However, a consistent beneficial effect of the CC rs7512462 genotype across several of our independent studies and disparate outcomes provides support that modulation of SLC26A9 can provide alternative chloride transport and could be a therapeutic target to improve lung function in individuals with any CFTR genotype. The modulation of alternative channels, transporters, and pumps to compensate for dysfunctional CFTR 19,52 , and in particular SLC26A9 9,20,31 , would provide a mutation-agnostic approach and address the current unmet need of CF individuals with MF alleles. Given other published reports assessing the rs7512462 relationship with lung function in individuals with G551D variants 31,32,36 , we used meta-analysis to summarize the current state of the evidence. Together, the weight of evidence supports a relationship between the SLC26A9 marker, rs7512462, and lung function in individuals with a G551D variant where there is cell-surface localized CFTR protein, but reduced activity. In an expanded CGMS cohort, we also replicated the enhancing effect of the rs7512462 C allele upon the rescue of gating variants with IVA, suggesting that the pre-treatment effect may be reflective of constitutive activity and/or attributable to SLC26A9-aided residual CFTR function. Consistent with either is the suggestive association we observed between rs7512462 and lung function prior to treatment with LUM/IVA in individuals from the CGMS homozygous for Phe508del,  with a reduced effect size possibly reflective of reduced cell membrane localization and interaction. LUM/IVA was the first approved modulator for those homozygous for Phe508del, on the basis of reduced pulmonary exacerbations and lung function response. However, the average improvement in lung function in clinical trials 14 and observational studies 34 has been modest. To investigate whether the improvements in lung function from LUM/IVA are modified by the rs7512462 genotype, we also genotyped the DNA from participants of the US observational LUM/IVA study, PROSPECT (https:// clinicaltrials.gov/ct2/show/NCT02477319), and demonstrated that although modest in its average improvement in lung function, those with the CC rs7512462 genotype did exhibit the greatest benefit. This finding of rs7512462 impact on clinical response in a real-world setting of treated patients is consistent with published studies of CFTR function in primary HBE and HNE cells from individuals homozygous for Phe508del by us 31 and others 30 , respectively. Together these findings across the two treated CFTR genotype groups replicate and expand previous reports 25,26,30,31 that rs7512462 correlates with improved CFTR function.
We did evaluate CFTR function in nasal brushes from 36 individuals homozygous for Phe508del in Ussing chamber studies following 24 h exposure to vehicle (DMSO), VX-770 with corrector VX-809 (corresponding to IVA/LUM), and a combination of VX-770, VX-809, and an experimental amplifier 38 . We observed a similar trend to that reported in Kmit et al. 30 with VX-809, although it did not reach statistical significance. When CFTR function was, however, augmented with the amplifier, the difference in CFTR function pre-and post-treatment was greatest in the cells with the rs7512462 CC genotype. These functional studies further support the hypothesis that SLC26A9 will likely benefit any therapeutic situation with increased apical surface localized CFTR protein, such as for the latest highly effective modulator treatment, ETI.
Although the population studies provide important insights into the relationship between CFTR and SLC26A9 marked by rs7512462 in vivo, functional studies in cellular and animal models will be necessary to understand the functional relationship between the two channels/transporters 53 and how they may be working together. The expression studies presented here provide guidance on the use of cellular models for SLC26A9 and further highlight potential limitations of cultured HNE for the unique considerations of studying SLC26A9 and CFTR that are distinct from the ones for studies of CFTR alone. Compared to naïve bronchial cells, we observed greater variation and lower expression in naïve nasal cells. Furthermore, the duration of culturing time of either HNE or HBE cells resulted in reduced  40 . a Lung cell expression profiles from three individuals were clustered by nearest-neighbor and visualized using uniform manifold approximation and projection (UMAP). Normalized and log-scaled RNA counts for CFTR and SLC26A9 are plotted for each cell. Tissue compartment as defined by the Human Lung Cell Atlas is shown; expression of both CTFR and SLC26A9 is primarily restricted to epithelial cells, b UMAP clustering of epithelial lung cells only; cell types were annotated in the Human Lung Cell Atlas. c Breakdown of CFTR and SLC26A9 expression by cell type. Cells were grouped as having CFTR read count > 1 (red), SLC26A9 > 1 (blue) or co-expressing both (purple). Cell total varies and is annotated for each cell type.
SLC26A9 expression. Presently, the cultured HBE model appears superior to HNEs for investigations of SLC26A9, although further investigation of culturing conditions and of cell differentiation should be considered.
Interestingly, the most significant SLC26A9 locus SNPs from the CF GWAS 1,7 also associate with lung function measurements in several large international studies: PEF and FEV 1 /FVC ratio in participants from the UK Biobank aged 40-69 9,33 ; PEF and FEV 1 / FVC ratio in the Spirometa consortium 33 ; and for the FEV 1 /FVC ratio in 8-year-olds from the UK10K consortium 54 . These results align with our findings that, after correction of CF-causal CFTR variants with modulators, SLC26A9 locus SNPs are associated with improved lung function and CFTR function.
It is noteworthy that decreased spirometry is diagnostic for other obstructive lung diseases such as COPD 55 . Several studies have reported similar pathobiology cascades between CF and COPD due to dysfunctional CFTR and environmental risk factors for COPD [56][57][58][59][60] . For example, CFTR chloride channels show reduced capacity as a result of tobacco smoke and may result in the mucus obstruction characteristic of COPD 59,61 that is akin to that seen in CF. If SLC26A9 augments chloride transport, SLC26A9 agonists could also be an effective therapeutic strategy for COPD. In support of this premise is the association evidence highlighted here demonstrating rs7512462 is a modifier of lung function in Shrine et al. 33 , with significant colocalization evidence between the CF and their UK Biobank + Spirometa Consortium summary statistics at the SLC26A9 locus, reflective of a common underlying genetic contribution.
A role for SLC26A9 in the CF precision medicine landscape is an exciting prospect. SLC26A9 shows desirable characteristics 9 as an alternative therapeutic target for CF, including the urgent need for options for CF individuals with MF alleles. The association between rs7512462 genotype and response to existing pharmaceuticals indicate the potential to refine personalized combinations of modulators, and there is also support that SLC26A9 agonists may yield benefit to any existing pharmacological or gene correction paradigm in CF, independent of CFTR genotype. Further research should also address the potential for SLC26A9 agonists to improve measures of lung function in non-CF populations.  The colocalization p-value assesses evidence for common genetic contributions between the corresponding study phenotype and the meconium ileus CF GWAS 1 at chr1:205,895,000-205,921,000 locus. The effect allele in all studies is C, and the other allele is T, except for type 1 diabetes, where the effect size is not reported for the SNP in the original summary statistics from the listed publication (PMID). PheWAS significance level is p < 1.05 × 10 −5 after adjusting for 4756 GWAS in the database (alpha 0.05). UKBB refers to an analysis using the UK Biobank. 'Colocalization P-value' represents the p-values from the colocalization analysis calculated here with the CF GWAS summary statistics in Gong et al. 1 for the corresponding phenotype, using LocusFocus 46 calculated on the chr1: 205,895,000-205,925,000 region in hg19. NA in this column reflects a lack of information available to carry out the colocalization analysis.

Ethics approval
The Canadian CF Gene Modifier Study (CGMS) was approved by the Research Ethics Board of the Hospital for Sick Children (# 0020020214 from 2012 to 2019 and #1000065760 from 2019-present) and all participating sub-sites. Written informed consent was obtained from all participants or parents/guardians/substitute decision makers prior to inclusion in the study. The CGMS is also approved by the Research Ethics Board of the Hospital for Sick Children for the usage of public and external data. The US PROSPECT study provides data from a clinical trial registered at https:// clinicaltrial.gov, identifier NCT02477319, and we obtained these data through application to the US CFF at https://www.cff.org/researchers/cffoundation-biorepository.

Study design
The CGMS is a registry-based observational genetics study. The US Part B Cystic Fibrosis Foundation (CFF) Therapeutics PROSPECT study is an observational study of modulator treatment. The current study investigates the relationship between rs7512462 and lung function pre-and post-CFTR modulators in Canadian and US CF cohorts; in the general population; and in a case control study of chronic obstructive pulmonary disease (COPD). Clinical data collected as part of the CGMS include forced expiratory volume in 1 s (FEV 1 ), age, sex, and height which are all obtained from the Canadian CF Patient Data Registry (CCFRD), with occasional augmentation by chart review at the participating sites. Genetic data is linked to clinical data through an approved data access agreement with Cystic Fibrosis Canada. Part B of the CFF Therapeutics PROSPECT observational study 34 (https://clinicaltrials.gov/ct2/show/NCT02477319) in the US evaluated the effectiveness of LUM/IVA and collected buffy coat and clinical data from individuals with two copies of the Phe508del variant who were prescribed LUM/IVA. We received the PROSPECT study buffy coat and corresponding clinical data from the US CFF and extracted DNA for genotyping.

Phenotypes, sampling and inclusion criteria
Lung function severity in the absence of modulator treatment for participants in CGMS was measured by Saknorm, the survival adjusted average CF-specific Kulich FEV 1 percentiles that are normalized using 3 years of data in samples 6 years or older 5,31,35 . For participants whose recruitment year was after 2008, Saknorm is calculated using Canadian CFspecific reference equations from 2008 to 2014 63 , rather than US reference equations 64 . Saknorm enables comparison of CF lung disease across the age spectrum where there is differential mortality and is therefore used as a phenotype in CF genetic association studies. After quality control (QC), n = 89 participants with at least one gating mutation, n = 1266 with homozygosity for Phe508del and n = 63 with two MF alleles in trans are included in the analysis. Of these participants, 54 individuals with at least one gating mutation and 1013 who are homozygous for Phe508del were also included in a previous publication 31 (Fig. 1). CGMS participants prescribed IVA or LUM/IVA were also included in the treatment response study, together with the US PROSPECT participants for LUM/IVA (Fig. 1, Table 1). Similar to Strug et al. 31 , all participants included for modulator lung response analysis had a baseline measurement between 30 and 96 FEV 1pp measured within 3 months prior to, or on, the treatment initiation date. The CGMS data is obtained from the CCFRD with variable longitudinal entry depending on the resources of the individual clinics across the country while PROSPECT collected regular measurements at 1, 3, 6, and 12 months after treatment initiation. To account for this difference, the LUM/IVA treatment response is defined as the difference in FEV 1pp between the first visit on treatment within 5-7 months and that measured at baseline following convention in the literature 65 . For the IVA study, aligned with reported treatment responses seen by 15 days 10 , the difference between mean FEV 1pp within 15 to 400 days to FEV 1pp baseline 31 , as well as the difference between the longitudinal FEV 1pp within 15 to 400 days to FEV 1pp baseline are used. We also investigated an IVA analysis with treatment response defined as the difference from the first FEV 1pp within 15 to 60 days after the treatment initiation; the conclusion is comparable but because the sample size is reduced to 33, we report the mean treatment response and longitudinal treatment response in 15 to 400 days. After phenotype and genotype QC, 91 participants from PROSPECT and 104 CGMS participants were included in the genetic analysis for LUM/IVA and 45 for the IVA study (see Supplementary Table 1 for sample exclusion; Fig. 1).
We also investigated CFTR function in 46 CF Canada SickKids Program in CF Individualized Therapy 39 (CFIT) participants homozygous for Phe508del whose nasal epithelia were brushed, cultured to passage 2 (P2) and mounted in a circulating Ussing chamber. Thirty-six individuals who underwent brushing prior to modulator initiation and for whom we had rs7512462 genotype are included in the Ussing chamber analysis (Fig. 1).
A subset of the CGMS cohort (n = 82; Fig. 1) and 6 healthy controls had RNA from their nasal cells sequenced as part of CFIT. In addition, we have sequenced nasal cells for 9 CFIT samples cultured to passage 3 (P3) at twotime points (14-16 days and 26-30 days). We also sequenced the RNA from de-identified CF individuals who underwent lung transplantation, obtained from paired human bronchial and nasal epithelia; n = 17 independent pairs of uncultured primary HNE and HBE; and n = 16 cultured HNE and HBE pairs. The HBE samples were collected by bronchoscopy, using a bronchoscopic cytology brush to brush the bronchial airway lumen proximal to the anastomosis. HNE samples were collected by nasal brushing from the inferior turbinate using a 3 mm diameter sterile cytology brush (MP Corporation, Camarillo, CA).
Illumina Smart-Seq2 single-cell RNA data from three individuals were obtained from the Human Lung Cell Atlas 40 . The donors consisted of two males and one female, aged between 46 to 75 years. The samples were freshly resected lung tissues obtained during surgery with confirmed normal histology (except for very mild emphysema in some of the samples from one individual).
Summary association statistics from other GWAS phenotypes were obtained from the GWAS ATLAS 44 (See Methods subsection PheWAS data extraction and Colocalization with CF GWAS summary statistic). Association between rs7512462 and lung function in a non-CF population was assessed using the UK Biobank data under application #40946, or was investigated using summary statistics from Shrine et al. 33 when available.

Association analyses of rs7512462
To assess whether rs7512462 was associated with Saknorm in the different CFTR genotype groups, we carried out a stratified analysis, separately for those with gating variants (or the subset specifically with G551D), those homozygous for Phe508del and those with two minimal function (MF) variants, using both additive and recessive models. In each CFTR genotype subgroup, the association analysis used the R package 'rms' 66 to obtain a robust variance estimator and the linear regression included an indicator for which reference cohort was used to calculate the Saknorm phenotype. For individuals with at least one G551D variant, the rs7512462 association in CGMS is meta-combined with that from four previously published cohorts: a cohort from France 32 and 3 from the United States 36 . For CF individuals homozygous for Phe508del or for those with at least one gating mutation, the rs7512462 association results are also meta-combined with those reported in Corvol et al. 32 . Meta-analysis is implemented using inverse variance and sample size weighting with the R functions 'metagen' in the package 'meta' 67 and 'rma' in the package 'metafor' 68 , respectively and the R package 'forestplot' is used to generate forest plots 69 . To assess whether lung function is different between the three CFTR genotype groups, we regressed Saknorm on two indicators for the three CFTR genotype categories. Additionally, to assess whether the effects of rs7512462 genotype on Saknorm were different between the three CFTR genotype groups, we included a SNP-CFTR interaction term in the regression model and performed an F-test using the aov function. All analyses adjusted for the reference cohort used to calculate Saknorm and only two-sided p-values less than 0.05, and with the direction that the C allele is beneficial, are considered significant.
We also use multivariable regression with robust variance estimation to assess the association of rs7512462 with the modulator FEV 1pp response, where rs7512462 is coded recessively to align with what is observed in exploratory data analysis (Fig. 2). For the association with treatment response to IVA, covariates include age and FEV 1pp at baseline. To account for the variable follow-up days and the variable baseline measurement time prior to the treatment initiation, we also implement a linear mixedeffect model for participants on IVA and with follow-up measurements between 15 to 400 days using the R function 'lmer' in the package 'lme4' with a random intercept 70 . For treatment response to LUM/IVA, besides rs7512462, age and FEV 1pp at baseline, principal components (PCs) were also included to adjust for population structure, which were calculated from the PROSPECT or the combined CGMS and PROSPECT studies by the R function PC-AiR in the GENESIS package 71,72 using the kinship matrix estimated by KING 2.2.4 73 . The significant PCs were selected based on the Tracy-Widom test using the function twtable in POPGEN of Eigensoft 74 ; 7 PCs were included for the US PROSPECT analysis and 4 PCs for the US PROSPECT + CGMS combined analysis. Analysis of the Ussing chamber data to assess association of CFTR functional response to CFTR modulators with rs7512462 uses multivariable regression with a robust variance estimator, adjusted for a binary indicator of culture media. The boxplots are obtained using the functions ggplot and geom_boxplot from the package ggplot2 in R 75 and the function geom_jitter from ggplot2 is used to overlay the individual measurements in a stripchart.
Genetic association analyses for spirometry measures in the UK Biobank were conducted using imputed (v3) phenotypic data obtained from the UK Biobank 76,77 . In-house scripts, R package rbgen and C++ tool bgenix 78 , were used to index, subset and perform association analysis using imputed dosage 76 . COPD was defined according to the GOLD (levels 2 to 4) criteria of moderate to very severe airflow limitation 48 (FEV 1 /FVC ratio < 0.7 and FEV 1pp < 80%). Prior to analysis, spirometry and genotyping quality control (QC) were carried out as suggested in Shrine et al. 33 , with the exception of kinship and ethnicity analyses, where we opted to use KING's (v.2.0) -unrelated option 73 and the UKBB's UID 22006 for the identification of Europeans. This QC method is more conservative and yielded 263,461 participants compared to 321,047 participants in Shrine et al. 33 . We removed individuals with incomplete data for sex (UID = 22001), age (UID = 21022), height (UID = 50), and smoking status (ever/never, UID = 20160). FEV 1pp was calculated using the Global Lung Initiative (GLI) calculator using the best FEV 1 , age (UID = 21022), height (UID = 50), and sex (UID = 22001). From these, 22,071 participants fit the spirometrically-defined COPD criteria. Summary statistics for population-level GWAS of spirometry (FEV 1 / FVC ratio and PEF) were obtained from a meta-analysis of the UKBB and SpiroMeta Consortium 33 . Phenotypic analysis at the SLC26A9 locus included the best measures for PEF, FEV 1pp, and FEV 1 /FVC ratio. The best measure for FEV 1 (UID = 20150) and FVC (UID = 20151) were defined as the highest measure from the array of values of up to three blows (UID = 3063 and 3062, respectively) that were deemed acceptable according to blow quality metrics (UID = 20031): "blank", "ACCEPT", "BELOW6SEC ACCEPT" and "BELOW6SEC"; and a back-extrapolated volume (derived from the blow curve time series, UID = 3066) greater than 5% or less than 150 mL, as explained in Shrine et al. 33 . The best FEV 1 /FVC is calculated from the selected best FEV 1 and FVC after determining blow reproducibility within 250 mL from any other blow as explained in Shrine et al. 33 . Principal components were calculated using flashPCA2 v2.1 79 which is designed for Biobank-scale data. All association models included three principal components, sex, age, age 2 , sex × age, sex × age 2 , and smoking (ever/ never). All spirometry measures were inverse rank normal transformed prior to association analysis using the RNOmni R package's (v0.7.1) rankNorm function 80 . Association summary statistics from each of the lung function phenotypes were then used for colocalization analysis in LocusFocus v1.4.9 46 .
Phasing and haplotype analysis of CGMS in CF participants homozygous for Phe508del SHAPEIT version 4.2.0 81 was used to completely phase a multi-sample VCF from n = 447 CGMS participants sequenced using 10X Genomics linkedread genome sequencing technology (10XG; available at https://www. cysticfibrosis.ca/our-programs/cf-registry/requesting-canadian-cf-registrydata); so that it could be used as a reference panel for imputation in the region chr1:205903051-205953456 (GRCh38). Then SHAPEIT 4.2.0 81 was used again to phase the multi-sample VCF file from the imputed genotype data of the CGMS 1 using the completely phased VCF from 10XG as the reference panel. LiftoverVCF from picard tools (v2.18.0; https://gatk. broadinstitute.org/hc/en-us/articles/360037060932-LiftoverVcf-Picard-) was used to lift over the imputed VCF from GRCh38 to GRCh37.
To mimic this haplotype analysis we used imputed genotype data from the CGMS 1 in individuals homozygous for Phe508del (n = 1266). The imputed data included 40 of the 41 variants used by Lam et al. 37 , with the multi-allelic variant rs144469431 removed. After phasing them with the n = 447 10XG sequence data, we implemented two analyses. The first analysis used the same PLINK command as in Lam et al. 37 in an unrelated set of CGMS Phe508del/Phe508del participants (n = 1164). The second analysis was in a larger subset that included related individuals and conducts haplotype association using linear regression with a robust variance estimate that accounts for the relatedness (n = 1266), comparing each of the eight haplotypes studied in Lam et al. 37 to a comparison group that includes all others using an additive model.

Cell culturing
Cell culturing was carried out in the same manner as described previously 82 for HNE samples from 9 CFIT participants used to investigate expression differences with culturing time (14 versus 28 days); for the 16 paired cultured HNE and HBE samples used to investigate the difference in SLC26A9 expression across these model systems; and for 46 nasal brushes from individuals homozygous for Phe508del enrolled in the CGMS that were studied in Ussing chamber. Briefly, nasal epithelial cells were isolated and expanded to passage 1 from nasal brushes in the expansion media PneumaCult™ Ex (STEMCELL Technologies) containing 5 μM Rho Kinase inhibitor Y27632 (Selleck Chemicals) and an antibiotic cocktail (penicillin 100 units/mL, streptomycin 100 µg/mL, amphotericin 0.25 µg/mL, tobramycin 80 µg/mL, vancomycin 16 µg/mL, metronidazole 32 µg/mL, meropenem 8 µg/mL, septra (trimethoprim/sulfamethoxazole) 16/80 µg/mL, colistimethate 6 µg/mL). A subset of earlier enrolled samples for Ussing chamber study were cultured using an alternative media using an antibiotic cocktail including penicillin 100 units/mL, streptomycin 100 µg/ mL, and amphotericin 0.25 µg/mL. We refer to this alternative media as non-standard media in the analysis.
These expanded cells from homozygous Phe508del samples were seeded to collagen-coated Transwell inserts (6.5 mm diameter, 0.4 μm pore size, Corning) at a seeding density of 1 × 10 5 cells per well at passage 2 (P2). Upon confluency, the basolateral media was changed to differentiation media PneumaCult™ ALI (STEMCELL Technologies) containing penicillin 100 units/mL and streptomycin 100 µg/mL. The media was refreshed daily for 7 days then alternate days and any fluid collecting in the apical side was carefully aspirated until the cells reached approximately air-liquid interface (ALI) day 14 for Ussing studies.
Expanded cells for RNA sequencing were further expanded in PneumaCult™ Ex plus (STEMCELL Technologies) media containing the antibiotic cocktail (penicillin 100 units/mL, streptomycin 100 µg/mL, amphotericin 0.25 µg/mL, tobramycin 80 µg/mL, vancomycin 16 µg/mL, metronidazole 32 µg/mL, meropenem 8 µg/mL, septra (trimethoprim/ sulfamethoxazole) 16/80 µg/mL, colistimethate 6 µg/mL). Passage 3 cells were seeded in collagen-coated Transwell inserts (6.5 mm diameter, 0.4 μm pore size, Corning) at a seeding density of 1 × 10 5 cells per well. The cells were maintained in Pneumacult™ Ex plus media until the Transwells were fully confluent. Upon confluency, the media was changed to differentiation media PneumaCult™ ALI (STEMCELL Technologies). These cells were maintained in an air liquid interface by changing the basolateral media daily for a period of one week, following which the media was changed on alternate days for a period of 14 to 28 days. The 9 HNE samples to study the expression difference at different culture times were sequenced at two time points (14-16 days and 26-30 days) and the paired HNE and HBE samples were well-differentiated by 3 weeks before undergoing sequencing.
Ussing chamber studies with primary human nasal epithelial cells Cell monolayers from primary human nasal epithelial (HNE) brushes were mounted in non-perfused P2300 Ussing chambers containing Krebs Bicarbonate buffer (126 mM NaCl, 24 mM NaHCO 3 , 2.13 mM K 2 HPO 4 , 0.38 mM KH 2 PO 4 , 1 mM MgSO 4 , 1 mM CaCl 2 and 10 mM glucose). The buffer solution was maintained at pH 7.4 and 37°C and continuously gassed with a 5% CO 2 /95% O 2 mixture. Transepithelial voltage was recorded using a VCCMC6 amplifier (Physiologic Instruments, San Diego CA) in open-circuit mode and the baseline resistance was measured, following brief 1 µA current pulses every 30 s 83 to obtain calculated equivalent short-circuit currents (I eq ), which was calculated using Ohm's law. Passage 2 confluent cultures were treated with either 0.1% DMSO or CFTR modulators: acute application of Vx770 with 3 µM VX-809 (Selleckchem Cedarlane, Canada) or 3 µM VX809 + 1 µM of an experimental amplifier (Proteostasis Boston, USA; 42) added to the basolateral ALI media for 24-48 h 83 prior to Ussing experiments. CFTR function was assessed following inhibition of the epithelial Na + channel with amiloride (30 µM, Spectrum Chemical, Gardena, CA) and following cAMP activation with forskolin (10 µM, Sigma-Aldrich, US) in the above treated monolayers. Forskolin-stimulated currents mediated by CFTR were measured as change in current after application of forskolin (ΔI eq -forskolin; µA/cm 2 ). The genotype data from the CGMS enabled stratification of CFTR function by rs7512462 genotype to determine whether increased function correlated with genotype upon exposure to the drugs.

RNA sequencing, quality control, and analysis
The CF human nasal epithelia (HNE) cells were sequenced on the Illumina HiSeq 2500 platform (Illumina Inc. San Diego, California, USA) and aligned as described in Eckford et al. 39 . Expression counts were quantified using RNA-SeQC (ver. 2.0.0) and normalized to transcripts per million (TPM) 84  To compare the expression level for SLC26A9 across the different airway models, we calculated the TPM from naïve HNE for 82 CGMS participants and 6 healthy controls, 16 pairs of de-identified cultured HNE and HBE, 17 pairs of de-identified naïve HNE and HBE and 9 CFIT cultured HNE samples. eQTLs were calculated from 79 CGMS participants for whom both genotype and RNA sequence from naïve HNEs were available (Fig. 1). Quality control required TPM ≥ 0.1 and read counts ≥ 6 in greater than 20% of the sample to be analyzed. FastQTL (ver. 2.0) was used to conduct differential gene expression analysis of TMM-normalized read counts on SNP genotypes 86 . Covariates included the top 3 genotype principal components, 15 probabilistic estimation of expression residual (PEER) factors, sample study source, sex, genotyping platform, RNA integrity number (RIN) and PTPRC/CD45 gene expression which adjusts for immune cell composition in the samples. The genotype principal components and PEER factors were generated using R packages GENESIS 71,72 and PEER 87 , respectively. The average expression level of SLC26A9 was compared across HBE and HNE in both cultured and naïve paired tissues. Paired t-tests were conducted using the t.test() function in R v3.6.1 88 , based on TPM.
For the Human Lung Atlas single-cell data 40 , cellular expression profiles were clustered by nearest-neighbor for visualization. RNA count data and cell annotations provided in this dataset were used; counts were normalized (library-size corrected) to 1,000,000 reads per cell and logscaled. Scanpy v1.8.2 89 in Python 3.7 was used for the CFTR and SLC26A9 single-cell expression visualization of the Human Lung Cell Atlas data.
A subset of these samples with expression from more than 500 genes and greater than 50,000 total mapped reads were used for statistical modeling of CFTR-SLC26A9 co-expression. Library sizes were normalized using the TMM method. Log-transformed read counts per million (log-CPMs) were generated using the normalized library sizes and were used to calculate the Spearman's correlation between the CFTR and the SLC26A9 genes. Expression raw read count from CFTR was regressed on that of SLC26A9 in the zero-inflated negative binomial model analysis with a log link function. Normalized library size was adjusted using an offset term. Zero-inflation was modelled by the detection rate variable which is the proportion of genes with expression data in a cell. The effect size estimate indicates log-ratio relations between CFTR and SLC26A9 expression. The Spearman's correlation and the zero-inflated negative binomial model analysis were performed by R function cor.test() and the zeroinfl() function in the package pscl 90 , respectively.