Genome-wide bidirectional CRISPR screens identify mucins as host factors modulating SARS-CoV-2 infection

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) causes a range of symptoms in infected individuals, from mild respiratory illness to acute respiratory distress syndrome. A systematic understanding of host factors influencing viral infection is critical to elucidate SARS-CoV-2–host interactions and the progression of Coronavirus disease 2019 (COVID-19). Here, we conducted genome-wide CRISPR knockout and activation screens in human lung epithelial cells with endogenous expression of the SARS-CoV-2 entry factors ACE2 and TMPRSS2. We uncovered proviral and antiviral factors across highly interconnected host pathways, including clathrin transport, inflammatory signaling, cell-cycle regulation, and transcriptional and epigenetic regulation. We further identified mucins, a family of high molecular weight glycoproteins, as a prominent viral restriction network that inhibits SARS-CoV-2 infection in vitro and in murine models. These mucins also inhibit infection of diverse respiratory viruses. This functional landscape of SARS-CoV-2 host factors provides a physiologically relevant starting point for new host-directed therapeutics and highlights airway mucins as a host defense mechanism.

(TMPRSS2) on the cell surface, or in its absence by the lysosomal endopeptidase cathepsin L (CTSL) following clathrin-mediated endocytosis [6][7][8] . The TMPRSS2-mediated cell-surface entry route is considered dominant for SARS-CoV-2 in lung epithelial cells because inhibition of TMPRSS2, but not CTSL, in primary lung epithelial cells is sufficient to inhibit viral infection 9 . Further, ACE2 and TMPRSS2 are largely co-expressed by the main cellular targets of SARS-CoV-2 in vivo, such as epithelial cells within the lower and upper airway, the nasal passage and the gut 10,11 .
Each step of the viral life cycle is influenced, either positively or negatively, by a vast array of host factors. Although recent loss-of-function (LOF) screens have begun defining host factor requirements for SARS-CoV-2 infection, these studies employed host gene knockout (KO) approaches either in nonepithelial cell lines or in cell lines that do not endogenously express ACE2 and TMPRSS2 (refs. 8,[12][13][14][15][16]. Although LOF screens can be powerful for the identification of proviral genes, gain-of-function (GOF) screens can identify antiviral factors that mediate viral restriction upon upregulation. Performing screens in a bidirectional manner can therefore illuminate host pathways with bimodal roles and provide a more comprehensive view of viral dependencies and potential targets for host-directed therapeutic development. We therefore set out to define the host-pathogen interactions required for facilitating or restricting SARS-CoV-2 infection in Calu-3 cells, a human lung cell line endogenously expressing both ACE2 and TMPRSS2.
We conducted genome-scale LOF and GOF CRISPR screens to generate a systematic functional map of host dependencies and host restriction factors. Pathway analysis and secondary validation of top screen hits revealed diverse cellular components involved in modulating cellular proliferation, intercellular junctional complexes, the cytoskeleton, inflammatory signaling and mucin glycoproteins. The gene hits identified in our bidirectional dataset are differentially expressed in single-cell RNA sequencing (scRNA-seq) datasets of lung epithelial cells from healthy individuals and COVID-19 patients, underscoring their physiological relevance.
We further demonstrate an antiviral role for membrane-anchored mucins in vitro and in mouse models of SARS-CoV-2 infection, and find that this antiviral activity extends to diverse respiratory viruses. Taken together, our bidirectional CRISPR screens dissect the densely interconnected landscape of proviral and antiviral host factors, highlighting mucins as potential restriction factors of SARS-CoV-2 infection.

Identification of proviral and antiviral SARS-CoV-2 host factors
To systematically define proviral and antiviral host factors in a physiologically relevant cellular context, we conducted both LOF and GOF CRISPR screens using virus-mediated cell death as a functional readout (Fig. 1) 17 . We utilized a human lung epithelial cell line (Calu-3) that expresses ACE2 and TMPRSS2, is highly permissive to SARS-CoV-2 infection, and exhibits a cytopathic effect (CPE). We first transduced Calu-3 cells with constructs encoding Cas9 nuclease (LOF) or the synergistic activation mediator transcriptional activators dCas9-VP64 + PP7-P65-HSF1 (GOF), followed by lentiviral delivery of CRISPR guide libraries at low multiplicity of infection (MOI) (Fig. 1a) 18,19 . We confirmed that our SARS-CoV-2 isolate possessed a functional furin cleavage site and infected cells in a manner dependent on TMPRSS2 (and independent of CTSL) in this cell line (Extended Data Fig. 1). Our LOF and GOF screens were performed with three and four biological replicates, respectively, while maintaining greater than 1,000× single guide RNA (sgRNA) coverage (Extended Data Fig. 2). Following SARS-CoV-2 infection, genomic DNA was harvested after at least 70% CPE was observed (~5 d post infection). Next-generation sequencing of sgRNAs enabled identification of genes enriched or depleted relative to uninfected control cells (Supplementary Tables 1-3). We interpreted genes emerging from the LOF-enriched (Fig. 1b) and GOF-depleted (Fig. 1c) categories to be proviral, and genes from the GOF-enriched category (Fig. 1d) to be antiviral. We omitted further analysis of the LOF-depleted genes because of overlap of gene hits with essential genes, which would confound our analysis.
The top gene from both the LOF-enriched and GOF-depleted screens was ACE2, the SARS-CoV-2 cell entry receptor, validating the phenotypic selection of the host factor screens. TMPRSS2 was also highly enriched in our LOF screen, further confirming that SARS-CoV-2 infection of Calu-3 cells is dependent on the dominant SARS-CoV-2 S glycoprotein-priming mechanism in lung epithelial cells 9,20 . Other top proviral gene hits included AP1G1 (encoding a clathrin adapter) and CHUK (encoding IKK-alpha of the nuclear factor-κB (NF-κB) signaling pathway) in the LOF-enriched analysis, as well as NFE2 (encoding a transcription factor) and TRAF3IP2 (encoding an E3 ubiquitin ligase involved in interleukin 17 and NF-κB signaling pathways) in the GOF-depleted approach (Fig. 1b,c) [21][22][23][24][25][26] . The NF-κB pathway is commonly activated by viral infections and is known to regulate proinflammatory cytokine production, whereas the clathrin-mediated endocytosis pathway is directly involved in SARS-CoV-2 entry. Interleukin 17 signaling may be directly proviral by establishing a cellular environment conducive to viral replication or may negatively affect the host cell by promoting cell death [27][28][29][30][31] .
On the antiviral side, our GOF screen identified TEAD3, CCNE1 and ZNF275 as the top three enriched genes (Fig. 1d and  Supplementary Table 3). TEAD3 encodes a transcription factor involved in the transforming growth factor-β and Hippo signaling pathways, which can regulate cell proliferation 32 . Hippo signaling is activated by diverse stimuli including viral infection and is regulated through kinases such as the LOF-enriched hits TAOK1 and TAOK2 (Fig. 2a,c) 33 . CCNE1 encodes cyclin E1, a regulatory subunit of cyclin-dependent kinase 2, which is required for the G1/S cell-cycle transition 34 . Because numerous studies have reported cell-cycle arrest as a requirement for optimal viral replication, enhanced cell proliferation resulting from CCNE1 overexpression is likely refractory to SARS-CoV-2 replication [35][36][37] . In fact, the SARS-CoV-1 N protein has been previously shown to inhibit CCNE1, suggested to be a strategy of viral-mediated cell-cycle arrest to route cellular resources toward viral replication 38 . Finally, ZNF275 is a zinc-finger protein that may be involved in transcriptional regulation 39

SARS-CoV-2 host dependency pathways and interaction networks
To gain systematic insight into critical proviral host pathways and interactions for SARS-CoV-2 infection, we conducted two analyses of our top 100 enriched and depleted hits from our LOF and GOF screens, respectively. First, we performed a protein-protein interaction network analysis to define putative interactions and group hits into distinct interaction clusters based on both direct and indirect connections ( Fig. 2a and Extended Data Fig. 3a). Next, we performed gene set enrichment analysis to identify enriched biological pathways ( Fig. 2b and Extended Data Fig. 3b). We identified apoptotic signaling pathways that are expected to be directly involved in mediating virus-induced CPE, as well as canonical interferon signaling that could impact CPE through modulation of cell proliferation or cell death pathways. We also found components of NF-κB-mediated inflammatory signaling, cell-cell junctional complexes, cytoskeletal remodeling, adapter-mediated clathrin transport and cell-cycle regulation as enriched in our proviral screens across highly interconnected networks in Calu-3 cells ( Fig. 2a and Extended Data Fig. 3a).
To confirm the impact of specific genes on SARS-CoV-2 infection, we generated individual Calu-3 KO lines with two distinct guides per gene for the top five LOF-enriched hits and GOF-depleted gene hits-genes predicted to be critical for SARS-CoV-2 infection-and confirmed successful gene editing (Extended Data Fig.  4a,b). Assaying these cell lines using a median tissue culture infectious dose (TCID 50 ) assay, we observed reduced infectious virus production when ACE2, TMPRSS2, AP1G1, CHUK, ROCK1, RIPK4 and TAOK2 genes were disrupted (Fig. 2c). The observed proviral effects of RIPK4 and CHUK suggest that certain components of NF-kB signaling are beneficial for SARS-CoV-2 infection and may be actively regulated by the virus to promote a viral replicative niche. It has been previously observed that NF-kB pathways can have both proviral and antiviral roles, and can be actively regulated and rerouted by other coronaviruses and influenza A 30,40 .
For validation of our top five GOF-depleted hits, lentiviral transduction of CRISPR activation components confirmed that upregulation of ACE2, NFE2, TRAF3IP2 and TRDMT1 increased viral infection to varying levels relative to the nontargeting guide (NTG) control (Extended Data Fig. 3c,d). By contrast, COLQ upregulation had no impact on viral replication in our TCID 50 assay (Extended Data Fig. 3d). Additional GOF-depleted hits including MEX3B, APOL1 and CDKN2B also increased viral infection compared with NTG control cells (Extended Data Fig. 3e,f). As expected, western blot analysis confirmed successful overexpression of ACE2 (Extended Data Fig. 3g). CDKN2B encodes for the cyclin-dependent kinase inhibitor 2B, which controls the progression from G1 to S phase. Negative regulation of cell proliferation was the top pathway found among the depleted genes of the GOF screen (Extended Data Fig. 3b), supported by previous observations that the nucleocapsid protein (N) of SARS-CoV-1 actively inhibits cyclin-dependent kinases to arrest the cell cycle 38 .
One of our top hits confirmed by this analysis, AP1G1, encodes a subunit of the clathrin adapter complex AP1, and our proviral screens further identified two other subunits of AP1 (AP1M2, AP1B1) as well as known direct interaction partners (HEATR5B, AAGAB) (Fig. 2a). We hypothesized that AP1 components are required for SARS-CoV-2 cellular entry, given the known role of clathrin-mediated endocytosis as an entry route. To confirm this, we used vesicular stomatitis virus (VSV) encoding green fluorescent protein (GFP) and pseudotyped with SARS-CoV-2 S protein (VSV-CoV-2-S) to infect cells lacking AP1G1, AP1M2 or AP1B1, and observed attenuated viral infection relative to a NTG control (Extended Data Fig. 4c,d). By contrast, these AP1 components were not required for infection of VSV pseudotyped with the rabies virus glycoprotein (VSV-RABV-G), suggesting a specific requirement for AP1 components for SARS-CoV-2 entry (Extended Data Fig. 4e,f).

unique SARS-CoV-2 dependencies in lung epithelial cells
We next conducted a comparative analysis of our LOF screen hits with other recent LOF studies of SARS-CoV-2 proviral host factors in different cell lines 8,[13][14][15][16] . Other than ACE2, we noticed surprisingly low overlap in the top 100 hits, with only between zero and four genes overlapping in pairwise comparisons (Fig. 3a). Comparing across all screens, we found that the cell type chosen for infection is the dominant factor determining screen hit overlap, rather than differences in viral strain, MOI or timeline of genomic DNA harvest. A large cluster of gene hits enriched across at least three other screens-but not our Calu-3 screen-include genes encoding vacuolar-associated proteins important for endolysosomal trafficking (CTSL, ATP6AP1, GNPTAB, VAC14, WDR81, GDI2 and CCZ1B) (Fig. 3b). By contrast, members of the AP1 clathrin adapter complex were uniquely identified in our screen as well as in another recently reported Calu-3 screen (Fig. 2a) 41 . This discrepancy likely highlights differences between the CTSL-and TMPRSS2-dependent entry and trafficking routes of SARS-CoV-2. Taken together, host factors regulating endosomal maturation and CTSL function may be dispensable for SARS-CoV-2 infection when TMPRSS2 is present.
To assess the physiological relevance of our screen hits, we conducted comparative analyses of two distinct scRNA-seq studies: a dataset of human bronchoalveolar lavage fluid (BALF) from COVID-19 patients and controls 42 and a recent meta-analysis of healthy human lung cell types 43 . First, we identified differentially expressed genes in lung epithelial cells of COVID-19 patients relative to controls and compared these with the top 100 ranked genes from the screens described above. A greater fraction of our Calu-3 screen hits was expressed at higher levels in COVID-19 patients, compared with gene hits from other screens (Fig. 3c). These include genes involved in inflammatory responses (B2M, TAOK1) and cell proliferation (ELF3, TPT1 and NUPR1).

Analysis of host restriction factors against SARS-CoV-2
Our GOF screen gives us the opportunity to systematically investigate host antiviral factors: genes that restrict viral replication when overexpressed. Pathway and protein interaction network enrichment analysis highlighted inflammatory signaling, G-protein-coupled receptor (GPCR) signaling, transcriptional regulation, cell-cycle regulation and mucin glycosylation as key antiviral pathways emerging from our screen results (Fig. 4a,b). We selected our top five enriched GOF hits as well as a representative set of other hits across these pathways, including genes encoding the G1-S checkpoint regulator CCNE1, diverse transcriptional regulators (TEAD3, ZNF275, SPDEF, JDP2, TAF7L, ZNF248, MRGBP), host helicases (DDX28), ion channels (TMEM206, SLC44A2), modulators of cell-cell junctions (DOCK4) and membrane-binding proteins (CPNE3) (Extended Data Fig. 5a). TCID 50 assays of these GOF Calu-3 cell lines infected with SARS-CoV-2 demonstrated reduced viral titers compared with NTG controls to differing extents, confirming an antiviral role for many of these hits. However, upregulation of ZNF275 had no significant impact on viral load ( Fig. 4c and Extended Data Fig. 5b-d). We additionally characterized the extent of target upregulation using quantitative PCR with reverse transcription (RT-qPCR) (Extended Data Fig. 5a), which confirmed significant upregulation in all cases except for guide 1 for SPDEF and TEAD3. Successful SPDEF-and TEAD3-mediated viral restriction by TCID 50 assay for these guides in the absence of a detectable increase in gene expression may be explained by the later RT-qPCR time point relative to the TCID 50 assay, as the lentivirally integrated GOF components may have been silenced.
The antiviral effect of upregulating CCNE1, encoding a protein that drives cells into S phase, is consistent with the observation that upregulation of CDKN2B, encoding a cell-cycle inhibitor, increases viral replication (Extended Data Fig. 3f). Coronavirus regulation of the cell cycle by modulating interactions between cyclins and CDKs has been previously shown for SARS-CoV-1, murine hepatitis virus, porcine epidemic diarrhea virus and infectious bronchitis virus, and is a consistent strategy for rerouting host resources toward the coronavirus lifecycle 31 . TEAD3 and JDP2 may also regulate cell proliferation in ways that are detrimental to the virus 47,48 . Another GOF-enriched transcription factor, SPDEF, is critical for differentiation of goblet cells, a specialized cell type involved in mucus secretion 49 . Error bars denote mean ± s.e.m., n = 3 biological replicates. The dotted lines indicate the LOD of the assay as well as the NTG average. Significance was determined with individual two-sided t-tests between each GOF line and NTG. NS, not significant; *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001. Several of our other GOF-enriched hits are known to interact with SARS-CoV-2 viral proteins. The DDX28 helicase interacts with the SARS-CoV-2 N protein and was previously reported to be a protective factor in an RNA interference screen for West Nile virus host factors 50,51 (Extended Data Fig. 5c). CPNE3, a phospholipid-binding protein, has been implicated as an interaction partner with SARS-CoV-1 nsp1 and viral RNA 52,53 (Extended Data Fig. 5d). The transcription factor IRF5 has a well-established antiviral role in interferon signaling (Extended Data Fig. 5b). Other interferon-stimulated genes identified in the GOF screen were also predicted to result in an antiviral state, including the lymphocyte antigen 6 family member (LY6E), previously reported to antagonize SARS-CoV-2 entry, a ubiquitin-specific peptidase (USP18) and the putative mitochondrion-localized dsRNA-sensor ZNFX1 (Fig. 1d) [54][55][56] .
Finally, we identified mucin glycosylation as a prominent antiviral pathway impacting SARS-CoV-2 infection (Fig. 4a,b). Mucins comprise a family of high molecular weight, heavily O-glycosylated glycoproteins and are the primary constituent of mucus lining the epithelial tract of the lungs and gut 57 . Membrane-anchored mucins (MUC1, MUC4, MUC13, MUC21) form a large, interconnected network in our GOF-enriched protein network analysis, along with the acetylglucosaminyltransferase (B3GNT6) and cell-surface proteins CD44 and ICAM1 also implicated in our screens (Fig. 4a). We decided to investigate membrane-anchored mucins further because of their lung localization, significance of enrichment as an antiviral pathway and lack of previous studies in the context of SARS-CoV-2 infection.

Membrane-anchored mucins restrict infection of SARS-CoV-2
To confirm that upregulation of the membrane-tethered mucins is sufficient to reduce SARS-CoV-2 infection, we generated and confirmed individual GOF cell lines for MUC1, MUC4, MUC13 and MUC21 in Calu-3 cells. We also produced a GOF line for the transmembrane glycoprotein CD44, which interacts with MUC1 and can be variably spliced to contain a mucin-like domain (Extended Data Fig. 6a-c) 58,59 . We found that at least one guide per gene in these GOF cell lines exhibited reduced SARS-CoV-2 viral loads compared with NTG controls (Fig. 5a,b).
To determine whether endogenous levels of membrane-anchored mucins could modulate SARS-CoV-2 infection, we treated parental Calu-3 cells with secreted protease of C1 esterase inhibitor (StcE), a bacterial protease that selectively cleaves mucins 60 . We confirmed the specificity of StcE digestion using western blot and flow cytometry for decreased levels of CD44 and MUC1, and showed depletion of StcE binding sites by treatment with a fluorescently labeled enzymatically inactive form of StcE ( Fig. 5c and Extended Data Fig. 6c-f) 58 A two-tailed t-test was performed between mucin WT and KO groups, n = 8 mice. f. RNA-ISH probing for SARS-CoV-2 RNA on paraffin-embedded lung sections. Shown are representative lungs from n = 8 mice. Scale bars, 2 mm (upper) and 500 µm (lower). DAPI, 4,6-diamidino-2-phenylindole. g, Quantitation of f. A two-tailed t-test was performed between mucin WT and KO groups, n = 8 mice. h, Quantitation of infectious viral particles per lung lobe by plaque assay from n = 8 mice. A two-tailed t-test was performed between mucin WT and KO groups. Error bars denote mean ± s.e.m. NS, not significant; *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001. untreated controls, indicating that lung cells lacking endogenous mucins are more permissive to viral infection (Fig. 5d,e).
We next assessed mucin gene expression in RNA-seq datasets of lung tissues to determine whether membrane-anchored mucins are regulated in response to SARS-CoV-2 infection. Across six different model systems in addition to postmortem human lung tissue, we observe consistent upregulation of mucins from 2 days up to 2 weeks postinfection ( Fig. 5f)  replicates. e, Pseudotype entry assay measuring GFP + cells at 8 h postinfection comparing infection rates between NTG or mucin-overexpressing cells with and without StcE pretreatment. The full dataset is displayed in Extended Data Fig. 9. n = 5 biological replicates; two-tailed t-tests were performed comparing indicated pairs. The dotted line indicates the NTG average for two separate guides, with and without StcE treatment. f, Schematic depicting a SARS-CoV-2 binding assay. g, RT-qPCR of relative levels of viral N gene copies following a virus binding assay conducted on Calu-3 cells at an MOI of 2.5 with or without StcE treatment. Significance comparing StcE-treated cells versus untreated cells was determined by two-sided t-test, n = 6 biological replicates. Error bars denote mean ± s.e.m. NS, not significant; *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001.
to CD44 (Fig. 5g). Binning individual cells based on the presence (vRNA+) or absence (vRNA-) of SARS-CoV-2 RNA, we show that MUC1 was expressed at lower levels in SARS-CoV-2-positive versus SARS-CoV-2-negative epithelial progenitor cells in COVID-19 patients, consistent with a protective role ( Fig. 5h and Extended Data Fig. 7). Overall, these gene expression profiles suggest mucin upregulation is a broad antiviral host response across multiple species.
Next, we asked whether membrane-anchored mucins were protective against SARS-CoV-2 infection in vivo. Given that MUC1 and MUC4 emerged from our GOF screens, but not our LOF screens, and that these two mucins are both highly expressed in mouse and human lung epithelial cells (Fig. 5g), we reasoned that their roles in restricting SARS-CoV-2 infection may exhibit redundancy. To investigate multiplexed membrane-anchored mucin KOs, we infected a triple membrane-anchored mucin KO mouse (Muc 1 −/− /Muc4 −/− /Muc16 −/− ) with 10 3 plaque-forming units (p.f.u.) of a mouse-adapted strain of SARS-CoV-2 (MA10). We compared viral loads in the lungs of these mice with wild-type (WT) control animals, harvesting lungs at 2 d postinfection (Fig. 6c). Immunohistochemistry (IHC) staining for SARS-CoV-2 N protein and RNA in situ hybridization (RNA-ISH) targeting SARS-CoV-2 RNA revealed greater levels of viral antigen (Fig. 6d,e) and viral RNA (Fig. 6f,g) in the lungs of triple mucin KO mice, which also displayed higher viral titers compared with WT (Fig. 6h). Consistent with our in vitro data, membrane-tethered mucins are protective against SARS-CoV-2 infection in vivo.

Mucins modulate cell entry of SARS-CoV-2
We next sought to identify the stage of the viral life cycle that is affected by mucin upregulation. Because all four mucins enriched in our GOF screen are expressed at the host cell surface, we hypothesized that they reduce SARS-CoV-2 entry. Indeed, live imaging of cells infected with our VSV-CoV-2-S pseudotype virus demonstrated that CRISPR-mediated overexpression of all tested membrane-tethered mucins, including CD44, inhibited spike-mediated viral entry relative to NTG control, in agreement with our viral titer validation data (Fig. 7a,b). The StcE-mediated digestion of extracellular mucins in MUC4 and MUC1 GOF as well as NTG cell lines led to a dramatic increase in VSV-CoV-2-S infection compared with untreated cells, indicating that mucin removal renders cells more permissive to VSV-CoV-2-S infection, consistent with our in vivo data (Fig. 7c,e and Extended Data Fig. 9a). By contrast, endogenous expression or CRISPR-mediated overexpression of MUC4 in Calu-3 cells did not have an effect on VSV-RABV-G infection, suggesting that virus-specific interactions may govern the antiviral effect of mucins during viral entry (Fig. 7d). Given previous literature reporting a steric hindrance role of membrane-anchored mucins in inhibiting Influenza A virus infection 65 , we hypothesized that membrane-anchored mucins block binding of SARS-CoV-2 virions to cells. To test this, we conducted a SARS-CoV-2 binding assay on StcE-treated or control Calu-3 cells, which revealed a significantly higher level of bound virions to StcE-digested compared with undigested cells (Fig. 7f,g). Consistent with this steric hindrance hypothesis, we found that MUC4 GOF cells had a denser glycocalyx layer compared with control cells, as measured by cell-surface exclusion of different sized, fluorescently labeled dextran probes (Extended Data Fig. 8b-d). Taken together, our data indicate that endogenous and upregulated membrane-tethered mucins restrict SARS-CoV-2 entry, specifically at the step of initial cell binding.
Mucins are divided into two distinct classes: membrane-anchored mucins such as MUC1 or MUC4; and secreted, gel-forming mucins such as MUC5AC and MUC5B. Although diverse membrane-anchored mucins emerged from our GOF-enriched screen and were validated to restrict SARS-CoV-2 infection in vitro and in vivo, MUC5AC emerged as a hit from our GOF-depleted screen, suggesting a proviral role. To test the effect of MUC5AC in modulating SARS-CoV-2 infection, we generated MUC5AC and MUC5B GOF Calu-3 cells and validated their expression at messenger RNA and protein levels (Extended Data Fig. 9a,b). By contrast to membrane-anchored mucins, these lines did not show any protective effect against SARS-CoV-2 infection. Consistent with its depletion in our GOF screen, MUC5AC but not MUC5B upregulation increased susceptibility to the SARS-CoV-2 WA/1 clinical isolate and all tested variants except for B.1.429 (Extended Data Fig. 9c-f). The contrasting phenotypes of gel-forming and membrane-anchored mucins highlight the complex roles of mucins during SARS-CoV-2 infection.

Mucins modulate infection of diverse respiratory viruses
Finally, we investigated whether the antiviral effect we observed for membrane-anchored mucins against SARS-CoV-2 would extend to other respiratory viruses. First, we tested the effect of MUC1 and MUC4 GOF on the close relatives MERS-CoV and the bat coronavirus HKU5 pseudotyped with SARS-CoV-1 Spike (HKU5-SARS-COV-1-S). Although SARS-CoV-2 infection was restricted by overexpression of both MUC1 and MUC4, we found that MUC4 GOF had a more robust impact on HKU5-SARS-COV-1-S infection compared with MUC1, whereas the converse was true for MERS-CoV infection (Fig. 8a-c). Further, removal of endogenous mucins by StcE digestion had little impact on HKU5-SARS-CoV-1-S and MERS-CoV infection relative to undigested control cells (Fig. 8d). Extending this analysis to more diverse respiratory viruses, we show that mucin removal led to an increase in infection for influenza virus PR8, consistent with previous reports, as well as human coronavirus 229E (HCoV-229E) and human parainfluenza virus PIV3 (refs. 65,66 ). By contrast, no effect was observed on infection with HCoV-OC43, and a surprising reduction in infection was seen for respiratory syncytial virus (RSV) (Fig. 8e) 65,66 .
The protective function of mucins against 229E and PIV3 infection was further confirmed for MUC1-and MUC4-overexpressing cells (Fig. 8f-h and Extended Data Fig. 10a). Consistent with our observations for SARS-CoV-2, secreted mucins MUC5AC and MUC5B did not protect against infection for most viruses, with the exception of MERS-CoV (Extended Data Fig. 10b-g). Taken together, these data indicate an important role for mucins across respiratory viruses, with unique functions in promoting or restricting viral infection dependent on the mucin and virus, highlighting the complexity of host-pathogen interactions between respiratory viruses and lung mucins.

Conclusions
Here, we report the first genome-scale GOF screen exploring host factors influencing SARS-CoV-2 infectivity, in addition to the first LOF screen that exploits canonical ACE2-and TMPRSS2-mediated viral entry into lung epithelial cells (Supplementary Tables 1-3). The combination of bidirectional genome-scale screens enables the assignment of proviral or antiviral roles for individual genes into a systematic functional catalog of unique host factor dependencies. We further observed their robust enrichment in scRNA-seq datasets of healthy and infected human lung epithelium. Finally, our mechanistic investigation into the role of mucins during respiratory virus infection reveals a restrictive role of viral entry of multiple viruses by membrane-tethered mucins. By dissecting the interactions between SARS-CoV-2 and lung epithelial cells, these host factor screens provide a starting point for host-directed intervention and new antiviral therapeutic strategies.

Online content
Any methods, additional references, Nature Research reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/ s41588-022-01131-x.
For the LOF screen, the Cas9 knock-in cell line transduced with Brunello sgRNAs were seeded at 3.5 × 10 7 cells per T225 flask and allowed to grow to 70% confluency. At this point, half of the cells were harvested for a day 0 (D0) time point to serve as a reference for sgRNA enrichment analysis. The remaining cells were infected with the SARS-CoV-2 USA/WA-1 isolate at an MOI of 0.05. Cells were incubated for 4 d until >70% CPE was apparent. Throughout the infection, the medium was replaced every other day with R10 + HGF. For the GOF screen, dCas-VP64 knock-in cells transduced with Calabrese sgRNAs were treated similarly except they were infected at an MOI of 0.05 for 5 d until >70% CPE was apparent. At 5 d postinfection, surviving cells were uplifted and lysed according to the manufacturer's instructions using a genomic DNA extraction kit, followed by heat inactivation (Zymo Research).
Next-generation sequencing. Guide RNA cassettes were amplified from extracted genomic DNA to generate Illumina sequencing libraries. Namely, 3 µg of genomic DNA was added per 50 µl PCR reaction using staggered primers to increase base diversity. PCR products were then pooled and purified using QIAquick PCR purification kits (Qiagen). Different libraries were then quantified with Kappaquant to determine relative concentrations of amplified product, then pooled to match concentrations. Pooled samples were sequenced by Illumina Novaseq. In addition to the uninfected D0 and infected time points, the lentiviral plasmid prep was also sequenced to assess any guide distribution skew resulting from transduction.

Computational analysis of CRISPR screens.
A previously published method for CRISPR screen analysis, MAGeCK 68 was used to rank genes based on redundant targeting guides via robust rank aggregation (RRA). Our LOF screen was performed with three biological replicates and our GOF screen with four biological replicates maintaining >1,000× sgRNA coverage. sgRNA enrichment and depletion were assessed in infected versus D0 uninfected samples for each paired sample using the paired flag. Each set of top 100 screen hits was determined based on MAGeCK RRA rank and the false discovery rate (FDR) in a negative control analysis. To investigate the relationships within each of these sets, we performed protein-protein interaction networks functional enrichment analysis with STRING 69 . Any genes that exhibited depletion or enrichment in the top 500 hits between the plasmid library and the uninfected control were removed from STRING analysis because they are likely confounded by affecting cellular growth or survival unrelated to viral infection. Nodes were resized based on MAGeCK scores. Specifically, the radius (r) of each node was calculated as a constant factor (c) scaled by 1.5 raised to the z-score (z) of the corresponding gene's negative log MAGeCK score: r = c × 1.5 z . z was calculated from the distribution of the negative log MaGeCK scores in the relevant top 100 gene set.

Secondary validation with individual guide RNAs.
Lentivirus for Cas9 KO was produced in six-well plates. In brief, low-passage HEK293FT cells were grown in DMEM supplemented with 10% FBS (D10 medium) and passaged using TrypLE (Gibco). For viral plasmid transfection, polyethylenimine 'Max' (PEI, linear, M r 40,000, VWR) at a concentration of 1 mg ml −1 and pH 7.1 was used. PEI (0.71 μg PEI per cm 2 cell culture area) was mixed with 1 ml of DMEM. pMD2.G, PAX and target-guide plasmid (0.178 μg cm −2 , at a pMD2.G/psPAX/guide plasmid ratio of 0.25:0.5:1 were mixed with 1 ml of DMEM. Following 10 min of incubation at room temperature, the DNA and PEI mixtures were mixed and incubated for 30 min at room temperature. Approximately 150,000 cells per cm 2 of 293FT cells in suspension were mixed with the PEI-DNA DMEM mixture, incubated for 5 min at room temperature, then plated in D10. After 48 h, the viral supernatant was harvested, centrifuged at 800xg for 5 min to remove cell debris, and used to transduce 800,000 Cas9-expressing Calu-3 cells. 24 h later, cells were selected with blasticidin. Lentivirus for CRISPR activators was produced as described above, scaled to T75 flasks. Clarified lentiviral supernatant was concentrated using LentiX Concentrator (Takara Bio). The concentrated aptamer guide RNA virus was used to transduce five million Calu-3 cells expressing dCas9-VP64 before blasticidin selection the following day.

Calu-3 validation infections and TCID 50 assay.
A total of 2 × 10 5 Calu-3 cells were seeded into 24-well plates and infected with SARS-CoV-2 48 h post-seeding at an MOI of 0.05. Viral inoculums were incubated with cells for 30 min at 37 °C, at which time they were removed, and cells were washed once with 1× PBS. Regular media R10 + HGF was then replaced, and cells were incubated for the times indicated in the figures (24-48 h postinfection). Plates were then frozen to lyse cells and thawed when ready to quantify virus by TCID 50 . To titer infectious viral particles, plates were thawed and viral lysates were serially diluted; each dilution was applied to eight wells in 96-well plates containing Vero E6 cells. Three days later, CPE was determined visually, and TCID 50 per ml was calculated using the dilution factor required to produce CPE in half of the wells (4/8) for a given dilution.

SARS-CoV-2 stock preparation and infections.
The USA-WA-1/2020 strain of SARS-CoV-2 used was obtained from BEI Resources. The original stock from BEI was passaged through a 0.45 µM syringe filter and then 5 µl was inoculated onto 80% confluent T175 flasks (Nunc) of Vero E6 cells to produce our p1 stock. The CPE was monitored daily, and flasks were frozen when cells exhibited ~70% CPE, ~48-72 h postinfection. Lysates were then thawed, collected and cell debris was spun down at 3,000 r.p.m. for 20 min. The clarified virus-containing supernatants were then aliquoted and infectious viral particles were quantified with a TCID 50 assay. To produce p2 working SARS-CoV-2 stocks, 5 µl of the p1 stock was inoculated onto 80% confluent T175 flasks of Vero E6 cells as described above. SARS-CoV-2 variant viral stocks were produced as described previously 70 . In brief, the B.1.351 isolate was derived from a nasal swab at Stanford Hospital. Nasal swab medium was inoculated onto Vero E6 cells in the BSL-3 facility at Stanford University. Vero E6 cells were monitored for CPE and harvested as above for p0 stocks. p1 and p2 stocks were generated as above. Evaluation of antiviral activity of chemical compounds was conducted as described previously 71 . In brief, cells were infected in 384-well plates at a MOI of 0.05 in a total volume of 6 μl per well. Cells were harvested and analyzed using CellTiter-Glo 2.0 reagent (Promega) once complete CPE was observed in dimethylsulfoxide-treated infected wells (96 h postinfection for Calu-3). Luminescence was read on a Spectramax L (Molecular Devices). Compound activity was measured in infected cells by comparing ATP levels in dimethylsulfoxide-and compound-treated cells. Cell viability was determined by comparing ATP levels in uninfected cells treated with dimethylsulfoxide and compounds.

Production and infection of Vero/furin cells.
To determine whether our SARS-CoV-2 viral stocks remained responsive to furin after passage in Vero E6 cells, we produced Vero E6 cells overexpressing the human furin gene. To this end, Vero E6 cells were transduced with lentivirus encoding the human furin gene. Cells were then selected with puromycin for three passages in 10 μg ml −1 puromycin. Furin-overexpressing cells (Vero/furin) along with control transduced cells (Vero), were then infected with SARS-CoV-2 at an MOI of 0.05 and TCID 50 was determined on whole-cell lysates 24 h postinfection. To further confirm that our viral stocks were responsive to furin cleavage, we also infected HPMEC/ ACE2 (described in Biering et al. 71 ), with SARS-CoV-2 at an MOI of 0.05. At 24 h postinfection syncytia (cell-cell fusion) was apparent and evaluated by phase contrast microscopy.
SARS-CoV-2 stock sequencing. Viral RNA was extracted from working stocks with the Qiagen RNA Easy Minikit (catalog no. 74104). Library preparation was performed by the Functional Genomics Laboratory (FGL), a QB3-Berkeley Core Research Facility at UC Berkeley. Total RNA samples were checked on Bioanalyzer (Agilent) for quality, and only high-quality RNA samples (RNA Integrity Number > 8) were used. At the FGL, SARS-CoV-2 amplicons were generated using the nCoV-2019 sequencing protocol v3 (protocols.io), and the integrity of PCR amplicons was checked with a fragment analyzer (Agilent). The library preparation for sequencing was done on Biomek FX (Beckman) with the KAPA hyper prep kit for DNA (Roche). Truncated universal stub adapters were used for ligation, and indexed primers were used during PCR amplification to complete the adapters and to enrich the libraries for adapter-ligated fragments. Samples were checked for quality on a fragment analyzer. Samples were quantified by Illumina Quant qPCR Kit (Kapa Biosystems), pooled evenly by molarity and sequenced on an Illumina NovaSeq6000 150PE S4. Fastq files were further processed to produce consensus sequences and allele frequencies using trimmomatic (v.0. 39 NR-48814) were generated by inoculating Vero E6 cells with ~0.01 MOI for 3 d to create a p1 stock. The p1 stock was used to inoculate Vero E6 cells and after 3 d, with ~50% CPE, supernatants were harvested, spun (450xg for 5 min) and filtered through a 0.45 µm syringe filter. Aliquoted viruses were stored at −80 °C. Plaque assay was used to calculate the titer of infectious virus in the generated stocks. Stocks of PIV3 (PIV3-GFP, ~10 7.4 TCID 50 per ml) and RSV (RSV-GFP1, ~10 7 TCID 50 per ml) were obtained from Viratree. Stocks were then expanded in T225 flasks of 80% confluent LLC-MK2 or Hep-2 cells, respectively, by inoculating cells with 100 µl of virus stocks. The CPE was monitored daily and flasks were frozen when cells exhibited ~70% CPE, usually ~72-96 h postinfection. Lysates were thawed, collected and cell debris was spun down at 10,000xg for 10 min. The clarified virus-containing supernatants were then aliquoted and infectious viral particles were quantified with a GFP + flow limiting-dilution assay using LLC-MK2 cells. HCoV-229E, HCoV-OC43 and influenza virus strain PR8 were obtained from ATCC. Stocks of HCoV-229E, HCoV-OC43 and PR8 were amplified in MRC5, HCT8 and MDCK cells respectively. For each of these viruses, stocks were prepared by infecting a T150 flask of cells with 50 µl of ATCC stock. Flasks were frozen down once 60-70% CPE was observed. Lysates were then thawed, collected and cell debris was spun down at 10,000xg for 10 min. The clarified virus-containing supernatants were then aliquoted and infectious viral particles were quantified via TCID 50 .
Generation of replication competent VSV pseudovirus. Recombinant VSV expressing eGFP and SARS-CoV-2-S (VSVdG-eGFP-CoV-2-S) was generated as previously described 16 . In short, VSVdG-eGFP (Addgene, plasmid #31842) was modified to insert in frame with the deleted VSV-G a codon-optimized SARS-CoV-2-S based on the Wuhan-Hu-1 isolate (GenBank: MN908947.3), which was mutated to remove a putative ER retention domain (K1269A and H1271A). The virus was rescued and passaged in Huh7.5.1 cells until widespread GFP fluorescence and CPEs were observed. Virus was propagated on Vero E6 cells and titrated on Vero E6 cells overexpressing TMPRSS2. Sequencing revealed additional mutations at the C terminus (1274STOP) and a partial mutation at A372T (~50%) in the ectodomain. Similar adaptive mutations were found in a previous published VSVdG-CoV-2-S 72 .
Mucin KO mouse experiments. Mice deficient in the three major transmembrane mucins (Muc1, Muc4 and Muc16) were used to confirm a role for transmembrane mucins in SARS-CoV-2 infection in vivo. Mice were generated by breeding individual Muc1-, Muc4-and Muc16-deficient mice (C57BL/6NTac congenic) to form triple transgenic homozygous Muc1/4/16-deficient mice [73][74][75] . WT control mice were provided by C57Bl/6NTac mice bred in the same colony. C57BL/6NTac mice (15-20 weeks old), of both genders, were used for in vivo experiments. All mice were housed individually in ventilated microisolator cages in a facility maintained at the University of North Carolina at Chapel Hill, on a 12 h day/ night cycle. Mice were fed a regular chow diet and given water ad libitum until the defined experimental endpoints. Researchers were not blinded during in vivo experiments. Triple deficient mice and WT mice were treated with mouse-adapted SARS-CoV-2 (10 3 p.f.u.) according to standard protocols and lungs were harvested at 2 d postinfection and lungs processed according to approved protocols 76 . IHC for SARS-CoV-2 nucleocapsid protein (Invitrogen, catalog no. PA1-41098) was performed on paraffin-embedded 5-μm tissue sections to detect virus followed by 3,3′-diaminobenzidine (DAB) development and hematoxylin II staining (performed by Animal Histopathology & Laboratory Medicine Core at the University of North Carolina). RNA-ISH was performed on paraffin-embedded 5-μm tissue sections using the RNAscope Multiplex Fluorescent Assay v.2 according to the manufacturer's instructions (Advanced Cell Diagnostics) and described previously 77 . Briefly, tissue sections were deparaffinized with xylene and 100% ethanol twice for 5 min and 1 min, respectively, incubated with hydrogen peroxide for 10 min and in boiling water for 15 min, and then incubated with Protease Plus (Advanced Cell Diagnostics) for 15 min at 40 °C. Slides were hybridized with a custom probe for SARS-CoV-2 Spike gene (RNAScope probe SARS-CoV-2, S gene encoding the spike protein, catalog no. 848561) at 40 °C for 2 h, and signals were amplified according to the manufacturer's instructions. Stained sections were scanned and digitized using an Olympus VS200 microscope with a 20x 0.8 NA objective. Images were imported into Visiopharm Software (v.2020.09.0.8195) for quantification. Lung tissue, SARS-CoV-2 nucleocapsid protein and SARS-CoV-2 Spike gene were quantified using customized analysis protocol packages to: (1) detect lung tissue using a decision forest classifier, (2) detect the DAB signal of SARS-CoV-2 using the green channel alone and threshold set to remove background staining, and (3) detect the probe signal based on the intensity of the Cy5 signal corresponding to SARS-CoV-2 Spike gene probe. All slides were analyzed under the same conditions. All animal work was approved by the Institutional Animal Care and Use Committee at University of North Carolina at Chapel Hill according to guidelines outlined by the Association for the Assessment and Accreditation of Laboratory Animal Care and the U.S. Department of Agriculture. All infection studies were performed in animal biosafety level 3 (BSL-3) facilities at University of North Carolina at Chapel Hill.
Statistics and data collection. Biological replicates in this study are defined as distinct experiments conducted with a unique aliquot of virus and a unique passage of indicated cells. Comparisons were made by individual t-tests comparing gene-edited cell lines to a corresponding NTG control.
Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
All raw data associated with all figures are included in this submission and are also available upon request. Publicly available datasets in NCBI were used for Fig. 3 found at accession numbers: GSE145926 and GSM3660650. Publicly available datasets for Fig. 5 are found at accession numbers: GSE147507 (ref. 4 ), GSE152586 (ref. 63 ), GSE154104 (ref. 64 ) and GSE161200. SARS-CoV-2 stocks have been sequenced and raw sequencing data has been deposited to both NCBI GenBank at accession OM319524 and SRA at SRR17658563. Exact P values for all statistical comparisons are provided in Supplementary Table 7. Additional details on methodology are available in the attached supplementary note detailing the following experiments: StcE pretreatment experiments, Fluorescence-based infection tracking experiments, SARS-CoV-2 binding assay, qPCR validation of overexpressing cell lines, qPCR quantification for viral genome copies, flow cytometry, western blot, single-cell RNA meta-analysis of TMPRSS2 + ACE2 + ciliated lung epithelial cells, Bulk RNA-seq analysis of mucin gene expression in response to SARS-CoV-2 infection, RNA-seq analysis for COVID-19 clinical samples, and knockout analysis. Source data are provided with this paper. Fig. 2 | Quality control metrics for bidirectional SARS-CoV-2 host factor screens. a. Log fold-change (LFC) of top screen hits enriched in the LOF screen. Guides are shown separately, with screen replicates indicated by individual data points (log fold-change is calculated relative to day 0 pre-infection of the same cell population). n = 3 biological screen replicates. b. Log fold-change of top screen hits depleted in the gain-of-function screen. n = 4 biological screen replicates. c. Log fold-change of top screen hits enriched in the GOF screen. n = 4 biological screen replicates. Fig. 3 | SARS-CoV-2 proviral signatures revealed by depleted genes in the GOF screen. a. Protein-protein interaction network for top 100 depleted hits from the CRISPR GOF screen based on STRING analysis. Solid lines between genes indicate direct interaction, dashed lines indicate indirect connections. Nodes are color-coded by functional groups and scaled according to screen enrichment RRA score. b. Pathway analysis of top 100 depleted hits in the GOF screen indicates enriched pathways with putative antiviral roles. Circle size indicates the number of genes within each pathway, color indicates FDR of pathway enrichment. c-d. Individual guide TCID 50 validation of the effect of transcriptional upregulation of the top five putative proviral hits on SARS-CoV-2 viral titer in Calu-3 cells infected at an MOI of 0.05 for 48 hours. Each gene was targeted with two sgRNAs. Dotted line indicates the limit of detection (LOD) of the assay and average titer of NTG lines tested. Significance was determined by a two-sided t-test comparing each GOF cell line to the NTG control cells, n = 3 biological replicates. e-f. Same as (c-d) but with additional proviral hits of interest. Significance was determined by a two-sided t-test comparing each GOF cell line to the NTG control cells, n = 3. Dotted line indicates the LOD of the assay and average titer of NTG lines. g. Western blot analysis of Calu-3 ACE2 GOF cells. One western blot was conducted in this experiment, then re-probed for β-actin as a loading control. For all panels: error bars denote mean ± s.e.m., significance is indicated as: n.s.=not significant, *= P < 0.05, **= P < 0.01, ***=P < 0.001, ****= P < 0.0001.