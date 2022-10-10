Cloning SARS-CoV-2 ORFs

Two independent SARS-CoV-2 vORF collections were constructed in Gateway entry vectors. The Y2H GFP collection60 includes codon-optimized versions of all but one SARS-CoV-2 ORF (NSP11 was too short for Gateway cloning), synthesized based on a published genome61 and cloned with and without a stop codon to enable C-terminal fusions. The Y2H HIS3 entry clone collection is based on National Center for Biotechnology Information (NCBI) accession number NC_045512.2 and annotation62. Y2H HIS3 vORFs were synthesized by Twist Bioscience without codon optimization and included 5´ and 3´ linkers with SfiI restriction sites. The 5´ linker incorporates a translational start ATG flanked by BamHI sites; the 3´ linker provides a stop codon flanked by PacI and AsiSI restriction sites. For Y2H HIS3, vORFs were cloned into pENTR223.1 using SfiI restriction cloning, and the alternative ATG was removed by BamHI digest. A total of 28 vORFs were synthesized for Y2H GFP and 27 for Y2H HIS3: NSP1-16 (except NSP11), S, E, M, N and ORFs 3A, 3B, 3D, 6, 7A, 7B, 8, 9B, 9C and 10 (refs. 62,63,64) (Supplementary Table 1).

Y2H HIS3 vORF entry clones were verified by full-length Sanger sequencing. As NSP10 had a one-base deletion, it was excluded from further experiments. vORFs were moved to the destination vectors pPC86 (N-terminal AD fusion, CEN origin)3,65 and pHiDEST-DB (N-terminal DB fusion, CEN origin)4 by Gateway cloning and confirmed by PCR. For Y2H GFP, barcoded ‘prey’ (pAR068: C-terminal AD fusion, 2μ origin/pHiDEST-AD: N-terminal AD fusion, CEN origin) and ‘bait’ (pHiDEST-DB: N-terminal DB fusion, CEN origin) destination vectors were generated using published protocols4, with the integration of the barcode locus at the SacI restriction site as described26. Single colonies containing barcoded plasmids were picked, arrayed into 384-well plates with 80 μl LB agar supplemented with 100 μg ml−1 carbenicillin and 35 μg ml−1 chloramphenicol (LB + Carb + CM) per well and incubated at 37°C for 16 h. Barcode sequences were identified using a modified Kiloseq procedure66 using an Illumina NextSeq 500 and analyzed as previously described4,26,66. Y2H GFP vORFs and human ACE2 were moved by Gateway cloning into barcoded destination plasmids4,26 pHiDEST-AD (N-terminal AD fusion, CEN origin (low copy number)) and pHiDEST-DB (N-terminal DB fusion, CEN origin (low copy number)) such that each ORF was linked to two to six barcodes in every configuration. Gateway cloning was performed individually, and ORF–barcode pairs were confirmed using Sanger sequencing (TCAG, The Hospital for Sick Children) (Supplementary Table 13).

Generation of HuSCI HIS3

The Y2H HIS3 screening pipeline is essentially as previously described65. AD-Y and DB-X vORFs were transformed into yeast strains Y8800 (MATa) and Y8930 (MATα), respectively. NSP1 autoactivated as a DB fusion and was not screened in this orientation. DB-X vORFs were individually mated with 99 pools of ~188 AD-tagged human ORFs each, from human ORFeome v9.1 comprising 17,472 ORFs26,67 (hORFeome9.1). For the reverse orientation, yeast with 27 AD-Y vORFs were pooled and mated against DB-X hORFeome9.1. Primary screening in both configurations was performed twice to increase sampling sensitivity. Unless otherwise noted, all yeast incubations were at 30°C overnight without shaking.

For primary screening, saturated haploid AD-Y and DB-X yeast cultures were spotted on top of each other on yeast extract peptone dextrose (YEPD) agar (1%) plates and incubated for 24 h. Yeast were replica plated onto selective synthetic complete media lacking leucine, tryptophan and histidine (SC-Leu-Trp-His) + 1 mM 3-AT (3-amino-1,2,4-triazole)3,65 (3-AT plates) and incubated for 72 h. From growing spots up to three colonies were picked and cultured in SC-Leu-Trp liquid medium for 2 d. For second phenotyping, cultures were spotted on diploid selection plates, incubated for 2 d and replica plated on 3-AT-plates and SC-Leu-His + 1 mM 3-AT + 1 mg per liter cycloheximide plates to identify spontaneous DB-X autoactivators2. Positive scoring colonies (growth on 3-AT-plates, no growth on cycloheximide plates) were picked, and ORFs were identified by Sanger sequencing65. For threefold verification, yeast strains corresponding to the identified human interaction partners were picked from archival glycerol stocks, cultured in liquid medium and mated (as described above) one-by-one against all vORFs, processed as described above and then scored. Colony growth was scored using a custom dilated convolutional neural network68. For training, previous datasets of more than 1,500 images of biochemically and functionally validated binary Y2H studies were used3. Each image was scaled to achieve equal pixel distance between the yeast spots of different images. The images were cropped and sliced, and the mean grayscale image of all spots on a plate was calculated. With this dataset, a simple front-end prediction module was trained consisting of six dilated convolutional layers with exponential increasing dilation rate and two dense layers at the end. After each layer except the last, a Leaky-ReLU activation was added69. The model was optimized with a combination of Softmax and Cross entropy and an Adam Optimizer70. 
The model achieved an accuracy >0.9 in all folds of a tenfold cross-validation. All positive scores were confirmed by a trained researcher. The verification step was done in triplicate, and protein pairs scoring positive in at least two repeats were considered bona fide Y2H interactors. One representative colony of each interaction pair was picked from selective plates to confirm the identities of X and Y by Sanger sequencing65.
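
The triplicate verification rule (positive in at least two of three repeats) can be sketched as follows. This is a minimal illustration only; the function name and the boolean representation of repeat scores are assumptions and are not part of the published pipeline.

```python
def is_bona_fide(repeat_scores, min_positive=2):
    """Return True if a protein pair scored positive in enough repeats.

    `repeat_scores` is a sequence of booleans, one per verification repeat
    (hypothetical representation of the curated colony-growth calls).
    """
    return sum(1 for positive in repeat_scores if positive) >= min_positive


# Hypothetical calls for two pairs tested in triplicate.
print(is_bona_fide([True, True, False]))   # two of three repeats positive
print(is_bona_fide([True, False, False]))  # only one repeat positive
```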

Generation of HuSCI GFP

Barcoded ORFeomes

The barcoded human ORFeome consisting of 16,747 fully sequence-verified human ORFs with ~95% ORFs represented by two unique barcodes was previously described26. The barcoded bait and prey collections were arranged into a 10-by-10 screening matrix consisting of 10 DB and 10 AD groups, each containing ~1,400 ORFs with two distinct sets of unique barcodes, and ~200 ORFs with a single unique barcode set. Barcoded SARS-CoV-2 plasmids were transformed individually into RY3011 (AD plasmids) and RY3031 (DB plasmids) (genotypes in Supplementary Table 14). Transformed colonies were copied on fresh plates, incubated, scraped off and pooled to make glycerol stocks of all the barcoded SARS-CoV-2 ORFs plus the human ORF ACE2 in each plasmid configuration (with two or more barcodes per ORF).

Mating of pooled haploid yeast

Multiple pooled matings were performed using the frozen haploid pools. Each of the 10 human ORF pools (in C-terminal AD fusion plasmids with 2μ origin; pAR068) was separately mixed with the pool of SARS-CoV-2 ORFs plus human ACE2 (in N-terminal DB fusion plasmids with CEN origins; pHiDEST-DB). A separate mating was done between the SARS-CoV-2 pools in both AD and DB fusion, CEN origin plasmids (pHiDEST-AD, pHiDEST-DB). Negative controls were included in each mating, and all matings were calculated to achieve >100× coverage of possible barcode combinations considering viability and mating efficiency. Procedurally, equal amounts of each haploid strain were mixed, and the mixture was spread on 2x YEPD plus adenine agar plates (YPAD) and incubated for 24 h. Colonies on each mating plate were collected and re-spread across twenty 15-cm SC-Leu-Trp plates supplemented with histidine (8 mM) and incubated for 72 h. These plates were then scraped off to make assay-ready pooled diploid glycerol stocks for each of the 11 groups.

Selection of yeast with interacting pair of DB-X and AD-Y by FACS

Pooled glycerol stocks were inoculated into 1-liter flasks with a starting vCFU of 30 M and incubated at 200 rpm for 24 h. Negative controls were started as 10 ml cultures and processed in parallel. ‘Presort’ cultures were prepared for each sample (2 × 10 ml cultures at an OD 600 of 10), and doxycycline (10 μg ml−1) was added to induce barcode swapping during a 24 h incubation (ref. 4). To prepare for fluorescence-activated cell sorting (FACS), cells were concentrated by centrifugation (500 × g, 5 min) and resuspended in PBS to a final OD 600 of 10. Propidium iodide (4 mg liter−1) was added to identify dead yeast cells during FACS. Using the diploid negative control, the FACS gate for GFP-positive cells was set to capture 0.1% of GFP-negative cells, yielding a 0.01% false positive rate. Then, 100 million cells per group were sorted, and GFP-positive cells for each sample were plated on 10 SC-Leu-Trp+Ade+10x His (8 mM) plates and incubated for 72 h. Colonies were collected by scraping, centrifuged and resuspended into 2 × 10 ml cultures (OD 600 = 10). Doxycycline (10 μg ml−1) was added to induce barcode swapping, and cultures were incubated for 24 h, after which plasmid DNA was extracted. Fused barcodes were PCR amplified with primers that attach modified Illumina i5 and i7 adapters to uniquely identify each sample. Following agarose gel analysis of PCR products, the bright band at ~350 bp was purified using a NucleoSpin Gel and PCR Clean-up kit. DNA concentrations were measured for each sample using a Qubit (Invitrogen, Q32851) and, guided by DNA concentration, samples were pooled to ensure equal sequencing depth relative to the number of protein pairs tested. After primer-dimer removal, DNA was quantified by qPCR, and the pooled NGS library was sequenced on an Illumina NextSeq using a mid- or high-output 150-cycles kit.

Read counting based on expected barcodes

The sequencing data were demultiplexed using bcl2fastq2 (v2.20.0.422) provided by Illumina with the following command: ‘bcl2fastq -r 10 -p 20 -w 10 --no-lane-splitting --barcode-mismatches 1 --adapter-stringency 0.7 --ignore-missing-bcls --ignore-missing-filter --ignore-missing-positions’. After demultiplexing, the fastq files were aligned to the group-specific reference files using bowtie2 (ref. 71) with the following parameters:

For read 1: -q --norc --local --very-sensitive-local -t -p 23 --reorder.

For read 2: -q --nofw --local --very-sensitive-local -t -p 23 --reorder.

Reference files contained expected barcode sequences for the ORFs in each group. After alignments, reads with mapping quality scores <20 were removed. Following successful BFG barcode recombination4, paired-end reads map to up-up or dn-dn when an interaction is present. The number of reads mapping to up-up and dn-dn were counted separately and merged as the final read count. The pipeline was implemented in Python v2.7.
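
The counting step described above can be illustrated with a short sketch. It assumes that each aligned read has already been reduced to a (barcode identifier, tag, mapping quality) tuple, where the tag is "up" or "dn"; this layout, the function name and the choice to drop a pair when either mate falls below the quality cutoff are assumptions made for illustration, not details of the original Python 2.7 pipeline.

```python
from collections import Counter

MIN_MAPQ = 20  # reads with mapping quality below this value are removed


def count_barcode_pairs(read_pairs, min_mapq=MIN_MAPQ):
    """Count read pairs whose two mates map to up-up or dn-dn barcodes.

    `read_pairs` is an iterable of (read1, read2); each read is a tuple
    (barcode_id, tag, mapq) with tag "up" or "dn" (hypothetical layout).
    """
    up_up = Counter()
    dn_dn = Counter()
    for (bc1, tag1, mapq1), (bc2, tag2, mapq2) in read_pairs:
        # Assumption: a pair is discarded if either mate is low quality.
        if mapq1 < min_mapq or mapq2 < min_mapq:
            continue
        if tag1 == tag2 == "up":
            up_up[(bc1, bc2)] += 1
        elif tag1 == tag2 == "dn":
            dn_dn[(bc1, bc2)] += 1
        # Mixed up-dn or dn-up pairs are not counted.
    # Merge the two separately counted classes into the final read count.
    return up_up + dn_dn


example = [
    (("AD_bc1", "up", 40), ("DB_bc1", "up", 42)),
    (("AD_bc1", "dn", 30), ("DB_bc1", "dn", 25)),
    (("AD_bc1", "up", 10), ("DB_bc1", "up", 42)),  # fails the MAPQ filter
    (("AD_bc2", "up", 40), ("DB_bc1", "dn", 40)),  # mixed orientation
]
print(count_barcode_pairs(example))
```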

Interaction scoring

For virus–host interactions, we used the product of marginal frequencies of bait and prey strains4 to estimate the abundance of each diploid bait–prey strain in the presort condition (‘PreSort’). The interaction score was defined by

$$\begin{array}{lll}
IS_{ij} &=& \frac{f_{ij}^{\,\mathrm{GFP}}}{f_{ij}^{\,\mathrm{PreSort}}}\\
f_i^{\,\mathrm{PreSort}} &=& \sum\limits_j c_{ij}^{\mathrm{PreSort}} \Big/ \sum\limits_i \left[ \sum\limits_j c_{ij}^{\mathrm{PreSort}} \right]\\
f_j^{\,\mathrm{PreSort}} &=& \sum\limits_i c_{ij}^{\mathrm{PreSort}} \Big/ \sum\limits_j \left[ \sum\limits_i c_{ij}^{\mathrm{PreSort}} \right]\\
f_{ij}^{\,\mathrm{PreSort}} &=& \max \left( f_i^{\,\mathrm{PreSort}}, f_{AD}^{\,\mathrm{Floor}} \right) \times \max \left( f_j^{\,\mathrm{PreSort}}, f_{DB}^{\,\mathrm{Floor}} \right)\\
f_{AD}^{\,\mathrm{Floor}} &=& 10^{-5}, \quad f_{DB}^{\,\mathrm{Floor}} = 10^{-4}\\
f_{ij}^{\,\mathrm{GFP}} &=& c_{ij}^{\mathrm{GFP}} \Big/ \sum\limits_{ij} c_{ij}^{\mathrm{GFP}}
\end{array}$$

with the following variables: c, read count; i, AD barcode index; j, DB barcode index; f, frequency.
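
A minimal sketch of this score is given below. It assumes that read counts are held in dictionaries keyed by (AD barcode, DB barcode) and that the marginal frequency of an AD barcode i is obtained by summing over all DB barcodes j (and vice versa); the function and variable names are hypothetical.

```python
from collections import defaultdict

F_AD_FLOOR = 1e-5  # floor for AD marginal frequencies
F_DB_FLOOR = 1e-4  # floor for DB marginal frequencies


def interaction_scores(presort_counts, gfp_counts):
    """Compute IS_ij = f_ij(GFP) / f_ij(PreSort) for every sorted pair.

    Both arguments map (ad_barcode, db_barcode) to a read count.
    """
    total_pre = sum(presort_counts.values())
    total_gfp = sum(gfp_counts.values())

    # Marginal presort frequencies of each AD (i) and DB (j) barcode.
    f_ad = defaultdict(float)
    f_db = defaultdict(float)
    for (i, j), c in presort_counts.items():
        f_ad[i] += c / total_pre
        f_db[j] += c / total_pre

    scores = {}
    for (i, j), c in gfp_counts.items():
        # Expected presort abundance: product of floored marginals.
        f_pre = max(f_ad.get(i, 0.0), F_AD_FLOOR) * max(f_db.get(j, 0.0), F_DB_FLOOR)
        scores[(i, j)] = (c / total_gfp) / f_pre
    return scores


presort = {("a1", "d1"): 50, ("a1", "d2"): 50, ("a2", "d1"): 50, ("a2", "d2"): 50}
gfp = {("a1", "d1"): 90, ("a2", "d2"): 10}
print(interaction_scores(presort, gfp))
```

The floors keep a barcode that is rare or absent in the presort pool from producing an arbitrarily large score.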

For every DB barcode, we used the 960 AD null barcodes to define the thresholds leading to a 1% false positive rate. An interaction was accepted as positive only if the ORF pair interaction score was above this threshold for two or more barcode pairs. For intraviral screening, we accepted as interactions those protein pairs for which the frequency of barcode pairs was 1,000 times greater than the median frequency of the corresponding DB barcode for three or more independent barcode pairs, similar to the scoring method previously used for BFG-Y2H with HIS3-based growth selection4.
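
The thresholding logic for virus–host pairs can be sketched as follows. The sketch assumes that the 1% false positive threshold is the null score above which at most 1% of null barcodes fall, and that a pair is called when at least two barcode pairs exceed it; the exact quantile definition used in the study is not stated, so this choice and all names are assumptions.

```python
import math


def null_threshold(null_scores, fpr=0.01):
    """Return the score that at most a fraction `fpr` of null scores exceed."""
    ranked = sorted(null_scores)
    # Number of null scores allowed to lie strictly above the threshold.
    n_allowed = int(math.floor(len(ranked) * fpr + 1e-9))
    return ranked[len(ranked) - n_allowed - 1]


def is_positive(pair_scores, threshold, min_pairs=2):
    """Call an ORF pair positive if enough barcode pairs exceed the threshold."""
    return sum(1 for s in pair_scores if s > threshold) >= min_pairs


# Hypothetical null distribution for one DB barcode (960 AD null barcodes).
null = list(range(960))
thr = null_threshold(null)
print(thr, is_positive([1200.0, 980.0, 3.0], thr))
```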

Pairwise retesting

Candidate interaction pairs for HuSCI GFP were verified in a pairwise HIS3 growth-based Y2H assay as described above (Y2H HIS3 verification step), with minor modifications. Barcode replicates of candidate human AD-Y and viral DB-X were pooled prior to mating. vORFs NSP1 and NSP12 were omitted from this retesting due to DB autoactivation. After mating, colonies were replica plated on SC-Leu-Trp-His and 3AT-plates. After 72–96 h of yeast growth, these pairwise tests were scored according to the standardized scoring method used for the Y2H HIS3 screen3,65. Interaction pairs scoring ≥3 were considered bona fide Y2H interactions.

Estimating completeness using the interactome framework

Assay sensitivity (S a) is defined as the fraction of true interactions that can be detected by a given assay. Sampling sensitivity (S s) is defined as the fraction of detectable true interactions that can be recovered by the pipeline used. Overall sensitivity of a given screen S can be calculated as S = S a × S s. In pairwise settings S s = 1 and the assay sensitivity is given by the fraction of hsPRS-v1/v2 pairs that score positive. Y2H HIS3 was benchmarked previously5 and has an assay sensitivity of S a-HIS3 = 21.7%. Sampling sensitivity of Y2H HIS3 after two repeats in two orientations has been shown to be S s-HIS3 = ~60% (ref. 65), yielding a screening sensitivity of S HIS3 = S a-HIS3 × S s-HIS3 = 0.217 × 0.6 = 13%. Given that the Y2H HIS3 screen had a search space completeness of 83% (T HIS3 = 83%), the overall completion of HuSCI HIS3 is C HIS3 = T HIS3 × S HIS3 = 0.83 × 0.13 = 10.8%.

A different version of Y2H GFP using low-copy plasmids and N-terminally fused hybrid proteins (lcnY2H GFP ) was benchmarked using 84 pairs of hsPRS-v1 and 92 pairs of hsRRS-v1. Flow cytometry was used to score interactions based on percentage of singlets in GFP-positive gate, which was set using empty bait and prey constructs. In addition, lcnY2H GFP was benchmarked in a pooled setting using all possible combinations of proteins constituting 78 hsPRS-v2 and 77 hsRRS-v2 pairs supplemented with a set of 14 pairs of Y2H-positive controls defined as calibration set4. The experiment was carried out and interactions were scored as described above, except that no empirical null distribution was used. lcnY2H GFP recovered 12 out of 82 (S a-lcnGFP = 15%) hsPRS-v1 pairs when tested in a pairwise single bait–prey configuration and 8 of 92 (9%, S s-lcnGFP = 9/15 = 60%) hsPRS-v2 + calibration set pairs when tested in a pooled single bait–prey configuration, yielding S lcnGFP = S a-lcnGFP × S s-lcnGFP = 0.15 × 0.6 = 9%. It has been previously shown that using high-copy C-terminal fusions increases sensitivity by ~33% without affecting precision26. Thus, screening sensitivity of Y2H GFP was modeled from that of lcnY2H GFP as S GFP = S lcnGFP × 1.33 = 9% × 1.33 = 12%. Given that Y2H GFP covered 70% (T GFP = 70%) of all possible virus–human protein combinations, the completion level of the Y2H GFP dataset is C GFP = T GFP × S GFP = 0.70 × 0.12 = 8.4%. Only 4 out of 28 (14.2%) hsPRS-v1 pairs detected by the union of Y2H HIS3 and lcnY2H GFP were detected with both methods, indicating a high degree of orthogonality (that is, different detection profiles of the methods used). In addition, Y2H GFP implemented in this study includes further differences such as high-copy and C-terminal fusion constructs for human proteins. Therefore, we conservatively estimate 90% orthogonality between Y2H HIS3 and Y2H GFP (that is, ~90% of detected interactions are different: O HIS3+GFP = 90%). 
Thus, we estimate that the fraction of all true interactions captured by our merged interactome maps is C HIS3+GFP = (C HIS3 + C GFP ) × O HIS3+GFP ≅ (0.108 + 0.084) × 0.9 = 17.3%. Given the uncertainties associated with derivation of screening sensitivity, we estimate lower and higher bounds to be 15% (S GFP = 9%, excluding inferred gain in sensitivity due to high-copy C-terminal fusions) and 22% (S GFP = 13.5%, S s-HIS3 = 70% and O HIS3+GFP = 100%), respectively.
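
The arithmetic of the completeness estimate, including the lower and upper bounds, can be reproduced with a few lines. The function names are hypothetical, and intermediate values are not rounded here, so individual terms may differ from the rounded figures in the text in the last digit.

```python
def completion(search_space, assay_sens, sampling_sens, gain=1.0):
    """C = T x S, with S = S_a x S_s x optional sensitivity gain."""
    return search_space * assay_sens * sampling_sens * gain


def merged(c_his3, c_gfp, orthogonality):
    """Fraction of true interactions captured by the merged maps."""
    return (c_his3 + c_gfp) * orthogonality


c_his3 = completion(0.83, 0.217, 0.60)            # Y2H HIS3
c_gfp = completion(0.70, 0.15, 0.60, gain=1.33)   # Y2H GFP, modeled from lcnY2H GFP
central = merged(c_his3, c_gfp, 0.90)

# Lower bound: no gain from high-copy C-terminal fusions (S GFP = 9%).
lower = merged(c_his3, completion(0.70, 0.15, 0.60), 0.90)
# Upper bound: S GFP = 13.5%, S s-HIS3 = 70%, full orthogonality.
upper = merged(completion(0.83, 0.217, 0.70), 0.70 * 0.135, 1.00)

for name, value in [("C HIS3", c_his3), ("C GFP", c_gfp),
                    ("merged", central), ("lower", lower), ("upper", upper)]:
    print(f"{name}: {value * 100:.1f}%")
```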

Pairwise Y2H testing of previously identified SARS-CoV-1 interactions

We identified 97 unique curated binary interactions with SARS-CoV-1 and human interaction partners8 (Supplementary Table 2). For 77 of these, reagents to test interactions with SARS-CoV-2 orthologues were available in the barcoded human ORFeome. These involved 63 human proteins, 60 of which were covered by two barcode sets and three by a single barcode set. These were tested according to the ‘pairwise retesting’ protocol (above). Successful interactions were indicated by colony growth of both replicates in either condition.

Pairwise Y2H testing with SARS-CoV-2 variants

Lineage-defining mutations for the SARS-CoV-2 ‘variants of concern’ as defined by the Centers for Disease Control and Prevention (Alpha, Beta, Gamma and Delta) were obtained from CoV-Spectrum72,73 and mapped to the SARS-CoV-2 reference genome (NCBI accession number NC_045512.2). To generate variant ORFs, Y2H HIS3 plasmids were used as template for mutation PCR (primers in Supplementary Table 12). Mutation PCR reaction products were transformed and sequence verified. Plasmids containing the desired mutation were directly transformed into yeast and processed in pairwise mating as described above. A complete list of mutations generated is shown in Supplementary Table 12. SARS-CoV-2 proteins for which interactions were identified in AD-fusions (N and E) were tested only against the identified interactors. All other variant proteins were tested against all HuSCI interactors. In total, 19 individual mutations in 14 unique variant proteins from 9 different viral proteins were tested. Four proteins with 8 cloned variants had interactors in HuSCI HIS3 , 1 protein with a single cloned variant had interactors in HuSCI GFP and 4 proteins with 5 variants had no HuSCI interactors.

yN2H validation

Using Gateway cloning, ORFs from the indicated subsets (Supplementary Table 3) were transferred into pDEST-N2H plasmids (pDEST-N2H-N1, -N2, -C1, and -C2) containing a LEU2 (N1/C1 vectors) or a TRP1 (N2/C2 vectors) auxotrophy marker and transformed into haploid Saccharomyces cerevisiae Y8800 (MATa) and Y8930 (MATα) strains. For cross-plate calibration, two protein pairs from the hsPRS-v2, with different N2H signal intensities, were included in duplicate on every plate (NCBP1/NCBP2 and SKP1/SKP2). Virus–human protein pairs were randomly distributed across the plates and tested together with hsPRS-v2/hsRRS-v2, which were in separate plates.

Overnight-grown haploid cultures were mated by mixing 5 μl of each haploid strain in 160 μl YEPD medium followed by overnight incubation. To measure background, all interactor ORFs were also mated with yeast with empty F1 or F2 plasmids. After mating, 10 μl culture each was inoculated into 160 μl SC-Leu-Trp and grown overnight, and then 50 μl was reinoculated into 1.2 ml SC-Leu-Trp and incubated for 24 h while shaking at 900 rpm. Cells were harvested (6,000 x g, 15 min), and the supernatant was discarded. Each yeast cell pellet was fully resuspended in 100 μl NanoLuc Assay solution6. Homogenized solutions were transferred into white flat-bottom 96-well plates and incubated in the dark (for 1 h at room temperature). Luminescence was evaluated for each sample with 2 s integration time. To score X–Y protein pairs, a normalized luminescence ratio (NLR) was calculated corresponding to the raw luminescence value of the tested pair (X-Y) divided by the maximum luminescence value from one of the two controls (X-Fragment 2 or Fragment 1-Y)6. The 1% RRS threshold was based on the vhRRS and determined using the R quantile function.
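
A minimal sketch of the NLR calculation and of a 1% RRS threshold follows. It assumes the threshold is the 99th percentile of the NLR values of the random reference set, computed with linear interpolation between order statistics (Python's "inclusive" method, chosen to mirror R's default quantile behavior; this equivalence should be verified before relying on it). All names are hypothetical.

```python
from statistics import quantiles


def nlr(lum_xy, lum_x_empty, lum_empty_y):
    """Normalized luminescence ratio: pair signal over the larger control."""
    return lum_xy / max(lum_x_empty, lum_empty_y)


def rrs_threshold(rrs_nlrs, percent=99):
    """NLR value below which `percent`% of random reference pairs fall."""
    # quantiles(..., n=100) returns the 99 percentile cut points.
    return quantiles(rrs_nlrs, n=100, method="inclusive")[percent - 1]


# Hypothetical luminescence readings for one pair and its two controls.
print(nlr(1000.0, 10.0, 50.0))
print(rrs_threshold(list(range(101))))
```

A tested pair would then be scored positive when its NLR exceeds the threshold derived from the reference set.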

Enrichment of previously known, phospho-regulated or RNA-binding host targets

From IntAct8 (version: April 28, 2020), 2,151 human proteins reported to have binary interactions with any virus protein were defined as ‘previously known host targets’. Of these ORFs, 2,005 were interrogated by our experiment and considered further. HuSCI contained 61 previously known host targets. A total of 2,254 human proteins whose phosphorylation changes upon SARS-CoV-2 infection were identified from A549 and Vero E6 cell lines9,10, of which 2,007 were interrogated by our experiment and 37 are in HuSCI. In addition, 139 experimentally identified human proteins specifically bound to SARS-CoV-2 RNA (vRICs) and 335 human proteins with altered RNA-binding activity upon SARS-CoV-2 infection (cRICs) were obtained from a recent RNA-interactome study11. Of these, 121 vRICs and 294 cRICs were interrogated by our experiment; 5 HuSCI proteins were vRICs, and 13 HuSCI proteins were cRICs. All the observations were tested for enrichment using Fisher’s exact tests and by permutation tests with 10,000 permutations.
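
An enrichment test of this kind can be sketched with the hypergeometric tail probability, which is the one-sided form of Fisher’s exact test. Whether the original analysis was one- or two-sided is not stated, so the sidedness is an assumption; the sizes of the search space and of the HuSCI host protein set used in the example are placeholders, not values from this study.

```python
from math import comb


def fisher_enrichment_p(k, n_set, n_hits, n_total):
    """One-sided P(X >= k) for X ~ Hypergeometric(n_total, n_set, n_hits).

    k       : annotated proteins among the hits (for example, known targets in HuSCI)
    n_set   : annotated proteins in the search space
    n_hits  : number of hits (for example, HuSCI host proteins)
    n_total : number of proteins in the search space
    """
    upper = min(n_set, n_hits)
    numerator = sum(
        comb(n_set, x) * comb(n_total - n_set, n_hits - x)
        for x in range(k, upper + 1)
    )
    return numerator / comb(n_total, n_hits)


# Placeholder sizes: 17,000 searched proteins and 170 hits are assumptions.
print(fisher_enrichment_p(k=61, n_set=2005, n_hits=170, n_total=17000))
```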

GO enrichment analysis

gProfiler74 (database versions: Ensembl 104, Ensembl Genomes 51 and Wormbase ParaSite 15) was applied to identify enriched functional categories in HuSCI, AP-MS9,12,13,14,15 and BioID studies16,17,18. The hORFeome9.1, which was used for contactome mapping, served as the background for HuSCI, otherwise the universal annotated human genes. ‘Inferred from electronic annotations’ annotations were excluded. Adjusted P values were calculated using the Benjamini–Hochberg procedure. Functional terms with a hypergeometric P < 0.05 and term size between 5 and 1,000 were collected and enrichment calculated as the ratio between observed and expected gene counts. To categorize HuSCI host proteins, five meta categories inspired by the functional enrichment analysis results were used, namely ‘immune response’ (GO:0006955), ‘viral process’ (GO:0016032), ‘protein ubiquitination’ (GO:0016567), ‘cytoskeleton’ (GO:0005856) and ‘vesicle-mediated transport’ (GO:0016192). Human proteins related to these categories were obtained from the AmiGO 2 (ref. 75) (July 2021), and HuSCI host proteins were categorized based on their annotation to these meta categories.

Domain enrichment of host interacting proteins

Structural domains in human targets were identified from Pfam release 34.0 (ref. 76) (March 2021). Interactions of viral proteins with human interactors that have common domains were defined as shared-domain interactions and counted for HuSCI. The procedure was repeated for 1,000 randomized HuSCI networks (degree-preserved random rewiring). The significance of every viral protein–human domain association was assessed by Fisher’s exact tests (Supplementary Table 6) using the number of V-D, V-!D, !V-D and !V-!D interacting pairs, in which V and D correspond to the viral protein and human domain of interest, and !V and !D to the rest of the viral proteins and domains in the HuSCI network, respectively. We identified as enriched associations those with at least two V-D interactions and P < 0.05. We repeated the process for 1,000 randomized HuSCI networks (see above). Multiple domain copies in a given human protein were counted once.

NF-κB reporter assays

HEK293 (RRID: CVCL_0045, DSMZ) were cultured in complete DMEM (high glucose) supplemented with 10% fetal calf serum, 100 U ml−1 penicillin and 100 µg ml−1 streptomycin and maintained in a humidified atmosphere at 5% CO2 at 37°C. For the reporter assay, 1 × 10⁶ HEK293 cells were seeded in a 60-mm cell culture dish one day before transfection. Transfection was done using the calcium phosphate protocol using 10 ng NF-κB reporter plasmid (6 × NF-κB firefly luciferase pGL2), 50 ng pTK reporter (Renilla luciferase) and expression vectors (Flag-IKKb (pRK5), Flag-A20 (pEF4) and SARS-CoV-2 constructs (pMH)) using a total of up to 6 μg DNA. Briefly, the DNA was diluted in 200 µl 250 mM CaCl2 solution (Carl Roth, 5239.1), vortexed and added dropwise to 200 µl 2 × HBS (50 mM HEPES (pH 7.0), 280 mM NaCl, 1.5 mM Na2HPO4 × 2 H2O, pH 6.93) while gently vortexing. After 15-min incubation at room temperature, the mix was added dropwise to cell culture dishes. Transfection media was replaced after 6-h incubation with complete DMEM. Then, 24 h after transfection, cells were stimulated with 20 ng ml−1 TNF-α for 4 h. Luciferase activity was measured using the dual luciferase reporter kit (Promega, E1980) according to the manufacturer’s protocol. The firefly and Renilla luminescence was determined with a luminometer (Berthold Centro LB960 microplate reader, software MikroWin 2010) and quantified in relative light units (RLU). NF-κB induction was specified as the ratio of firefly luminescence (RLU) to Renilla luminescence (RLU). Significance of relative NF-κB transcriptional activity was assessed via one-way ANOVA with Dunnett’s multiple comparisons. Data evaluation was performed in GraphPad Prism v7.04.

Protein expression was verified by western blot of lysates. Briefly, proteins were separated by SDS-PAGE and transferred on polyvinylidene fluoride membranes. Membranes were blocked with 5% milk in 1 × PBS + 0.1 % Tween-20 (PBS-T) for 1 h at room temperature. Primary antibodies in 2.5% milk in PBS-T were incubated overnight at 4°C, the membranes were washed three times with PBS-T and secondary antibodies were incubated (1.25% milk/PBS-T) for 1 h at room temperature. Anti-actin beta (SCBT, sc-47778), anti-FLAG M2 (Sigma-Aldrich, F3165) and anti-HA (Sigma-Aldrich, 11583816001, RRID:AB_514505) were used at a 1:1,000 dilution. Secondary antibody (Jackson ImmunoResearch, Jim-715-035-150) was used at a 1:10,000 dilution. For detection of horseradish peroxidase-catalyzed enhanced chemiluminescence, LumiGlo reagent (CST, 7003S) was used.

For generation of IKBKG KO HEK293 cells, oligonucleotides coding sgRNAs targeting exon 3 (5′-TGCATTTCCAAGCCAGCCAG-3′) or exon 2 (5′- GCTGCACCATCTCACACAGT-3′) were cloned into px458 (Addgene, 48138). HEK293 were transfected with 5 µg plasmid by standard calcium phosphate transfection. After one day, GFP-positive cells were sorted with a MoFlo cell sorter (Beckman Coulter, Cytomation) and seeded in 96-well plates at dilutions of 0.5–5.0 cells per well. Single-cell clones were expanded and screened for loss of IKBKG expression by western blot (RRID: AB_2124846). IKBKG-negative clones were verified by amplifying and sequencing a region of genomic DNA encompassing the sites targeted by PCR (exon 3: forward primer 5′-CTGGCCAACACGTACTTTTA-3′, reverse primer 5′-GGTTACGGTGAGCGAAGGCTC-3′; exon 2: forward primer 5′- CTGACATCTCCCTCCACAAAC-3′ and reverse primer 5′-GGAGCTGGAATGAACCTTCC-3′).

Functional effects on viral replication

Selection of host-target candidates

To evaluate if identified host targets are involved in viral replication, the following HuSCI proteins involved in host immune regulation77 and viral life cycle regulation51,78,79,80 by enriched GO terms in this study were selected: G3BP1, G3BP2, TRAF2, USP25, EIF2AK2, REL, IKBKG and KLC1.

Engineering of hACE2-expressing cells

A549 cells were seeded at 5 × 10⁵ cells per well in six-well cell culture plates and cultured in DMEM with 10% FCS and 1% penicillin/streptomycin at 37°C and 5% CO2 (standard media). After 24 h, culture medium was replaced by fresh medium containing 4.5 × 10⁷ transduction units hACE2 lentivirus per well and incubated for 4 h at 37°C and 5% CO2. The lentiviral inoculum was then replaced with 2 ml DMEM 10% FCS and 1% penicillin/streptomycin. After 24 h, the transduction was repeated with the same steps as above. Cell surface expression of hACE2 was monitored by FACS using the AttuneNxT Flow Cytometer (Thermo Fisher Scientific) and results were analyzed with FlowJo v10 Software (BD Life Sciences). The resulting cells are referred to as A549-hACE2.

Generation of KO cell lines

KO cells were generated using the target-specific CRISPR-Cas9-HDR (homology-directed recombination) KO directed technology developed by Santa Cruz Biotechnology, which enables selection of KO cells with puromycin and red fluorescent protein (Supplementary Table 15). Briefly, A549-hACE2 cells were seeded at 2.5 × 10⁶ cells in T25 flasks and standard media. After 24 h, cells were cotransfected with 7.5 µg each of KO and HDR plasmids for the previously described targets and 15 µg KO plasmid for the mock KO, from Santa Cruz Biotechnology using FuGene (Promega, E2312). After 72 h, KO cells were selected with 2 µg ml−1 puromycin (Invivogen, ant-pr-1) for 3 d, and mock KO cells were treated with the same volume of Hepes solution (Sigma-Aldrich, 51558). One week later, red fluorescent protein-positive cells were sorted by flow cytometry. DNA from 2 × 10⁶ cells was extracted and the region of interest was amplified for each KO, except KLC1, in a 25-µl PCR using 50 ng genomic DNA and using one primer in the genomic DNA and one primer in the insert (primers are listed in Supplementary Table 15). KLC1 KO was verified by amplifying the sg-directed Cas9 region that had no corresponding HDR with one primer on each side of the region; the PCR product was purified using Nucleospin Gel and PCR Clean-up (Macherey-Nagel, 11992242) and KO confirmed by Sanger sequencing.

Assessment of SARS-CoV-2 infection in A549-hACE2 KO versus wild-type cells

Wild-type and KO A549-hACE2 cells were seeded at 1 × 10⁶ cells per well in 12-well plates and standard media. After 24 h, cells were infected at a multiplicity of infection (MOI) of 10⁻³ with SARS-CoV-2 isolate hCoV19/France/GE1973/2020 (n = 3, biological replicates). Total RNA was extracted from infected cells at 72 h after infection, and SARS-CoV-2 replication was assessed by RT-qPCR using Orf1ab primers (5′-ATGAGCTTAGTCCTGTTG-3′; 3′-CTCCCTTTGTTGTGTTGT-5′) (n = 9, three technical replicates per biological replicate). GAPDH was used for normalization. Viral RNA was quantified according to the ∆∆Ct standard method81. The effect of gene KO on viral replication was determined using the wild-type ORF1ab RNA level as a control, as shown in the following equation: 2^(−∆∆Ct) = 2^(−(∆Ct KO − ∆Ct WT)). Significance of the KO effect was calculated against the mock KO using a Kruskal–Wallis test (ordinary one-way nonparametric ANOVA) with Dunn’s multiple comparisons test using GraphPad Prism v9.
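
The ∆∆Ct calculation can be illustrated with a short sketch. The Ct values in the example are invented for illustration, and the function name is hypothetical.

```python
def relative_viral_rna(ct_orf1ab_ko, ct_gapdh_ko, ct_orf1ab_wt, ct_gapdh_wt):
    """Return 2^(-ddCt) for a KO line relative to wild type.

    dCt  = Ct(ORF1ab) - Ct(GAPDH), computed separately for KO and WT;
    ddCt = dCt(KO) - dCt(WT).
    """
    d_ct_ko = ct_orf1ab_ko - ct_gapdh_ko
    d_ct_wt = ct_orf1ab_wt - ct_gapdh_wt
    dd_ct = d_ct_ko - d_ct_wt
    return 2 ** (-dd_ct)


# Invented Ct values: ORF1ab is detected two cycles later in the KO line.
print(relative_viral_rna(26.0, 18.0, 24.0, 18.0))
```

A value below 1 indicates less viral RNA in the KO line than in wild-type cells, and a value above 1 indicates more.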

Assessment of the viability of the KO cell lines

A total of 8.0 × 10⁵ cells of each KO cell line were seeded in a white 96-well plate and incubated at 37°C and 5% CO2 for 24 h. Cell media was replaced with DMEM and incubated at 37°C and 5% CO2 for 72 h. Cell viability was measured using Cell Titer-Glo Luminescent Cell Viability Assay kit (Promega, G7750). Luminescence was measured on a Centro XS luminometer (Berthold; integration time, 0.5 s). Wild-type cells served as the reference, and significance of cell viability was calculated against the mock KO using a Kruskal–Wallis test (ordinary one-way nonparametric ANOVA) with Dunn’s multiple comparisons test using GraphPad Prism v9.

Genes ranked by number of publications

Publication counts are derived from the gene2pubmed file from NCBI, downloaded on 16 November 2021. Only protein-coding genes were considered. For visualization, but not statistical assessment, of genes with equal numbers of publications, order was determined by random shuffling. P values were calculated by Mann–Whitney U test, with Bonferroni correction. Black dots indicate the mean; error bars represent the 95% confidence interval generated from 1,000 bootstrap samples.
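
A bootstrap confidence interval for the mean can be sketched as follows. The percentile method and the fixed random seed are assumptions; the text states only that 1,000 bootstrap samples were used.

```python
import random
from statistics import mean


def bootstrap_ci_mean(values, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    boot_means = sorted(mean(rng.choices(values, k=n)) for _ in range(n_boot))
    lower = boot_means[int((alpha / 2) * n_boot)]
    upper = boot_means[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical publication counts for a small gene set.
counts = [3, 12, 7, 150, 42, 8, 19, 5, 61, 27]
print(mean(counts), bootstrap_ci_mean(counts))
```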

Tissue specificity analysis

The Tissue Atlas dataset was obtained from the HPA database21 (version 2021.04.09). The HPA categories ‘tissue enriched’, ‘group enriched’ and ‘tissue enhanced’ were combined as ‘tissue-specific’, ‘low tissue specificity’ was denoted as ‘common’ and the ‘not detected’ category was not included in this analysis. A total of 11,069 of 19,670 genes (56.3%) in the HPA dataset were defined as tissue specific, and 8,385 of 19,670 genes (42.6%) showed common expression profiles. Tissue distribution differences were determined using Fisher’s exact test with Bonferroni correction.

SARS-CoV-2 organotropism data were obtained from post mortem examinations22,82. The RNA tissue-specific NX value (normalized transcripts per million) was extracted and used to denote whether the gene is specifically expressed in a given tissue. Tissues from the Tissue Atlas were combined into organ systems and used to assess host-target tissues. Significance was evaluated by Fisher’s exact test with Bonferroni correction.

Identification of genetic variation in host targets and network communities

Host network communities were identified using the OCG hierarchical community clustering algorithm on the Human Reference Interactome26,83 as implemented in the linkcomm R package (v1.0-13), using ‘centered cliques’ as the initial class system84. A total of 3,603 communities with a minimum size of 4 were found, of which 204 contained a significant number of virus interactors (that is, were significantly targeted) (nominal P < 0.05, Fisher’s exact test; Supplementary Table 8). A community was annotated to a function if a GO term was enriched (FDR < 0.05) or if ≥20% or ≥30% of the annotated constituent proteins shared an annotation85 (Supplementary Table 8). From AP-MS-based association studies9,12,13,14,15, 57, 43, 18 and 17 significantly targeted communities were found, respectively (nominal P < 0.05, Fisher’s exact test; Supplementary Table 8).

Uniformly processed GWAS summary statistics were downloaded for 114 traits from the GTEx GWAS analysis41,86. MAGMA87 analysis was implemented in R 3.6.1 and consists of three steps. First, GWAS summary statistics across all single-nucleotide polymorphisms (SNPs) within a gene region are aggregated into a gene-level association P value. Next, the gene-level P value is transformed to a z-score (using the inverse normal cumulative distribution function). Finally, z-scores across all genes are modeled, using a linear model, as a function of gene set membership and the default gene-level covariates (gene size in number of SNPs, gene density (a measure of within-gene linkage disequilibrium) and the inverse mean minor allele count). Association between gene set membership and GWAS z-scores is tested based on the null hypothesis beta = 0 for the coefficient associated with the gene set membership indicator variable. All targets, and the targeted network communities, were considered gene sets. Entrez gene IDs were used with human genome assembly 38. Individual MAGMA analyses were performed for each trait based on summary statistics and linkage disequilibrium structure from the 1000 Genomes European reference panel, always conditioning on default gene-level covariates (for example, gene length). For each gene set, standard error-normalized beta coefficients constituted the association score, with larger values indicating a greater chance of a significant association. Following Benjamini–Hochberg multiple hypothesis correction, gene set–trait associations with FDR < 0.05 were selected. These pairs were subjected to follow-up analysis. SNPs localizing within genes of enriched gene sets were selected, and genes containing SNPs with GWAS P < 5.0 × 10⁻⁸ for the enriched traits were considered ‘GWAS hits’. As a control, the analysis was repeated for the 3,399 network communities that were not significantly targeted (Supplementary Table 8).
For both targeted and non-targeted communities, the probability of observing traits that are linked to COVID-19 outcomes was assessed. A literature survey identified 35 traits clinically linked to COVID-19 (score 2 in Supplementary Table 8), 18 ‘related to immune function’ and 61 without connection. For the enrichment analysis we focused on the ‘COVID-linked’ traits; traits ‘related to immune function’ are also indicated in Fig. 3. Finally, Fisher’s exact test was used to assess the significance of traits being linked to COVID-19 (score 2) versus not (scores 0 and 1) among traits associated with non-virus-targeted communities (P = 0.5) versus virally targeted communities (P = 0.01). For the control analysis of AP-MS-targeted communities, only genetic variation related to COVID-19 severity was evaluated. The contactome-targeted communities with significant GWAS trait associations were numbered 1–31.
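
Two generic steps of the gene set analysis described above, the P value to z-score transformation and the Benjamini–Hochberg selection, can be sketched in Python. This is not MAGMA itself: the function names and P values are hypothetical, and the convention z = Φ⁻¹(1 − P), under which small P values give large positive z-scores, is assumed here.

```python
from statistics import NormalDist

def p_to_z(p_value):
    """Map a gene-level P value to a z-score via the inverse normal CDF.

    Assumed convention: z = inv_cdf(1 - p). Requires 0 < p < 1.
    """
    return NormalDist().inv_cdf(1.0 - p_value)

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical gene set-trait association P values.
p_values = [0.01, 0.04, 0.03, 0.005]
fdr = benjamini_hochberg(p_values)
selected = [i for i, q in enumerate(fdr) if q < 0.05]
print(fdr, selected)
```

With these four hypothetical P values, every adjusted value falls below 0.05, so all four associations would be selected.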

Small-molecule inhibition

Remdesivir (Bio-Techne, 7226/10) and the USP25/28 inhibitor AZ1 (Bio-Connect, HY-117370-5mg) were dissolved in DMSO. HEK293-ACE2 and Vero E6 cells (3 × 10⁴ cells per well) were plated in white 96-well plates. After 24 h, cells were infected with SARS-CoV-2 (ref. 54) containing a nanoluciferase reporter (MOI of 0.01) and treated with the compounds in a 12-point twofold dilution series covering 0–10 µM. Each condition was done in triplicate, except for AZ1, which was done in quadruplicate for HEK293-ACE2 and in a single replicate for Vero E6. Cells were cultured for 24 h, and luminescence was quantified88. Cell viability was measured using the CellTiter-Glo Luminescent Cell Viability Assay kit (Promega, G7750). EC₅₀ values were calculated via the variable slope model in GraphPad Prism v9.
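
The dilution series and the shape of a variable slope dose–response model can be illustrated as follows. This sketch makes two assumptions not stated in the text: that the 0 µM (vehicle-only) condition counts as one of the 12 points, and that the model is a generic four-parameter logistic written in an inhibitory form, which may differ from the exact parameterization used by Prism. All function names and parameter values are hypothetical, and no curve fitting is performed.

```python
def twofold_dilution_series(top_conc, n_points):
    """Top concentration halved repeatedly, ending with a 0 (vehicle) point.

    Assumption: the zero-compound control is the last of the n_points.
    """
    return [top_conc / 2 ** i for i in range(n_points - 1)] + [0.0]

def variable_slope_response(conc, bottom, top, ec50, hill):
    """Four-parameter logistic, inhibitory form.

    Response equals `top` at conc = 0, falls toward `bottom` at high conc,
    and is halfway between the two at conc = ec50.
    """
    return bottom + (top - bottom) / (1.0 + (conc / ec50) ** hill)

# Hypothetical parameters: 100% signal untreated, EC50 of 1 uM, Hill slope 1.5.
series = twofold_dilution_series(10.0, 12)
for conc in series:
    print(conc, variable_slope_response(conc, 0.0, 100.0, 1.0, 1.5))
```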

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.