Long-term effects of Omicron BA.2 breakthrough infection on immunity-metabolism balance: a 6-month prospective study

Long coronavirus disease (long COVID) and breakthrough infections (BTIs) have both been reported; however, the mechanisms and pathological features of long COVID after Omicron BTIs remain unclear. Assessing the long-term effects of COVID-19 and immune recovery after Omicron BTIs is crucial for understanding the disease and managing new-generation vaccines. Here, we followed up convalescents from mild BA.2 BTI for six months with routine blood tests, proteomic analysis and single-cell RNA sequencing (scRNA-seq). We found that major organs exhibited transient dysfunction and returned to normal approximately six months after BA.2 BTI. We also observed durable and potent levels of neutralizing antibodies against major circulating sub-variants, indicating that hybrid humoral immunity remains active. However, proteomic analyses suggest that platelets may take longer to recover, and also show a coagulation disorder and an imbalance between anti-pathogen immunity and metabolism six months after BA.2 BTI. The immunity-metabolism imbalance was then confirmed by retrospective analysis of abnormal hormone levels, low blood glucose levels and the coagulation profile. The long-term coagulation dysfunction and the imbalance between material metabolism and immunity may contribute to the development of long COVID and serve as useful indicators for assessing recovery and long-term impacts after Omicron sub-variant BTIs.


Statistics
For all statistical analyses, confirm that the following items are present in the figure legend, table legend, main text, or Methods section.

The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement
A statement on whether measurements were taken from distinct samples or whether the same sample was measured repeatedly
The statistical test(s) used AND whether they are one- or two-sided
Only common tests should be described solely by name; describe more complex techniques in the Methods section.
A description of all covariates tested
A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons
A full description of the statistical parameters including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient) AND variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals)
For null hypothesis testing, the test statistic (e.g. F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted
Give P values as exact values whenever suitable.
For Bayesian analysis, information on the choice of priors and Markov chain Monte Carlo settings
For hierarchical and complex designs, identification of the appropriate level for tests and full reporting of outcomes
Estimates of effect sizes (e.g. Cohen's d, Pearson's r), indicating how they were calculated

Our web collection on statistics for biologists contains articles on many of the points above.

Data analysis
Prof. George Fu Gao, Prof. Yueyun Ma and Dr. Xin Zhao
Feb 21, 2024

Proteomics data: After fractionation, the peptides were lyophilized and separated using a C18 column (25 cm × 75 µm) on an EASY-nLC™ 1200 system (Thermo Fisher, Waltham, MA, USA). The flow rate was 300 nL/min and the linear gradient was set accordingly. The 4D-DIA mass spectrometry (MS) data for the library were acquired via the PASEF method as follows: MS data were collected over an m/z range of 100 to 1700; during MS/MS data collection, each TIMS cycle lasted 1.1 s and included 1 MS and 10 MS/MS TIMS scans of 100 ms each; in each of the 10 PASEF MS/MS scans an average of 12 precursors were selected, resulting in an MS/MS data acquisition rate of 109 Hz. For the DIA, 56 DIA windows were acquired (automatic gain control target 3e6 and auto for injection time), and the collision energy was ramped linearly as a function of mobility from 59 eV at 1/K0 = 1.6 Vs/cm² to 20 eV at 1/K0 = 0.6 Vs/cm². The MS/MS spectra were recorded from 100 to 1700 m/z.

Single-cell sequencing: The single-cell suspension was prepared in water for cDNA library amplification using the 10× Genomics Chromium Next GEM Single Cell 5′ Reagent Kits (version 2.0; Cat. No. 1000165). The Chromium™ Single Cell 5′ Library Construction Kit (Cat. No. 1000020) was used to construct the DNA library. The constructed library was then sequenced using PE150 sequencing on an Illumina NovaSeq 6000 platform. T cell V(D)J and B cell V(D)J enrichment was performed using the 10× Genomics Chromium™ Single Cell V(D)J Enrichment Kit, Human T Cell (Cat. No. 1000005) and Human B Cell (Cat. No. 1000016), according to the manufacturer's instructions. The libraries were amplified using a Chromium TCR amplification kit (Cat. No. 1000252) and BCR amplification kit (Cat. No. 1000253), again according to the manufacturer's instructions.
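The acquisition figures quoted above can be cross-checked with a little arithmetic. The sketch below is illustrative only and is not the instrument's own software: it assumes the collision energy is a straight line through the two stated points (20 eV at 1/K0 = 0.6 and 59 eV at 1/K0 = 1.6), and that the MS/MS rate is simply the number of precursors fragmented per 1.1 s cycle. The function names are hypothetical.

```python
# Illustrative check of the acquisition parameters quoted in the text.
# Function names are hypothetical; only the numbers come from the text.

def collision_energy_ev(inv_k0, low=(0.6, 20.0), high=(1.6, 59.0)):
    """Linearly interpolate collision energy (eV) as a function of 1/K0."""
    (x0, e0), (x1, e1) = low, high
    slope = (e1 - e0) / (x1 - x0)  # eV per unit of 1/K0
    return e0 + slope * (inv_k0 - x0)


def msms_rate_hz(n_msms_scans=10, precursors_per_scan=12, cycle_time_s=1.1):
    """Precursors fragmented per second over one full TIMS cycle."""
    return n_msms_scans * precursors_per_scan / cycle_time_s


if __name__ == "__main__":
    for inv_k0 in (0.6, 1.1, 1.6):
        print(f"1/K0 = {inv_k0}: {collision_energy_ev(inv_k0):.1f} eV")
    print(f"MS/MS rate: {msms_rate_hz():.1f} Hz")
```

With 10 MS/MS scans of 12 precursors each per 1.1 s cycle, the rate works out to 120 / 1.1, which rounds to the 109 Hz stated in the text.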
nature portfolio | reporting summary
April 2023

For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors and reviewers. We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Portfolio guidelines for submitting code & software for further information.

Proteomics data: The default factory settings were used for the Spectronaut Pulsar™ 15.3.210906.50606 (Biognosys, Switzerland) search and library generation (including trypsin/P as the enzyme, up to two missed cleavages allowed, oxidation of Met as a variable modification, carbamidomethyl as a fixed modification, and 1% FDR for PSM, peptide, and protein identification). The DDA search results were imported into Spectronaut Pulsar™. The DIA data were analyzed with Spectronaut by searching against the spectral library constructed above. The main parameters of the software were set as follows: the precursor Q-value cutoff and protein Q-value cutoff were set to 0.01, the Normalization Strategy was set to Local Normalization, and MS2 was used as the Quantity MS-Level. Differentially expressed proteins (DEPs) were identified using thresholds of fold change (>1.2 or <1/1.2), p-value (P) < 0.05 and q-value < 0.25. All identified proteins were annotated using GO (http://www.blast2go.com/b2ghome; http://geneontology.org/) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses (http://www.genome.jp/kegg/). The DEPs were then used for GO and KEGG enrichment analyses. Protein-protein interaction analysis was performed using the STRING (https://string-db.org/) software.
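The thresholds for differentially expressed proteins stated above (fold change > 1.2 or < 1/1.2, P < 0.05, q < 0.25) can be written as a simple filter. This is a minimal sketch, not the Spectronaut workflow itself: the function name and the example values are hypothetical, and it assumes all three criteria must hold at once.

```python
# Minimal sketch of the DEP thresholds; not the authors' pipeline.
FC_CUTOFF = 1.2   # up if > 1.2, down if < 1/1.2
P_CUTOFF = 0.05   # p-value must be below this
Q_CUTOFF = 0.25   # q-value must be below this


def classify_protein(fold_change, p_value, q_value):
    """Return 'up', 'down', or None for a protein that is not a DEP."""
    if p_value >= P_CUTOFF or q_value >= Q_CUTOFF:
        return None
    if fold_change > FC_CUTOFF:
        return "up"
    if fold_change < 1 / FC_CUTOFF:
        return "down"
    return None


if __name__ == "__main__":
    # Hypothetical example values, for illustration only.
    examples = {
        "protein_A": (1.5, 0.01, 0.10),
        "protein_B": (0.5, 0.01, 0.10),
        "protein_C": (1.1, 0.01, 0.10),
        "protein_D": (2.0, 0.20, 0.10),
    }
    for name, values in examples.items():
        print(name, classify_protein(*values))
```

Expressing the down-regulation cutoff as `1 / FC_CUTOFF` keeps the two directions symmetric on a log scale, which matches the ">1.2 or <1/1.2" wording.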

Single-cell sequencing: Single-cell expression data generation
The FastQC software was used to assess the quality of the raw sequencing data. The raw data were mapped to the human reference genome (GRCh38, https://cf.10xgenomics.com/supp/cell-exp/refdata-gex-GRCh38-2020-A.tar.gz) using CellRanger, a 10x Genomics software that labels different mRNA molecules within each cell by identifying the barcode and UMI sequences for single-cell transcriptomic quantification.

Single-cell immune repertoire data generation
The UMI screening standard was support by 400 paired reads. Reads that passed the mapping and UMI standards were used for contig assembly. Validity screening was also conducted on the barcodes, and annotations were provided to remove erroneous information caused by artifacts. The assembled contigs were annotated and screened further to obtain a consensus sequence supported by the sample CDR3. Clonotyping was performed based on the CDR3 amino acid sequences of the consensus sequences obtained.

Single-cell data analysis
Single-cell data were integrated and clustered using the Seurat R package (version 4) (https://satijalab.org/seurat/). A total of 124,541 cells were obtained from single-cell sequencing of the nine samples, and 108,306 cells remained after quality control. Quality control was conducted as follows: cells with a mitochondrial gene ratio exceeding 10% were removed, and only cells with gene numbers ranging from 500 to 4,500 and UMI numbers ranging from 800 to 16,000 were retained. The DoubletFinder R package (https://github.com/chris-mcginnis-ucsf/DoubletFinder) was used to remove potential doublets, and remaining potentially marginalized doublets were removed manually based on known classic markers. The filtered data were then standardized and normalized, and principal component analysis was performed on the top 2,000 genes with the highest coefficients of variation. The Harmony R package (https://github.com/immunogenomics/harmony) and the anchor module of Seurat were used to remove batch effects between the samples and groups for cell clustering. Based on the elbow point and the significance of the different principal components, the top 30 PCs were selected for subsequent cell clustering, and different resolutions were tested to determine the cell clusters. Dimensionality reduction and visualization of single cells were performed using Uniform Manifold Approximation and Projection (UMAP). The specific Seurat workflow can be found in the online tutorial (https://satijalab.org/seurat/v4.0/pbmc3k_tutorial.html).
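The cell quality-control thresholds described above (mitochondrial gene ratio of at most 10%, 500 to 4,500 detected genes, 800 to 16,000 UMIs) can be sketched as a plain filter. The original analysis was run in Seurat (R); the Python below is only an illustration of the logic, with hypothetical function and field names, and it assumes the range bounds are inclusive, which the text does not specify.

```python
# Illustrative re-statement of the QC thresholds; not the authors' Seurat code.
MAX_MITO_FRACTION = 0.10          # cells above 10% mitochondrial reads are removed
GENE_RANGE = (500, 4500)          # detected genes per cell (bounds assumed inclusive)
UMI_RANGE = (800, 16000)          # UMIs per cell (bounds assumed inclusive)


def passes_qc(n_genes, n_umis, mito_fraction):
    """Return True if a cell satisfies all three QC criteria."""
    return (
        mito_fraction <= MAX_MITO_FRACTION
        and GENE_RANGE[0] <= n_genes <= GENE_RANGE[1]
        and UMI_RANGE[0] <= n_umis <= UMI_RANGE[1]
    )


def filter_cells(cells):
    """Keep only the cell records (dicts) that pass QC."""
    return [
        c for c in cells
        if passes_qc(c["n_genes"], c["n_umis"], c["mito_fraction"])
    ]


if __name__ == "__main__":
    # Hypothetical cells, for illustration only.
    cells = [
        {"barcode": "cell_1", "n_genes": 2000, "n_umis": 5000, "mito_fraction": 0.05},
        {"barcode": "cell_2", "n_genes": 2000, "n_umis": 5000, "mito_fraction": 0.15},
        {"barcode": "cell_3", "n_genes": 300, "n_umis": 900, "mito_fraction": 0.02},
    ]
    print([c["barcode"] for c in filter_cells(cells)])
```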

Cell type annotation
Using UMAP, all cells underwent dimensionality reduction and were clustered in a two-dimensional space based on shared features. First, the Azimuth algorithm was used to map the data to the PBMC reference cell set, and the result was then combined with specific highly expressed genes to determine cell types manually. Specifically, classic biomarkers for specific cell types were used to identify the cells in different clusters. The FindAllMarkers function in Seurat was used to identify the 50 most highly expressed genes in each cluster, providing a comprehensive understanding of cell types based on the top genes and the literature. During the first or second round of clustering, clusters expressing two or more classic markers and marginalized cells were considered doublets and excluded from subsequent analysis.

Cell difference abundance analysis
The Milo algorithm was used to divide the cells of the control group and the BA.2-BTI-6m group into different neighborhoods and to calculate their spatial distribution differences, mapping them to different cell types. The key parameters for executing the Milo algorithm were k=10 and d=30. In addition, the proportion of each cell type in each sample was calculated as a conventional cell percentage, and differences between groups were assessed using the rank sum test.

Differential gene identification and functional analysis
The FindMarkers() function in the Seurat package was used to identify differentially expressed genes (DEGs) between distinct cell groups, using thresholds of |logFC| > 0.25 and FDR < 0.01. DEGs only include genes expressed in at least 25% of cells of the control group or infection group. The ClusterProfiler R package was used for Gene Ontology and KEGG enrichment analyses and visualization of DEGs.

Gene set activity score of individual cells
The AddModuleScore() function of Seurat was used to calculate the activity scores of different gene sets in single cells. The gene sets were sourced from the msigdb R package (Antigen processing and presentation (hsa04520), JAK_STAT_signaling (hsa04630), B cell activation (GO:0042113), B cell receptor signaling (GO:0050853), positive regulation of Treg activity (GO:0045591), response to interferon (GO:0034341), protein processing (GO:0016485) and coagulation regulation (GO:0007597, GO:0050819, GO:0050820, GO:0050818)). T cell toxicity activity was defined by the following gene set: PRF1, IFNG, GNLY, NKG7, GZMB, GZMA, GZMH, KLRK1, KLRB1, KLRD1, CTSW, and CST7. The tissue-specific gene sets based on proteomics come from the work of Gutmann et al. and Li et al.

TCR/BCR analysis
Using human GRCh38 as the reference genome, the Cell Ranger vdj pipeline was used to identify TCR/BCR clonotypes and quantify VDJ gene expression. For TCR analysis, we only retained cells with at least one productive TCR α-chain (TRA) or TCR β-chain (TRB) for subsequent analysis. Where a cell had two or more paired TRA or TRB chains, we only retained the one with the highest basal expression. Clonotypes were defined based on their unique CDR3 amino acid sequence, and each unique TRA/TRB/TRA-TRB pair was defined as a clonotype. For BCR analysis, we retained only cells with at least one productive heavy chain (IGH) and IGK/IGL for subsequent analysis. When a cell had two or more paired IGH or IGK/IGL chains, only the one with the highest basal expression was retained. Each unique IGH-IGK/IGL pair was defined as a clonotype. The scRepertoire R package (https://github.com/ncborcherding/scRepertoire) was used to analyze the single-cell immune repertoire and calculate the clonal diversity of the samples based on the Shannon index. Based on the cell barcode information, clonotypes with TCR or BCR were mapped onto the cell UMAP.

Each sample from the same group was treated as a biological replicate.
We matched sex and age for single-cell sequencing and proteomics analysis.
The main parameters of the software for proteomics data (Data Independent Acquisition (DIA) technique) were set as follows: the precursor Q-value cutoff and protein Q-value cutoff were set to 0.01, and the Normalization Strategy was set to Local Normalization.
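The TCR clonotype rules described in the TCR/BCR analysis above (keep productive TRA/TRB chains, keep the highest-expressed chain when a cell has several, and define a clonotype by its CDR3 amino acid sequences) can be sketched as follows. This is a hedged illustration and not the authors' code or the scRepertoire implementation: the record fields (`barcode`, `chain`, `cdr3`, `productive`, `umis`) are assumed names for this example, and "highest basal expression" is interpreted here as the highest UMI count, which is an assumption.

```python
# Illustrative sketch of per-cell TCR chain selection and clonotype labeling.
# Field names and the use of UMI count as "expression" are assumptions.
from collections import defaultdict


def assign_tcr_clonotypes(contigs):
    """Map each cell barcode to a clonotype label built from CDR3 sequences."""
    best = defaultdict(dict)  # barcode -> chain -> best contig seen so far
    for contig in contigs:
        if not contig["productive"] or contig["chain"] not in ("TRA", "TRB"):
            continue  # drop non-productive and non-TCR contigs
        current = best[contig["barcode"]].get(contig["chain"])
        if current is None or contig["umis"] > current["umis"]:
            best[contig["barcode"]][contig["chain"]] = contig

    clonotypes = {}
    for barcode, chains in best.items():
        # A cell may carry TRA only, TRB only, or a TRA-TRB pair.
        parts = [
            f"{chain}:{chains[chain]['cdr3']}"
            for chain in ("TRA", "TRB")
            if chain in chains
        ]
        clonotypes[barcode] = ";".join(parts)
    return clonotypes


if __name__ == "__main__":
    # Hypothetical contigs with made-up CDR3 sequences, for illustration only.
    contigs = [
        {"barcode": "c1", "chain": "TRA", "cdr3": "CAVSDF", "productive": True, "umis": 5},
        {"barcode": "c1", "chain": "TRA", "cdr3": "CAASGG", "productive": True, "umis": 9},
        {"barcode": "c1", "chain": "TRB", "cdr3": "CASSLG", "productive": True, "umis": 7},
        {"barcode": "c2", "chain": "TRB", "cdr3": "CASSPT", "productive": False, "umis": 8},
    ]
    print(assign_tcr_clonotypes(contigs))
```

Cells without any productive TRA or TRB contig do not appear in the result, which mirrors the retention rule in the text. The same pattern would apply to BCR data with IGH and IGK/IGL in place of TRA and TRB, except that the text requires both a heavy and a light chain for a BCR cell to be retained.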
Randomization is not applicable in this study, as the patients were recruited retrospectively based on the clinical diagnosis and treatment guideline.
For all the experiments, the investigators were blinded to group allocation as well as data analysis. For the statistical analysis, no blinding was undertaken, in order to explore in depth the information contained in the datasets.
Vero cells were purchased from ATCC (CCL81) and used for the pseudotyped virus neutralization assay.
After purchasing the cell line from ATCC, we did not perform authentication, as cell lines in our lab are managed by a professional who is responsible for quality control of all cell lines. The Vero cells were not tested for Mycoplasma contamination because it does not affect the result of the pseudotyped virus neutralization assay.
No misidentified cell line was used in this study.
materials, systems and methods
We require information from authors about some types of materials, experimental systems and methods used in many studies. Here, indicate whether each material, system or method listed is relevant to your study. If you are not sure if a list item applies to your research, read the appropriate section before selecting a response.