Temporal omics analysis in Syrian hamsters unravel cellular effector responses to moderate COVID-19

In COVID-19, immune responses are key in determining disease severity. However, cellular mechanisms at the onset of inflammatory lung injury in SARS-CoV-2 infection, particularly involving endothelial cells, remain ill-defined. Using Syrian hamsters as a model for moderate COVID-19, we conduct a detailed longitudinal analysis of systemic and pulmonary cellular responses, and corroborate it with datasets from COVID-19 patients. Monocyte-derived macrophages in lungs exert the earliest and strongest transcriptional response to infection, including induction of pro-inflammatory genes, while epithelial cells show weak alterations. Without evidence for productive infection, endothelial cells react, depending on cell subtypes, by strong and early expression of anti-viral, pro-inflammatory, and T cell recruiting genes. Recruitment of cytotoxic T cells as well as emergence of IgM antibodies precede viral clearance at day 5 post infection. Investigating SARS-CoV-2 infected Syrian hamsters thus identifies cell type-specific effector functions, providing detailed insights into pathomechanisms of COVID-19 and informing therapeutic strategies.

Correspondence of differentially expressed genes in proteomic and bulk RNA sequencing data was explored by linear regression (intersection at 0) of log transformed fold changes of infected vs control (5 dpi, p-value < 0.01). Linear fit (blue line) showed almost ideal consistency between regulations (slope = 0.8) and a correlation coefficient r = 0.9 (r 2 = 0.82). Standard error of the estimate is shown as the light grey stripe surrounding the blue line. (g) Temporal evolution of gene ontology/biological process terms connected with immune system response in lung tissue (left part) and in serum (right part), for the indicated time points compared to samples from uninfected animals. Enriched terms were filtered for terms mentioning "immune", "complement", "cytokine", "interferon", "neutrophil", "macrophage", "T cell" and "B cell" and attained false discovery rate (fdr) below 0.2 at least at one dpi in lung or serum.
Interestingly however, this was not the case for blood samples (Supplementary Note Fig. 1, right), where, apart from day 5, samples did not cluster per time point. As detailed below, we took sex into account as a covariate in the proteomics PCA, which was however not possible for transcriptomics due to the limited number of samples. Fig. 1: Principal component analysis (PCA) of the bulk RNA-sequencing data of lung (left) and blood (right) data using the multi-dimensional scaling function from the edgeR package 6 . Female animals are printed in italics. Sample 5 days post infection (dpi) -n°3 from blood could not be sequenced due to RNA degradation. LogFC: log fold change.

PCA on bulk proteomics data.
Similarly, on the proteome level, the disease effect drives the variance in the dataset.
Serum samples from infected animals (2, 3, and 5 dpi) cluster together and are separate from control samples immediately at 2 dpi. In contrast, the response in lung is delayed, peaks at 5 dpi and largely resolves until 14 dpi (see Supplementary Note   . PCA loading plots are presented. The gender effect (arrows is_f, is_m) is small and independent from disease effect (arrows is_Infec, is_Cont). Dim: dimensional reduction plot.

Assessing viral replication from sequencing data.
In order to investigate, whether the single-cell sequencing data could provide evidence for viral replication in specific cell types, we applied two previously published the distance between first and third quartile. Individual data points are shown by circles (2 dpi) and triangles (3 dpi).
In this data, no significant difference was seen between cell types for viral RNA, but also not between viral RNA and Eef1a1 and Eef2. The presence of negative-strand RNA could therefore also be due to a limited specificity of strandedness due to technical limitations of library preparation and sequencing.
In summary, we conclude that determining productive replication based on single-cell RNA-sequencing data from complex samples is not achievable with confidence in our data. However, accumulating literature indicates that replication happens primarily in epithelial cells, and to a much smaller degree in e.g. macrophages (see also [1] for RNA scope analysis in macrophages). All code is available at the dedicated github page (https://github.com/Berlin-Hamster-Single-Cell-Consortium/Single-cellsequencing-of-COVID-19-pathogenesis-in-golden-Hamsters-). We therefore decided to globally extend all 3'-UTRs by 1000 bp, using the code The elongated 3'-UTRs therefore allowed the assignment of about 210,000 more reads (3.6%) in the bulk RNA-sequencing.

Elongation of 3'-UTRs in the
Since, as visible above, the scRNA-seq sequencing reads are particularly prone to not be counted if the 3'-UTR were too short, we plotted, again for 2 dpi, animal 2, the single-cell RNA-seq gene count over all cells for the original annotation and the one with the elongated 3'-UTRs (Supplementary Note Fig. 7). We found about 5,000 genes to have an increase of 10 % or more counts using the modified annotation, with only a few having less counts.
8 Supplementary Note Fig. 7: Density plot of log(2) transformed gene counts in the scRNAseq data from day 2, animal 2, using the original annotation (horizontal axis) and the annotation with elongated 3'-UTRs (vertical axis).
Overall, this data indicates that until a more precise annotation is available, the global elongation of all 3'-UTRs by 1,000 bp provides a reasonable proxy for a better gene expression profiling.