Introduction

An estimated 65% of the US population had at least two SARS-CoV-2 infections by November 2022, but the impact of prior infection on disease course in subsequent infections has been debated1. Some evidence indicates SARS-CoV-2 infection provides a temporary reduction in re-infection risk2 and a durable reduction in the risk of COVID-19-related hospitalization and death3, while a handful of studies suggest that an initial SARS-CoV-2 infection may limit recovery from COVID-19 in later SARS-CoV-2 infections4. These contrasting findings may result from biases that can arise in population-level studies when differences in exposure history, vaccination status, and comorbidities are not fully accounted for. Controlling for such factors is a major challenge given geographically and temporally heterogeneous interventions, whereas examining the dynamics of SARS-CoV-2 infections at the individual level can facilitate adjusting for these biases.

Reverse transcription quantitative PCR (RT-qPCR) performed on clinical samples collected at multiple time points during an infection offers an objective, quantitative metric of SARS-CoV-2 kinetics and can shed light on key aspects of immune response and clinical progress. Such data have been used to specify how vaccination history, antibody titer and viral lineage together shape SARS-CoV-2 proliferation and clearance during an acute infection5, which in turn can inform the clinical management of COVID-196 and help interpret epidemiological trends7. Viral kinetics therefore offer a promising metric for clarifying the impact of an initial SARS-CoV-2 infection on subsequent infections and for translating those findings into medical and public health guidance.

The impacts of vaccination and variant on SARS-CoV-2 viral kinetics have been well described elsewhere8,9,10,11,12. Infections with Delta lineages feature a higher peak viral concentration than Alpha or Omicron infections, and vaccination speeds up the clearance of SARS-CoV-2 across lineages11. However, the impact of SARS-CoV-2 infection-conferred immunity on peak viral concentration, viral proliferation, and viral clearance in subsequent infections is less well characterized. Furthermore, it has been unclear to what extent attributes of SARS-CoV-2 kinetics, such as peak viral load or clearance rate, persist across an individual’s successive infections.

Here, we collected and analyzed 94,812 SARS-CoV-2 RT-qPCR viral concentration measurements taken from longitudinal clinical samples in players, staff, and affiliates of the National Basketball Association (NBA) between March 11, 2020, and July 28, 2022. For the subset of individuals who were infected twice during the study period (n = 71), we measured changes in viral kinetics between first and second infections and determined the extent to which viral kinetic features persisted across infections.

Results

Summary of recorded infections

During the data collection period, 3346 infections were identified among 3021 individuals. These infections reflected the timing, intensity, and lineage composition of SARS-CoV-2 transmission in the broader United States. Of these infections, we identified 1989 “well-documented” infections that were sufficiently sampled to infer viral kinetics (Supplementary Table 1 and Supplementary Table 2), defined as having at least one RT-qPCR test with a cycle threshold (Ct) value under 32 and three tests with Ct values under 40 (ref. 11). One individual had four total infections, and we omitted their third and fourth from the analysis. In total, there were 71 individuals who had two well-documented infections (Fig. 1 and Table 1). These 71 individuals were the primary focus of our analysis. We used a piecewise linear model, described previously11, to estimate the mean viral proliferation time (time from first PCR detectability to peak viral load), clearance time (time from peak viral load to the end of PCR detectability), and peak viral load (maximum viral concentration) in the well-documented infections. We adjusted these estimates by the age and vaccination status of the infected individual and by the viral variant category (Alpha, Delta, BA.1/BA.2, BA.4/BA.5, and other/unspecified). These adjustments were informed by the full set of 1989 well-documented infections (“Methods”).
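The sampling criterion for a "well-documented" infection can be sketched as a short check on one infection's Ct series. This is a minimal illustration, not the authors' code: the function name and arguments are hypothetical, and it assumes the test with Ct under 32 may count as one of the three tests with Ct under 40.

```python
from typing import Sequence


def is_well_documented(ct_values: Sequence[float],
                       strong_ct: float = 32.0,
                       detect_ct: float = 40.0,
                       min_detectable: int = 3) -> bool:
    """Return True if a Ct series meets the sampling criterion.

    Assumption (not stated in the text): the test with Ct < 32 may also
    count towards the three tests with Ct < 40.
    """
    n_strong = sum(ct < strong_ct for ct in ct_values)   # tests with Ct < 32
    n_detect = sum(ct < detect_ct for ct in ct_values)   # tests with Ct < 40
    return n_strong >= 1 and n_detect >= min_detectable


# Hypothetical Ct series for two infections.
print(is_well_documented([38.0, 31.0, 35.5]))  # one strong test, three detectable
print(is_well_documented([38.0, 36.0, 35.0]))  # no test with Ct < 32
```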

Fig. 1: Onset times of repeat and overall infections in the dataset.

A Histogram of first positive test dates for all recorded infections in the full dataset (n = 3346). Colors in both panels correspond to the SARS-CoV-2 variant category (Other/Unspecified: Black; Alpha: Blue; Delta: Red; BA.1/BA.2: Magenta; BA.4/BA.5: Green), where Other/Unspecified includes all lineages other than Alpha, Delta, and Omicron, as well as any samples that could not be sequenced. B Date of the first positive test (points) for well-documented infections in individuals with two well-documented infections (n = 71). Horizontal lines connect the points that correspond to infections that belong to the same person.

Table 1 Infection characteristics for the 71 individuals with two well-documented infections

Second infections are cleared more quickly than first infections

For the 71 individuals with two well-documented infections, the mean adjusted clearance time of the first infection was 9.2 days (95% credible interval: 8.1, 10.3) vs. 6.3 days (5.3, 7.4) for the second infection (Fig. 2A, B). There was no significant difference in proliferation time or peak viral load between first and second infections (Supplementary Table 3).

Fig. 2: Viral kinetics of first vs. second infections.

A, B Mean posterior viral trajectory (solid lines) with 95% credible interval (shaded region) for well-documented first infections (A) and second infections (B) in the 71 individuals with two well-documented infections. C, D Mean posterior viral trajectory (solid lines) with 95% credible interval (shaded region) for all well-documented first infections (n = 1796, C) and second infections (n = 193, D) in the dataset. In all panels, gray points depict the measured viral concentration for a single test. For each person, the points were shifted horizontally so that the individual’s mean posterior peak viral concentration sits at day 0. Black points and whiskers (A, C) depict the mean and 95% credible interval for the proliferation time, peak viral concentration, and clearance time, from left to right, for first infections. These values are repeated in gray on the lower plots (B, D) to facilitate comparison with the viral kinetics of second infections.

The accelerated clearance time for second vs. first infections also held more generally. Across all first infections (n = 1796), the mean adjusted clearance time was 9.3 days (8.5, 10.2), while across all second infections (regardless of whether the first infection was well-documented in our dataset; n = 193), the mean adjusted clearance time was 6.6 days (5.8, 7.3) (Fig. 2C, D and Supplementary Table 4).

For the 71 individuals with two well-documented infections, we did not detect significant differences in viral kinetics of the second infection according to vaccination status (Supplementary Table 5). Again, this held more generally: across all well-documented second infections (n = 193), we did not detect significant differences in viral kinetics according to vaccination status (Supplementary Table 6).

No evidence that the kinetics of a second infection differ according to the first infection’s lineage

For the 71 individuals with two well-documented infections, we did not detect significant differences in the viral kinetics of the second infection based on the variant category of a first infection (Supplementary Table 7). This finding also held more generally: for all individuals with a well-documented second infection (including those with and without a well-documented first infection; n = 193), the clearance time was similar regardless of the variant category of the first infection (Supplementary Table 8).

An individual’s relative clearance speed is roughly preserved across infections

For the 71 individuals with two well-documented infections, adjusted clearance times in first and second infections were correlated (Pearson correlation coefficient: 0.26 (0.09, 0.43); Spearman correlation coefficient: 0.30 (0.12, 0.46)). In contrast, we found no evidence of correlation between peak viral loads or proliferation times in first vs. second infections (Fig. 3 and Supplementary Table 9).

Fig. 3: Association between first- and second-infection viral kinetics.

Scatterplots of the adjusted, model-estimated A peak viral load, B proliferation time, and C clearance time for second infections (vertical axis) vs. first infections (horizontal axis) in the 71 individuals with two well-documented infections. Each point depicts a single posterior draw for the corresponding viral kinetic parameter belonging to a single person’s first and second recorded infection, across a total of 200 draws. Dashed lines depict the least squares linear regression to these posterior values (points), such that a positive slope indicates a positive correlation in the viral kinetic parameter between first and second infections. Posterior estimates and 95% credible intervals for the slope are listed at the top of each plot, where “n.s.” denotes “not significant,” corresponding to a 95% credible interval that crosses 0. Contours aid in visualizing the density of the posterior draws.

Discussion

In individuals with multiple infections, second infections were cleared more quickly than first infections. Furthermore, one’s relative speed of clearing infection roughly persisted across infections. Those with a relatively fast clearance speed in their first infection tended to have a relatively fast clearance speed in their second infection, and vice versa. Thus, while prior infection and vaccination can modulate a person’s viral kinetics in absolute terms, there may also exist some further immunological mechanism, conserved across sequential infections, that determines one’s strength of immune response against SARS-CoV-2 relative to others in the population.

The mechanism underlying this persistence in clearance speed rank across subsequent infections is unclear. Some possibilities include the recency of exposure to a different related coronavirus (e.g., HKU1 or OC43)13, immune imprinting from early-lifetime exposure to certain coronavirus lineages14, or an inherent, genome-mediated aspect of immune response. It is also unclear whether one’s relative ability to clear SARS-CoV-2 infection generalizes to other coronaviruses or to other pathogens. Serological studies and genome-wide association studies may help to illuminate the mechanisms behind persistence in SARS-CoV-2 clearance time. Such studies would be valuable for improving our basic understanding of immune response to respiratory pathogens and for developing personalized clinical respiratory disease management protocols.

A consistent finding between this and other studies on SARS-CoV-2 viral kinetics is that prior antigenic exposure, through infection or vaccination, tends to speed up viral clearance, and thus to reduce the duration of test positivity5,11,15. The duration of viral positivity has various consequences both for clinical management and for public health surveillance. For clinical management, test results should be interpreted in the context of a patient’s immune history, which can modulate both the extent and expected duration of viral shedding5,16. It may also be possible to adjust the recommended duration of post-infection isolation based on infection history. When estimating epidemic growth rates using cross-sectional RT-qPCR test results, it is critical to account for immune-mediated shifts in the asymmetry between viral clearance and viral proliferation times, since this asymmetry is a key component in determining whether an epidemic is growing or shrinking7. We find that the difference between viral proliferation and viral clearance times decreases in second infections due to the shortened clearance time, which may reduce certainty in epidemic growth rates derived from cross-sectional RT-qPCR-based methods.

This study is limited by various factors. The cohort is predominantly young, male, and healthy. While we adjusted for age, comorbidities and other underlying health factors were not measured. We were also unable to assess the relationship between measured viral concentrations and infectious virus. This study focuses primarily on individuals who were ultimately infected twice, and these individuals may differ in important immunological and behavioral ways from those who only underwent one infection during the study period. This underscores the need for further studies that capture viral and serological kinetics in tandem. Furthermore, only a small subset of individuals (71 of the more than 3000 who underwent testing in this cohort) had two well-documented infections during the study period. Because of this, the statistical power of our analysis is limited, which might explain why we found no difference in the viral kinetics of second infections based on the lineage of the first infection. Larger studies are needed to verify whether such a link exists.

In conclusion, immunity from a first SARS-CoV-2 infection affects the viral kinetics of a second SARS-CoV-2 infection principally by speeding up viral clearance and thus shortening the overall time of acute infection. We found no evidence that the kinetics of a second BA.1/BA.2 infection differ according to the lineage of the first infection. Individuals who quickly cleared their first infection also generally tended to quickly clear their second infection, despite a high degree of variation in individual clearance times, pointing towards persistence of underlying immune response across multiple infections. These findings help guide the interpretation of quantitative SARS-CoV-2 tests both clinically and for surveillance and point towards persistent individual-level immune mechanisms against SARS-CoV-2 that so far remain unexplained.

Methods

Study design

Between March 11, 2020, and July 28, 2022, the NBA conducted regular surveillance for SARS-CoV-2 infection among players, staff, and affiliates as part of an occupational health program. This included frequent viral testing (often daily during high community COVID-19 prevalence) using a variety of platforms, but primarily via nucleic acid amplification tests, as well as clinical assessment including case diagnosis and symptom tracking. To assess viral concentration, RT-qPCR tests were conducted when possible, using anterior nares and oropharyngeal swabs collected by a trained professional and combined into a single viral transport medium. Cycle threshold (Ct) values were obtained from the Roche cobas target 1 assay. Ct values were converted to genome equivalents per milliliter using a standard curve16. Positive controls were run on every plate, and the primer and probe sequences used in the assay were routinely monitored for mutations that would reduce assay sensitivity. Data on participant age and vaccination status were collected where possible. Viral lineages were assigned using whole-genome sequencing, when feasible. This resulted in a longitudinal dataset of 424,401 SARS-CoV-2 tests with clinical COVID-19 history and demographic information for 3021 individuals.

Vaccination and booster status was assigned at the time of the first positive test for each infection. Full vaccination corresponded to 14 days following either the second dose of a Pfizer or Moderna vaccine or the first dose of a Johnson and Johnson/Janssen vaccine. A person was considered “boosted” 14 days after an additional Pfizer or Moderna dose following their initial vaccination course.
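The status assignment described above can be sketched as a date comparison at the time of the first positive test. This is a minimal sketch under stated assumptions: the function name, argument names, and returned labels are hypothetical, and day 14 itself is assumed to count as "14 days after" the relevant dose.

```python
from datetime import date, timedelta
from typing import Optional


def vaccination_status(first_positive: date,
                       primary_series_final_dose: Optional[date],
                       booster_dose: Optional[date] = None,
                       lag_days: int = 14) -> str:
    """Classify status at the first positive test of an infection.

    primary_series_final_dose: date of the second Pfizer/Moderna dose or
    of the single Johnson and Johnson/Janssen dose (None if not received).
    Assumption: the status takes effect on day `lag_days` inclusive.
    """
    lag = timedelta(days=lag_days)
    if primary_series_final_dose is None:
        return "not fully vaccinated"
    if first_positive < primary_series_final_dose + lag:
        return "not fully vaccinated"
    if booster_dose is not None and first_positive >= booster_dose + lag:
        return "boosted"
    return "fully vaccinated"


# Hypothetical dates: booster received 19 days before the first positive test.
print(vaccination_status(date(2021, 12, 20), date(2021, 3, 1), date(2021, 12, 1)))
```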

Genome sequencing and lineage assignment

Whole genome sequencing of remnant diagnostic samples was performed to determine viral lineages using an overlapping amplicon-based library preparation strategy (i.e., PrimalSeq). Following previously described methods, RNA was extracted from clinical samples and confirmed as SARS-CoV-2 positive17. Libraries were prepared in accordance with the selected sequencing platform. For samples sequenced on the Oxford Nanopore Technologies MinION platform, following amplicon generation, samples were prepared for multiplex sequencing using the Ligation Sequencing Kit (SQK-LSK114) with Native Barcoding (SQK-NBD114.24). Final libraries were sequenced to a target of 100,000 reads per sample. For samples sequenced on Illumina platforms, libraries were prepared using the amplicon-based Illumina COVIDseq Test v033 with COVID-Seq ARTIC viral amplification primer set (V4, 384 samples, cat#20065135) and sequenced 2×74 on Illumina NextSeq 550 or 2×100 on the Illumina NovaSeq 6000 following Illumina’s documentation. The resulting FASTQ files were processed and analyzed on Illumina BaseSpace Labs using the Illumina DRAGEN COVID Lineage Application18; versions included were 3.5.0, 3.5.1, 3.5.2, 3.5.3, and 3.5.4. The DRAGEN COVID Lineage pipeline was run with default parameters as recommended by Illumina. Lineage assignment and phylogenetics analysis were accomplished using the most recent versions of Pangolin19 and NextClade20, respectively. Sequences are available at BioProject under accession number PRJNA1014408.

Estimating viral kinetic parameters

We characterized the viral kinetics of the well-documented infections by fitting a hierarchical piecewise linear model to the viral concentration measurements on a logarithmic scale (as measured by the PCR cycle threshold, or Ct), following previous methods5. The model captures the viral proliferation time (i.e., time from first possible detection to peak), peak viral concentration, and viral clearance time (i.e., time from peak to last possible detection) of acute SARS-CoV-2 infections. Using this approach, the viral kinetics of an infection can be described by three “hinge” points: (1) the theoretical time of first PCR positivity, to; (2) the peak viral load δ, which occurs at time tp; and (3) the theoretical time of last PCR positivity, tr. According to this model, the expected viral load as measured by Ct units beyond the limit of detection, E[y], sits at the limit of detection prior to time to, then increases linearly to δ at time tp, then decreases linearly back to the limit of detection at time tr (Supplementary Figure 1). From these values, we can derive the proliferation time ωp and clearance time ωr: ωp = tp − to and ωr = tr − tp.
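The piecewise linear trajectory described above can be sketched as a small function returning the expected viral load E[y], in Ct units above the limit of detection, at a given time. This is an illustrative sketch rather than the fitted Stan model; the function name and the example parameter values are hypothetical.

```python
def expected_viral_load(t: float, t_p: float, delta: float,
                        omega_p: float, omega_r: float) -> float:
    """Expected viral load E[y] (Ct units above the limit of detection).

    t_p: time of peak; delta: peak viral load;
    omega_p: proliferation time; omega_r: clearance time.
    """
    t_o = t_p - omega_p   # theoretical time of first PCR positivity
    t_r = t_p + omega_r   # theoretical time of last PCR positivity
    if t <= t_o or t >= t_r:
        return 0.0                               # at the limit of detection
    if t <= t_p:
        return delta * (t - t_o) / omega_p       # linear rise to the peak
    return delta * (t_r - t) / omega_r           # linear decline from the peak


# Hypothetical infection: peak of 20 Ct units at day 0, 3-day rise, 9-day decline.
for day in (-3.0, -1.5, 0.0, 4.5, 9.0):
    print(day, expected_viral_load(day, t_p=0.0, delta=20.0, omega_p=3.0, omega_r=9.0))
```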

We characterized an individual i’s proliferation time ωp[i], clearance time ωr[i], and peak viral load δ[i] using the following formulae:

$${\omega }_{p[i]}={{{{\mathrm{Exp}}}}}\left[{\beta }_{p}+{\beta }_{p[c]}+\mathop{\sum}\limits_{a}{\beta }_{p[a]}+{\tau }_{p}{\eta }_{p[i]}\right]{\omega }_{p}^{\ast }$$
(1)
$${\omega }_{r[i]}={{{{\mathrm{Exp}}}}}\left[{\beta }_{r}+{\beta }_{r[c]}+\mathop{\sum}\limits_{a}{\beta }_{r[a]}+{\tau }_{r}{\eta }_{r[i]}\right]{\omega }_{r}^{\ast }$$
(2)
$${\delta }_{[i]}={{{{\mathrm{Exp}}}}}\left[{\beta }_{\delta }+{\beta }_{\delta [c]}+\mathop{\sum}\limits_{a}{\beta }_{\delta [a]}+{\tau }_{\delta }{\eta }_{\delta [i]}\right]{\delta }^{\ast }$$
(3)

Re-arranging yields the following equations:

$$\log \left({\omega }_{p[i]}/{\omega }_{p}^{\ast }\right)={\beta }_{p}+{\beta }_{p[c]}+\mathop{\sum}\limits_{a}{\beta }_{p[a]}+{\tau }_{p}{\eta }_{p[i]}$$
(4)
$$\log \left({\omega }_{r[i]}/{\omega }_{r}^{\ast }\right)={\beta }_{r}+{\beta }_{r[c]}+\mathop{\sum}\limits_{a}{\beta }_{r[a]}+{\tau }_{r}{\eta }_{r[i]}$$
(5)
$$\log \left({\delta }_{[i]}/{\delta }^{\ast }\right)={\beta }_{\delta }+{\beta }_{\delta [c]}+\mathop{\sum}\limits_{a}{\beta }_{\delta [a]}+{\tau }_{\delta }{\eta }_{\delta [i]}$$
(6)

The left-hand sides of these equations are the logged multiplicative factors between the individual-level parameter value (indexed with subscript [i]) and a fixed, baseline value for these parameters (marked with *); for example, a proliferation time ωp[i] of 6 days relative to a baseline value ωp* of 3 days would yield a logged multiplicative factor of log(6/3) ≈ 0.7. The choice of baseline value is arbitrary and is included here to improve the robustness of the MCMC algorithm by setting the parameters on a similar scale.

The remaining coefficients (β, τ, η) are estimated from the data. The summed β coefficients constitute the population mean for the associated parameter. Thus, the unadjusted population mean proliferation time, clearance time, and peak viral load are represented by βp, βr, and βδ, respectively. These unadjusted means are adjusted according to the cardinality of infection (first or second, represented by β values with subscript [c]) and the age group, variant category, and vaccination status of the individual (represented by β values with subscript [a]). Together, these β values constitute the upper level of the hierarchical model.

The individual-level effects are obtained by multiplying τ, the standard deviation of the population distribution, and η, which is a standard normal random variable drawn independently for each person i. This follows the non-centered model parameterization for hierarchical models advocated by Gelman et al.21.
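To make Eqs. 1–6 concrete, the following sketch evaluates one individual-level parameter from its coefficients using the non-centered form, in which the individual effect is the product τη. The function name and all numerical values are hypothetical and chosen only for illustration; they are not estimates from this study.

```python
import math
from typing import Sequence


def individual_parameter(baseline: float,
                         beta_mean: float,
                         beta_cardinality: float,
                         beta_adjustments: Sequence[float],
                         tau: float,
                         eta: float) -> float:
    """Evaluate Exp[beta + beta_[c] + sum_a beta_[a] + tau * eta] * baseline."""
    log_factor = beta_mean + beta_cardinality + sum(beta_adjustments) + tau * eta
    return math.exp(log_factor) * baseline


# With every coefficient at zero, the parameter equals its baseline value.
print(individual_parameter(3.0, 0.0, 0.0, [0.0, 0.0, 0.0], 1.0, 0.0))

# A total log factor of log(6/3) doubles a 3-day baseline to 6 days.
print(individual_parameter(3.0, math.log(2.0), 0.0, [], 0.5, 0.0))
```

Writing the individual effect as τη, with η standard normal, gives the same distribution as drawing the effect directly from Normal(0, τ) but tends to be easier for the sampler to explore.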

Each β coefficient was assigned a Normal(0,1) prior distribution. This prior was chosen because, after exponentiating and multiplying by the fixed baseline values (in Eqs. 1–3), the middle 98% of the prior distribution corresponds to a range of roughly one-tenth to ten times the baseline value. For example, for a fixed baseline proliferation time of ωp* = 3 days, the middle 98% of the prior unadjusted population mean distribution (corresponding to Exp[βp] × ωp*) would cover a range of roughly 0.3 days to 30 days. Thus, we considered these to be broad, minimally informative priors. The qualitative findings from the main text were unchanged when using narrower priors of Normal(0, 0.25).
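The stated prior range can be checked numerically from the quantiles of a normal distribution. The sketch below uses a hypothetical helper, `prior_range`, and assumes that the second argument of Normal(·,·) is a standard deviation, as in Stan.

```python
import math
from statistics import NormalDist
from typing import Tuple


def prior_range(baseline: float, prior_sd: float = 1.0,
                mass: float = 0.98) -> Tuple[float, float]:
    """Central `mass` interval of Exp[beta] * baseline for beta ~ Normal(0, prior_sd)."""
    z = NormalDist(0.0, prior_sd).inv_cdf(0.5 + mass / 2.0)   # upper quantile of beta
    return baseline * math.exp(-z), baseline * math.exp(z)


low, high = prior_range(3.0)
print(round(low, 2), round(high, 1))   # roughly one-tenth to ten times 3 days
```

Under the same assumption, `prior_range(3.0, prior_sd=0.25)` gives the corresponding interval for the narrower sensitivity-analysis prior.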

Similarly, we specified a Normal(0,1) distribution, truncated to be non-negative, as the priors for the τ coefficients. With this choice, the individual-level draws could have a standard deviation up to 10 times larger than the mean, population-level distribution (i.e., the distribution of the summed β values), following the same logic as before.

As in prior work5, we characterized the likelihood of observing a given ΔCt(t) using the following mixture model:

$$\begin{aligned}L\left({y}_{[it]}\,|\,{\delta }_{[i]},{t}_{p[i]},{\omega }_{p[i]},{\omega }_{r[i]}\right)=\;&(1-\lambda )\,[\,{f}_{N}(x\,|\,E[{y}_{[it]}\,|\,{\delta }_{[i]},{t}_{p[i]},{\omega }_{p[i]},{\omega }_{r[i]}],\sigma )\\ &+{I}_{lod}\,{F}_{N}(0\,|\,E[{y}_{[it]}\,|\,{\delta }_{[i]},{t}_{p[i]},{\omega }_{p[i]},{\omega }_{r[i]}],\sigma )\,]\\ &+\lambda \,{f}_{Exp}(x\,|\,\kappa )\end{aligned}$$
(7)

The left-hand side of the equation denotes the likelihood (L) of observing a given viral load for person i at time t, y[it], as measured by Ct deviation from the limit of detection, given the model parameters δ (peak viral load), tp (time of peak viral load), ωp (proliferation time), and ωr (clearance time) for individual i and time t. Recall that E[y[it] | δ[i], tp[i], ωp[i], ωr[i]] is the expected viral load for person i at time t as specified by the viral kinetic model given the parameters. Here, σ denotes the observation noise, i.e., the variation in observed vs. expected (model-derived) viral load for a person at a given time point. This noise is also estimated from the data, using a prior distribution of σ ~ Normal(0,1), truncated to nonnegative values. This roughly covers a range of +/– 2.5 Ct for the measurement error, which falls within the range of measurement error based on repeated viral load measurements from previous studies in the same cohort11.

The likelihood captures two distinct processes: the viral kinetic process, denoted by the bracketed term preceded by a (1−λ); and false negatives, denoted by the term preceded by a λ. In the bracketed term representing the modeled viral kinetic process, fN(x | E[y], σ(t)) represents the Normal PDF evaluated at x with mean E[y] (generated by the model equations above) and observation noise σ(t). FN(0 | E[y], σ(t)) is the Normal CDF evaluated at 0 with the same mean and standard deviation. This represents the scenario where the true viral load goes below the limit of detection, so that the observation sits at the limit of detection. Ilod is an indicator function that is 1 if y = 0 and 0 otherwise; this way, the FN term acts as a point mass concentrated at y = 0. Last, fExp(x | κ) is the Exponential PDF evaluated at x with rate κ. We set κ = log (10) so that 90% of the mass of the distribution sat below 1 Ct unit and 99% of the distribution sat below 2 Ct units, ensuring that the distribution captures values distributed at or near the limit of detection. We did not estimate values for λ or the exponential rate because they were not of interest in this study; we simply needed to include them to account for some small probability mass that persisted near the limit of detection to allow for the possibility of false negatives.
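The mixture likelihood in Eq. 7 and the choice κ = log(10) can be sketched in a few lines. This is an illustration, not the Stan implementation: the function names are hypothetical, and because the text does not report the fixed value of λ, it is left as a required argument and the value used in the example is arbitrary.

```python
import math
from statistics import NormalDist

KAPPA = math.log(10)   # exponential rate stated in the text


def exp_pdf(x: float, rate: float) -> float:
    """Exponential density at x."""
    return rate * math.exp(-rate * x) if x >= 0 else 0.0


def exp_cdf(x: float, rate: float) -> float:
    """Exponential cumulative distribution at x."""
    return 1.0 - math.exp(-rate * x) if x >= 0 else 0.0


def mixture_likelihood(y: float, expected: float, sigma: float,
                       lam: float, kappa: float = KAPPA) -> float:
    """Likelihood of an observed viral load y given the model's E[y].

    lam is the weight on the near-limit-of-detection component; its fixed
    value is not given in the text, so the caller must supply it.
    """
    normal = NormalDist(mu=expected, sigma=sigma)
    kinetic = normal.pdf(y)
    if y == 0:                      # I_lod: observation at the limit of detection
        kinetic += normal.cdf(0.0)
    return (1.0 - lam) * kinetic + lam * exp_pdf(y, kappa)


# Mass of the exponential component below 1 and 2 Ct units.
print(exp_cdf(1.0, KAPPA), exp_cdf(2.0, KAPPA))
# Observation at the limit of detection while the model expects a high viral load.
print(mixture_likelihood(0.0, expected=15.0, sigma=1.0, lam=0.05))
```

With λ = 0 the expression reduces to the kinetic component alone, which is a convenient check when adapting the sketch.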

Model parameters were fit using a Hamiltonian Monte Carlo algorithm implemented in R (version 4.1.2) and Stan (version 2.21.3). Four chains were run for 2000 iterations each, and the first half of each chain was discarded as burn-in, yielding 4000 total posterior draws. Convergence was assessed using a Gelman–Rubin statistic of <1.1 for all parameters and the absence of divergent transitions. Code for the full analysis is available at https://github.com/skissler/Ct_SequentialInfections.
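For readers unfamiliar with the Gelman–Rubin statistic, the following sketch computes the classic form for one parameter from several chains of equal length. It is a simplified illustration with a hypothetical function name and toy chains; Stan reports its own variant of this diagnostic, which may differ in detail from the formula below.

```python
import math
from statistics import mean, variance
from typing import Sequence


def gelman_rubin(chains: Sequence[Sequence[float]]) -> float:
    """Classic potential scale reduction factor for equal-length chains."""
    n = len(chains[0])                                      # draws per chain
    within = mean([variance(chain) for chain in chains])    # mean within-chain variance
    between = n * variance([mean(chain) for chain in chains])
    pooled = (n - 1) / n * within + between / n             # pooled variance estimate
    return math.sqrt(pooled / within)


# Draw accounting from the text: 4 chains, 2000 iterations each, first half discarded.
print(4 * (2000 // 2))

# Toy chains: agreeing chains give a value below 1.1; shifted chains do not.
print(gelman_rubin([[0, 1, 2, 3], [0, 1, 2, 3]]))
print(gelman_rubin([[0, 1, 2, 3], [10, 11, 12, 13]]))
```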

Statistical approach

We assessed differences in viral kinetic parameters across category subsets (e.g., for first vs. second infections) by subtracting the relevant posterior draws and measuring the posterior probability mass of these differences that sat above/below 0, depending on the scenario. When fewer than 5% of these differenced posterior draws sat above/below zero, we took this as evidence of a significant difference.
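The comparison of posterior draws described above can be sketched as follows. The function names and the toy draws are hypothetical; the sketch shows the one-sided case of testing whether one quantity is lower than another, with the 5% threshold taken from the text.

```python
from typing import Sequence


def fraction_above_zero(draws_a: Sequence[float], draws_b: Sequence[float]) -> float:
    """Share of paired posterior differences (a - b) that sit above zero."""
    diffs = [a - b for a, b in zip(draws_a, draws_b)]
    return sum(d > 0 for d in diffs) / len(diffs)


def is_significantly_lower(draws_a: Sequence[float], draws_b: Sequence[float],
                           threshold: float = 0.05) -> bool:
    """True when fewer than `threshold` of the differenced draws sit above zero."""
    return fraction_above_zero(draws_a, draws_b) < threshold


# Toy draws of clearance time (days); not values from the study.
second = [6.0, 6.5, 7.0]
first = [9.0, 9.5, 8.8]
print(is_significantly_lower(second, first))
```

In practice each list would hold one value per posterior draw, paired by draw index so that the subtraction respects the joint posterior.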

To assess relative persistence in individual-level viral kinetic attributes across infections, we measured both the Pearson (raw) and Spearman (rank-based) correlations between the adjusted first-infection and second-infection proliferation time, clearance time, and peak viral load at the individual level. To perform the adjustment, we subtracted the model-estimated adjustments for age, variant, and vaccination status, leaving only the effects from infection cardinality and individual variation. We measured the Pearson and Spearman correlation for each of the 4000 draws from the posterior distribution generated by the Hamiltonian Monte Carlo fitting approach. This yielded a mean and 95% credible interval for the Pearson and Spearman correlations between each of the first- and second-infection viral kinetic parameters.
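The per-draw correlation procedure can be sketched with the standard library alone. The helper names are hypothetical, ties receive average ranks in the Spearman calculation, and the credible interval is taken from linearly interpolated 2.5% and 97.5% quantiles of the per-draw correlations; the authors' implementation may differ in these details.

```python
import math
from typing import Callable, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def quantile(sorted_values: Sequence[float], q: float) -> float:
    """Linearly interpolated quantile of pre-sorted values."""
    pos = q * (len(sorted_values) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] * (1 - (pos - lo)) + sorted_values[hi] * (pos - lo)


def correlation_summary(first_draws: Sequence[Sequence[float]],
                        second_draws: Sequence[Sequence[float]],
                        corr: Callable = pearson) -> Tuple[float, float, float]:
    """Mean and 95% interval of the correlation computed within each draw.

    Each element of first_draws/second_draws holds one posterior draw:
    the adjusted parameter value for every person, in the same order.
    """
    per_draw = sorted(corr(f, s) for f, s in zip(first_draws, second_draws))
    return (sum(per_draw) / len(per_draw),
            quantile(per_draw, 0.025), quantile(per_draw, 0.975))
```

Calling `correlation_summary(first, second, corr=spearman)` gives the rank-based version of the same summary.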

Study oversight

This work was approved as “research not involving human subjects” by the Yale Institutional Review Board (HIC protocol # 2000028599), as it involved de-identified samples. This work was also designated as “exempt” by the Harvard Institutional Review Board (IRB20-1407). Informed consent for virological testing and anonymized analysis of the results was obtained from all participants.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.