A longitudinal study of the impact of university student return to campus on the SARS-CoV-2 seroprevalence among the community members

Returning university students represent large-scale, transient demographic shifts and a potential source of transmission to adjacent communities during the COVID-19 pandemic. In this prospective longitudinal cohort study, we tested for IgG antibodies against SARS-CoV-2 in a non-random cohort of residents living in Centre County prior to the Fall 2020 term at the Pennsylvania State University and following the conclusion of the Fall 2020 term. We also report the seroprevalence in a non-random cohort of students collected at the end of the Fall 2020 term. Of 1313 community participants, 42 (3.2%) were positive for SARS-CoV-2 IgG antibodies at their first visit between 07 August and 02 October 2020. Of 684 student participants who returned to campus for fall instruction, 208 (30.4%) were positive for SARS-CoV-2 antibodies between 26 October and 21 December. 96 (7.3%) community participants returned a positive IgG antibody result by 19 February. Only contact with known SARS-CoV-2-positive individuals and attendance at small gatherings (20–50 individuals) were significant predictors of detecting IgG antibodies among returning students (aOR, 95% CI 3.1, 2.07–4.64; 1.52, 1.03–2.24; respectively). Despite high seroprevalence observed within the student population, seroprevalence in a longitudinal cohort of community residents was low and stable from before student arrival for the Fall 2020 term to after student departure. The study implies that heterogeneity in SARS-CoV-2 transmission can occur in geographically coincident populations.


Design, setting, and participants. This human subjects research was conducted with PSU Institutional Review Board approval and in accordance with the Declaration of Helsinki. The study uses a longitudinal cohort design, with two separate cohorts: community residents and returning students. We report on measures from the first two clinic visits for the community resident cohort and the first clinic visit for the returning student cohort.
To assist with recruitment into studies under the Data4Action (D4A) Centre County COVID Cohort Study umbrella, a REDCap survey was distributed to residents of Centre County where respondents could indicate interest in future study participation and provide demographic data. Returning students received a similar survey and were also recruited through cold-emails and word-of-mouth.
Individuals were eligible for participation in the community resident cohort if they were: ≥ 18 years old; residing in Centre County at the time of recruitment (June through September 2020); expecting to reside in Centre County until June 2021; fluent in English; and capable of providing their own consent. PSU students who remained in Centre County through spring and summer university closure were eligible for inclusion in the community resident cohort as they experienced geographic COVID-19 risks similar to those of community residents. Participants were eligible for inclusion in the returning student cohort if they were: ≥ 18 years old; fluent in English; capable of providing their own consent; residing in Centre County at the time of recruitment (October 2020); officially enrolled as PSU UP students for the Fall 2020 term; and intending to live in Centre County through April 2021. In both cohorts, individuals were invited to participate in the survey-only portion of the study if they: were lactating, pregnant, or intending to become pregnant in the next 12 months; were unable to wear a mask for the clinic visit; had demonstrated acute COVID-19 symptoms within the previous 14 days; or reported a health condition that made them uncomfortable with participating in the clinic visit. Informed consent was obtained from all participants.
Upon enrollment, returning students were supplied with a REDCap survey to examine socio-behavioral phenomena, such as attendance at gatherings and adherence to non-pharmaceutical interventions, in addition to information pertaining to their travel history and contact with individuals who were known or suspected of being positive for SARS-CoV-2. Community residents received similar surveys with questions relating to potential SARS-CoV-2 household exposures. All eligible participants were scheduled for a clinical visit at each time interval where blood samples were collected.
Outcomes. The primary outcome was the presence of S/RBD IgG antibodies, measured using an indirect isotype-specific (IgG) screening ELISA developed at PSU [20]. The threshold for a positive result was an optical density (absorbance at 450 nm) of 0.169, set at six standard deviations above the mean of 100 pre-SARS-CoV-2 samples collected in November 2019. Comparison against virus neutralization assays and RT-PCR returned sensitivities of 98% and 90%, and specificities of 96% and 100%, respectively [21]. Further details are provided in the Supplement. The presence of anti-SARS-CoV-2 antibodies has been used in prior seroprevalence studies as a method of quantifying cumulative exposure [22-24].
Statistical methods. Seroprevalence in the community resident and returning student cohorts is presented with binomial 95% confidence intervals. We estimated each subgroup's true prevalence, accounting for imperfect sensitivity and specificity of the IgG assay, using the prevalence package in R [25]. We calculated a 95% binomial confidence interval for the sensitivity of the IgG assay for detecting prior self-reported positive tests in the returning student cohort (students had high access to testing from a common University provider) and used a uniform prior distribution between these limits. Prevalence estimates were then calculated across all possible values of specificity between 0.85 and 0.99. Estimates were not corrected for demographics as participants were not enrolled using a probability-based sample. We assessed demographic characteristics of the tested participants relative to all study participants to illustrate potential selection biases (Table 1). Missing values were deemed "Missing At Random" and imputed, as described in the Supplement.
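The cutoff rule and the correction for imperfect assay performance can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the study used the Bayesian prevalence package in R, whereas the sketch below substitutes the simpler Rogan-Gladen point estimator. The control optical densities are hypothetical values chosen for illustration, the function names are our own, and the counts (208 of 684 students positive; 95 of 102 prior positives detected) are those reported in the results.

```python
from statistics import mean, stdev


def od_cutoff(negative_control_ods, n_sd=6.0):
    """Positivity threshold: mean of pre-pandemic ODs plus n_sd sample SDs."""
    return mean(negative_control_ods) + n_sd * stdev(negative_control_ods)


def rogan_gladen(apparent_prev, sensitivity, specificity):
    """Rogan-Gladen point estimate of true prevalence, clamped to [0, 1]."""
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    estimate = (apparent_prev + specificity - 1.0) / denom
    return min(1.0, max(0.0, estimate))


# Hypothetical pre-pandemic control ODs (NOT the study's 100 samples).
controls = [0.05, 0.06, 0.07, 0.06, 0.05, 0.08, 0.06, 0.07]
cutoff = od_cutoff(controls)

# Counts taken from the reported student results.
apparent = 208 / 684   # apparent seroprevalence among returning students
sens = 95 / 102        # assay sensitivity estimated from prior positives

# Sweep the assumed specificity from 0.85 to 0.99, as in the text.
spec_grid = [round(0.85 + 0.01 * i, 2) for i in range(15)]
corrected = {s: rogan_gladen(apparent, sens, s) for s in spec_grid}

for s in (0.85, 0.95, 0.99):
    print(f"specificity={s:.2f} -> corrected prevalence={corrected[s]:.3f}")
```

A lower assumed specificity attributes more of the observed positives to false positives, so the corrected prevalence falls as specificity decreases. The Bayesian approach used in the study additionally propagates uncertainty in sensitivity, which this point estimator does not.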
We estimated the adjusted odds ratios (aOR) of IgG positivity in the student subgroup using multivariable logistic regression implemented with the mice and finalfit packages [26,27], used two-sided Chi-squared tests for raw odds ratios (OR) and Welch two-sample t-tests for continuous distributions, and present 95% confidence intervals. We considered the following variables a priori to be potential risk factors as they increase contact with individuals outside of a participant's household [28-31]: close proximity (6 feet or less) to an individual who tested positive for SARS-CoV-2; close proximity to an individual showing key COVID-19 symptoms (fever, cough, shortness of breath); attendance at a small gathering (20-50 people) in the past 3 months; attendance at a medium gathering (51-1000 people) in the past 3 months; living in University housing; eating in a restaurant in the past 7 days; eating in a dining hall in the past 7 days; eating only in their room/apartment in the past 7 days; travel in the 3 months prior to returning to campus; and travel since returning to campus for the Fall term.
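For readers less familiar with crude odds ratios, the following Python sketch shows how an unadjusted OR and an approximate 95% interval can be computed from a 2x2 table. It assumes a Wald (log-scale) interval, which is a common choice but is not stated to be the method used in this study, and the counts are hypothetical rather than taken from Table 4.

```python
import math


def crude_odds_ratio(exposed_pos, exposed_neg, unexposed_pos, unexposed_neg, z=1.96):
    """Crude odds ratio with a Wald confidence interval on the log scale."""
    odds_ratio = (exposed_pos * unexposed_neg) / (exposed_neg * unexposed_pos)
    # Standard error of log(OR) is the root of the summed reciprocal cell counts.
    se_log_or = math.sqrt(
        1 / exposed_pos + 1 / exposed_neg + 1 / unexposed_pos + 1 / unexposed_neg
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: IgG-positive / IgG-negative among exposed and unexposed.
or_, lo, hi = crude_odds_ratio(60, 90, 100, 300)
print(f"OR = {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

An interval that excludes 1 corresponds to an association that is significant at the 5% level under this approximation; the adjusted estimates in the study come from a multivariable model and will generally differ from crude values.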
We estimated the aOR of IgG positivity at either time point in the community resident subgroup, with the following risk factors determined a priori: being a PSU employee; and the amount of contact with PSU students when "Stay at home" orders are not in place (self-reported on a scale of 1-10). BIC and AIC were used to evaluate the contribution of the variables to the model.
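As a reminder of how the two information criteria trade fit against complexity, the Python sketch below computes AIC and BIC from a model's log-likelihood. The log-likelihoods and parameter counts are hypothetical and are used only to show the comparison; they are not values from the fitted models in this study.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln(n) - 2 ln(L)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


n_obs = 1313  # size of the community cohort with paired samples

# Hypothetical fits: employment only vs employment plus student contact.
employment_only = {"loglik": -340.0, "k": 2}
with_contact = {"loglik": -339.6, "k": 3}

for name, fit in (("employment only", employment_only), ("plus contact", with_contact)):
    print(
        name,
        round(aic(fit["loglik"], fit["k"]), 1),
        round(bic(fit["loglik"], fit["k"], n_obs), 1),
    )
```

Lower values are preferred. In this illustrative case the extra variable improves the log-likelihood too little to offset its penalty, so both criteria favor the smaller model.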
All statistical analyses were conducted using R version 4.1.2 (2021-11-01) [32], with a pipeline created using the targets package [33].

Demographics.
A total of 9299 community residents were identified through an initial REDCap survey that collected eligibility, demographic, and contact information. Of these, 1531 were eligible, indicated willingness to participate, and were enrolled. Of those enrolled, 1462 completed a first clinic visit between 07 August and 02 October 2020, and 1313 of those also completed a second clinic visit between 30 November 2020 and 19 February 2021 and had both visit 1 and visit 2 samples analyzed. A total of 1410 returning students were recruited using volunteer sampling and 725 enrolled; of these, 684 completed clinic visits for serum collection between 26 October and 21 December 2020. Among participants with serum samples: the median age of community residents was 47 years (IQR 36-58), with 86.5% aged 18-65 years, and for the returning students the median age was 20 years (IQR 19-21), with 99.7% aged 18-65 years; 66.9% of the community residents identified as female and 32.3% as male; 64.5% of the returning students identified as female and 34.6% as male; 92.9% of the community residents identified as white, as did 81.9% of the students. Similar proportions were seen in those enrolled without samples, and among the initial REDCap survey respondents (Tables 1, 2). Although all county residents were eligible for participation, 74.9% of community resident participants were from the 5 townships (College, Ferguson, Harris, Half Moon, Patton) and 1 borough (State College) that form the "Centre Region" and account for ~ 59% of Centre County's population [17] (Fig. 1). The median household income group among the community residents providing samples was $100,000 to $149,999 USD (IQR: $50,000 to $74,999; $150,000 to $199,999). The median household income in the county is $60,403 [17]. Of the county population, 47.4% are female, 87.9% are white, and 70.3% are between 18 and 65 years old [17]. The study cohort is moderately older and more affluent (in part because of the exclusion of returning students), and disproportionately female compared to the general Centre County population.
Table 1. Demographic characteristics of study participants. Non-D4A participants are all participants in the initial anonymous survey from which Data4Action participants were drawn. D4A participants are divided into subsets for which antibody assays were conducted (N = 1313) and those for which assays were not conducted (N = 218). a: Asian; Hispanic, Latino/a, or Spanish; Black or African American; Middle Eastern or North African; Native American or Alaska Native; other race or ethnicity. This category is aggregated to protect participant identities because no single group comprised > 4% of participants.
Prior positive results and seroprevalence. Of the returning student participants, 673 (92.8%) had at least one test prior to enrollment in the study; of these, 107 (15.9%) self-reported a positive result (Table 3). Of these, 100 (93.5%) indicated that this test result occurred after their return to campus (median: 25 September; IQR: 10 September, 07 October). Of the 684 returning students with an ELISA result, 95 of the 102 (93.1%) with a self-reported prior positive test result were positive for SARS-CoV-2 IgG antibodies. Of the 582 returning students with ELISA results who did not report a positive SARS-CoV-2 test, 113 (16.5%) were positive for SARS-CoV-2 IgG antibodies. Of the total 684 returning students with ELISA results, 208 (30.41%) were positive for SARS-CoV-2 IgG antibodies (Fig. 2). Among the community resident participants, 42 of 1313 (3.2%) were positive for SARS-CoV-2 antibodies at their first visit (Fig. 2). Between their first and second visit, 54 participants converted from negative to positive and 19 converted from positive to negative; 96 (7.3%) were positive for SARS-CoV-2 IgG antibodies at either visit. There were no differences by age or the number of days separating visit samples, between those that seroconverted and seroreverted (p = 0.91; p = 0.91, respectively). The Wave 1 quantitative OD values of those who seroreverted (n = 19) were significantly lower than individuals who remained positive from waves 1 to 2 (n = 23) (Welch's t-test, p = 0.001; mean of 0.32 vs 0.63). Community residents who were of similar age and household income as the returning students (age ≤ 30 years and income ≤ 50 k USD) did not have significantly different seroprevalence than community residents age > 30 years or with income > 50 k USD (Supplemental Tables 3, 4, 5).
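The counts in this paragraph fit together as simple arithmetic, which the following Python sketch makes explicit. All inputs are the figures reported above; the visit 2 point prevalence is derived here from those figures and is not a number stated in the text, and the variable names are our own.

```python
def pct(count, total):
    """Percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


# Community residents with paired samples.
n_community = 1313
positive_v1 = 42     # IgG positive at visit 1
seroconverted = 54   # negative at visit 1, positive at visit 2
seroreverted = 19    # positive at visit 1, negative at visit 2

stayed_positive = positive_v1 - seroreverted    # positive at both visits
ever_positive = positive_v1 + seroconverted     # positive at either visit
positive_v2 = stayed_positive + seroconverted   # positive at visit 2 (derived)

# Returning students with an ELISA result.
n_students = 684
students_positive = 95 + 113  # with and without a self-reported prior positive

print("visit 1:", pct(positive_v1, n_community), "%")
print("either visit:", pct(ever_positive, n_community), "%")
print("visit 2 (derived):", pct(positive_v2, n_community), "%")
print("students:", pct(students_positive, n_students), "%")
```

The 7.3% figure is therefore a cumulative measure (positive at either visit), while the share positive at visit 2 alone is lower because of the 19 seroreversions.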
Of returning students with a self-reported prior positive SARS-CoV-2 test, 93.1% (95% CI 86.4-97.2%) had positive IgG antibodies; this was used as an estimate of the sensitivity of the IgG assay for detecting previously detectable infection (see Supplement for an alternative calculation of sensitivity that includes community resident responses). For all values of specificity below 0.95, the 95% credible intervals for the prevalence in the community residents overlapped for the pre- and post-term time points, and neither overlapped with the returning student subgroup (Fig. 3).
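The binomial interval for sensitivity can be reproduced approximately with an exact (Clopper-Pearson) calculation, sketched below in dependency-free Python. Which binomial interval the authors used is not stated, so Clopper-Pearson is an assumption here; the helper names are our own, and the result is meant to be compared against the reported interval rather than taken as its definition.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_increasing(g, target, iters=60):
    """Bisection on [0, 1] for g(p) = target, where g is increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided binomial confidence interval for k successes in n trials."""
    # Lower bound solves P(X >= k | p) = alpha / 2.
    lower = 0.0 if k == 0 else _solve_increasing(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2
    )
    # Upper bound solves P(X <= k | p) = alpha / 2.
    upper = 1.0 if k == n else _solve_increasing(
        lambda p: 1 - binom_cdf(k, n, p), 1 - alpha / 2
    )
    return lower, upper


# 95 of 102 students with a self-reported prior positive test were IgG positive.
sens_lo, sens_hi = clopper_pearson(95, 102)
print(f"sensitivity = {95 / 102:.3f}, 95% CI {sens_lo:.3f}-{sens_hi:.3f}")
```

In the study, the limits of this interval define the uniform prior on sensitivity used when estimating true prevalence.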
Variables associated with IgG positivity. Among the returning students, only close proximity to a known SARS-CoV-2 positive individual (aOR 3.1, 2.07-4.64) and attending small gatherings in the past 3 months (aOR 1.52, 1.03-2.24) were significantly associated with a positive ELISA classification in the multivariable model (Table 4). Attending medium gatherings (51-1000 people) (OR 1.78, 1.17-2.69) and close proximity to an individual showing key COVID-19 symptoms (OR 1.67, 1.18-2.36) were also associated with IgG positivity in crude calculations of association. Among the community cohort, the amount of student contact was not associated with cumulative IgG positivity. However, PSU employees experienced reduced odds of positivity (OR 0.56, 0.35-0.9). Neither AIC nor BIC was improved by adding student contact as a variable to a model with employment status only, or by using student contact as the only variable. Both the returning students and community residents self-reported high masking compliance; 86.7% and 75.9%, respectively, reported always wearing a mask or cloth face covering when in public (Tables S1, S2). Less than one third of both groups (28.9% and 29.8%, respectively) self-reported always maintaining 6 feet of distance from others in public. Less than half (42.8%) of returning students indicated that they always avoided groups of 25 or greater, in contrast with 65.7% of community residents.

Discussion
The return of students to in-person instruction on the PSU UP campus was associated with a large increase in COVID-19 incidence in the county, evidenced by over 4500 student cases at PSU [18]. In a sample of 684 returning students, 30.4% were positive for SARS-CoV-2 antibodies. Applied to the approximately 35,000 students who returned to campus, this seroprevalence implies roughly 10,600 infections, so the detected cases may account for ~ 40% of all infections among PSU UP students. Despite this high overall incidence of SARS-CoV-2 infection in the county during the Fall 2020 term, the studied cohort of community residents (who disproportionately identified as female and lived in close proximity to campus) saw only a modest increase in the prevalence of SARS-CoV-2 IgG antibodies (3.2 to 7.3%) between September and December 2020, consistent with a nation-wide estimate of seroprevalence for the summer of 2020 [24]. The true prevalence of prior SARS-CoV-2 infection in the cohorts depends on the assumed sensitivity and specificity. However, for most realistic values of sensitivity and specificity, there was little evidence of a significant increase among the community resident sample. Within the community cohort, 19 individuals seroreverted. Given the high specificity of the ELISA, the probability of observing 19 or more false positives is < 0.0001, so it is possible that this reflects waning immunity. We note that these 19 individuals had lower OD values in wave 1 than those who remained positive from wave 1 to wave 2, which is consistent with waning from an initially low antibody titer. While in-person student instruction has been associated with an increase in per-capita COVID-19 incidence [12], these results suggest that outbreaks in the returning student and the community resident cohorts we studied were asynchronous, implying limited between-cohort transmission.
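The ~ 40% case-detection figure follows from a short calculation, sketched here in Python. It assumes that the seroprevalence in the non-random student sample applies to all returning students and that IgG positivity is equivalent to prior infection; "over 4500" reported cases is treated as exactly 4500, so the result is a rough lower bound. The variable names are our own.

```python
returning_students = 35_000    # approximate number of students who returned
seroprevalence = 208 / 684     # IgG-positive fraction in the student sample
reported_cases = 4_500         # "over 4500" student cases, taken as 4500

# Infections implied if the sample seroprevalence held campus-wide.
estimated_infections = seroprevalence * returning_students

# Share of those implied infections that appeared as reported cases.
detected_fraction = reported_cases / estimated_infections

print(f"estimated infections: {estimated_infections:,.0f}")
print(f"detected fraction: {detected_fraction:.0%}")
```

Correcting the seroprevalence for assay sensitivity and specificity, as done elsewhere in the analysis, would shift this fraction somewhat.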
A recent analysis of age-specific movement and transmission patterns in the US suggested that individuals between the ages of 20 and 34 disproportionately contributed to the spread of SARS-CoV-2 [34]. Despite close geographic proximity to a college-aged population, transmission in our community resident sample appears distinctly lagged, which suggests that health behaviors have the potential to prevent infection.
Within the student group, presence of SARS-CoV-2 antibodies was significantly associated with close proximity to known SARS-CoV-2-positive individuals and attendance at small events. No other risk factors were correlated with an increase in IgG test positivity, aligning with other research [24]. It is not possible to discern how much the likelihood of contact with a SARS-CoV-2 positive individual is due to the high campus prevalence versus individual behaviors. Considered independently, eating in dining halls within the past 7 days was weakly associated with testing positive for SARS-CoV-2 antibodies, and participation in medium-sized events (51-1000 individuals) and close proximity to a symptomatic individual were significantly associated with testing positive for SARS-CoV-2 antibodies, which is consistent with patterns observed elsewhere [30,31]. Within the community group, being a PSU employee was significantly associated with lower odds of IgG test positivity. There were no significant differences in the age distributions by employment status. Bharti et al. [35] identified lower per-capita incidence in Centre County residents relative to the 5 surrounding counties, as well as greater movement restriction and less time spent outside the home. Whilst this paper only examined Centre County residents, it is plausible that PSU employees were more able to work remotely and similarly reduced their movement and non-household contacts, relative to non-PSU employees. The low number of positive community cases meant that it was not possible to identify other associations with IgG positivity. Though the participants reflect a convenience sample, the large differences in SARS-CoV-2 seroprevalence suggest that the cohorts did not experience a synchronous, well-mixed epidemic despite their close geographic proximity.
College campuses have been observed to have high COVID-19 attack rates, and counties containing colleges and universities have been observed to have significantly higher COVID-19 incidence than demographically matched counties without such institutions [12]. While college and university operations may present a significant exposure risk, this analysis suggests the possibility that local-scale heterogeneity in mixing may allow for asynchronous transmission dynamics despite close geographic proximity. Thus, the disproportionately high incidence in the student population, which comprises less than one quarter of the county population, may bias assessment of risk in the non-student population. Risk assessment in spatial units (e.g., counties) that have strong population sub-structuring should consider these heterogeneities and their consequences to inform policy.
While SARS-CoV-2 transmission between the student and community resident populations is likely to have occurred (perhaps multiple times), the large difference in seroprevalence between the student and resident participants after the Fall term is consistent with either rare or non-persistent transmission events between the students and residents, or both. This suggests that it is possible to minimize risks brought about by sub-populations with high SARS-CoV-2 incidence using behavioral interventions. This observation may have implications for outbreak management in other high risk, highly mobile populations (e.g., displaced populations, seasonal workers, military deployment). However, we note that this was achieved in the context of disproportionate investment in prevention education, testing, contact tracing, and infrastructure for isolation and quarantine by PSU in the high-prevalence sub-population (students).
With respect to the health behaviors measured, both students and community residents reported high masking rates (> 75%) and low distancing rates in public (< 30%). However, students had significantly higher masking and gathering rates than community residents; thus, a next step is to identify factors that may explain these differences. Minimizing risk, however, may come at significant social, psychological, educational, economic, and societal costs [36]. Thus, operational planning for both institutions of higher education and their resident communities should consider both the risk of SARS-CoV-2 transmission and the costs of mitigation efforts.
Figure 3. Estimated true prevalence (circles, with 95% confidence intervals) among participants at each sampling interval corrected for estimated assay sensitivity as a function of the assumed assay specificity (x-axis). Light blue indicates community residents at the first visit at the start of the Fall 2020 term, red indicates returning students at the end of the Fall 2020 term, and dark blue indicates community residents at the second visit after student departure.

Limitations and strengths.
Neither the resident nor the student participants were selected using a probability-based sample. Thus, these participants may not be representative of their respective populations. Those who chose to participate in this study may have been more cognizant of and compliant with public health mitigation measures. Specifically, the resident participants disproportionately lived in the townships immediately surrounding the UP campus, where extensive health messaging [37] and preventative campaigns were enacted, and they have a higher median income than the residents of Centre County overall.
Serotype analysis was not performed, so it may be possible that each sampling time-point reflects the dynamics of different (previous) Variants of Concern (VOCs). However, most samples were provided before VOCs were identified within the United States; Alpha (B.1.1.7) was first identified in Colorado on December 29, 2020, halfway through community wave 2, and Beta (B.1.351) was first identified in South Carolina on January 28, 2021, a few days before the completion of community wave 2 sampling [38].
To our knowledge, this is one of the first studies to explicitly examine the effects of a large and transient student population on the SARS-CoV-2 prevalence of a geographically proximate community population using a longitudinal cohort design. Other studies have observed this influence using a cross-sectional or matched case-control design, but here we present the results of a time-ordered study with large cohort sizes.

Data availability
Callum Arnold and Dr. Matthew J. Ferrari had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis. Callum Arnold, Dr. Matthew J. Ferrari (Department of Biology, Pennsylvania State University), and Dr. Catherine M. Herzog (Huck Institutes of the Life Sciences,
Table 4. Crude and adjusted odds ratios (OR; aOR) of risk factors among returning PSU UP student cohort.