Insights into household transmission of SARS-CoV-2 from a population-based serological survey

Understanding the risk of infection from household and community exposures and the transmissibility of asymptomatic infections is critical to SARS-CoV-2 control. Limited previous evidence is based primarily on virologic testing, which disproportionately misses mild and asymptomatic infections. Serologic measures are more likely to capture all previously infected individuals. We apply household transmission models to data from a cross-sectional, household-based population serosurvey of 4,534 people ≥5 years from 2,267 households enrolled April-June 2020 in Geneva, Switzerland. We found that the risk of infection from exposure to a single infected household member aged ≥5 years (17.3%, 95% CrI 13.7-21.7%) was more than three times that of extra-household exposures over the first pandemic wave (5.1%, 95% CrI 4.5-5.8%). Young children had a lower risk of infection from household members. Working-age adults had the highest extra-household infection risk. Seropositive asymptomatic household members had 69.4% lower odds (95% CrI 31.8-88.8%) of infecting another household member compared to those reporting symptoms, accounting for 14.5% (95% CrI 7.2-22.7%) of all household infections.

Household-centered studies provide an enumerable set of individuals known to be exposed to an infectious person; hence, they have played an important role in estimating key transmission properties of SARS-CoV-2. However, most published studies of SARS-CoV-2 household transmission rely on clinical disease (COVID-19) and/or PCR-based viral detection to identify infected individuals 1,2. Due to the narrow time window after exposure in which RT-PCR can be highly sensitive 3, case ascertainment based on virologic testing may miss infections, especially those that are mild or asymptomatic 4. This can lead to important biases and limit what can be studied, including underestimates of the importance of sub-clinical infections and household secondary attack rates 4.
Serologic studies provide an alternative tool for understanding SARS-CoV-2 transmission. Serological tests remain sensitive to detecting past infections well beyond the period when the virus is detectable [5][6][7], thereby providing a measure of whether individuals have ever been infected.
Virologic and serologic studies have each provided important insights into SARS-CoV-2 transmission. These include estimates of the household secondary attack rate (e.g., 17% in a meta-analysis 2) and evidence of reduced infection rates among young children 2,8,9. However, in general, these estimates do not distinguish between intra- and extra-household transmission, nor do they provide an estimate of transmission risk from a single infected individual. A notable exception is a household study from Guangzhou, China 10, but this PCR-based study suffered from the limitations of virologic testing noted above. Hence, a number of critical gaps in the evidence remain, including the relative role of transmission between household members, the frequency of viral introductions into households from the community, the infectiousness of asymptomatic individuals, and the effect of age on transmission.
To help fill these gaps, we apply household transmission models to data from a cross-sectional, household-based population serosurvey of 4534 people from 2267 households in Geneva, Switzerland (SEROCoV-POP). We provide a serology-based assessment of transmission between intra- and extra-household contacts, identify risk factors for infection and transmission, and estimate the relative risk of asymptomatic transmission. By doing so, we provide important evidence for guiding the COVID-19 pandemic response.

Results
Between April 3rd and June 30th, during the first wave of the SARS-CoV-2 pandemic in Geneva, 8344 individuals coming from 4393 households were successfully enrolled in the SEROCoV-POP study (Figs. 1 and S1) 11. The median enrollment date was May 22nd, 86 days after the first case was detected in Geneva (February 26th, 2020). In 2267 of these households, all members of the household were eligible, available, and provided a blood sample for detection of anti-SARS-CoV-2 IgG antibodies by ELISA (4534 individuals). The majority of these households were either one-person (37.9%, n=860) or two-person (39.2%, n=889) households (Fig. S2, Table S1). The median household size in our study (2.0, interquartile range [IQR]=1,2) was similar to that of the general population in Geneva canton (median=2.0, IQR=1,3) 12.
The median age of participants was 53 years (IQR=34,65), and 53.6% were female. Compared with the general canton population, our study sample included more individuals 50 years and older and fewer 20-49 year olds. Individuals in older age groups were more likely to live in smaller households: 94.6% (1100/1163) of people who were 65 years and older lived alone or in two-person households versus 44.5% (588/1302) of those 20-49 years old (Table 1). Our study sample, like that of the original SEROCoV-POP study, had a higher level of formal education than the general canton population, with only 8.5% not having a high school degree or equivalent, compared with 23.5% in the general canton population (Table S5) 13.
Overall, 6.6% (298/4534) of individuals tested positive for SARS-CoV-2 anti-S1 IgG antibodies by ELISA. Of the 2267 households included in the analyses, 222 (9.8%) had at least one seropositive household member. The proportion of households with seropositive members increased from 4.8% (41/860) in households of size one, to 17.0% (39/229) in households of size three, and was relatively constant in larger households (Fig. S2, Table 1, Fig. S3). Symptoms consistent with COVID-19 were reported by 69.5% (207/298) of seropositive individuals, although this was significantly lower in young children (37.5%, 3/8), similar to the results of an early modeling study 14.
We fit household transmission models and estimated that from the start of the epidemic in Geneva through the time of the serosurvey, the cumulative risk of infection from extra-household exposures was 5.1% (95% Credible Interval [CrI] 4.5-5.8%). The probability of being infected from a single infected household member was 17.3% (95% CrI 13.7-21.7%, Fig. 2).
The risk of being infected by a household member was lowest among 5-9 year olds and highest among those 65 years and older, with teenagers and working-age adults sharing similar risks (Figs. 2, 3). Compared to 20-49 year olds, 5-9 year olds had less than half the odds of being infected by an infected household member (OR=0.4, 95% CrI 0.1-1.6), while those 65 years and older had nearly three times the odds (OR=2.7, 95% CrI 0.9-7.9). Though credible intervals on these estimates are wide, and both include the null value of 1, inclusion of age substantially improved model fit (ΔWAIC −14.8, Table S2). In contrast, the extra-household infection risk was highest among working-age adults (20-49 year olds). Compared to this group, 5-9 year olds (OR=0.5, 95% CrI 0.2-0.9) and those 65 years and older (OR=0.4, 95% CrI 0.3-0.6) had the lowest risk (Fig. 3, Tables S2 and S4). Models allowing for differential risk of transmission by the age of the infector were not well supported by the data (ΔWAIC −15.5 to −24.7) and included no significant differences between ages (Table S2).
Seropositive household members not reporting symptoms had 0.31 times the odds (95%CrI: 0.11-0.68) of infecting another household member compared to those reporting symptoms consistent with COVID-19 (Fig. 3). This difference was larger (OR=0.24, 95%CrI 0.09-0.54) when only considering those who reported symptoms more than two weeks before blood draw as symptomatic infections (Table S6, Fig. S6). Here we focus on the results of the best fitting models, but across the ten models considered (Table S2), estimates were qualitatively and quantitatively consistent with the primary findings. Similarly, we explored the sensitivity of our results to the ELISA seropositivity cutoff and found no qualitative differences in results (Fig. S4).
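The odds ratios above can be read in two ways, and the following minimal Python sketch illustrates both. It is an illustration only, not the authors' code; the function names are hypothetical. The first function converts an odds ratio into a percent reduction in odds (the rounded value 0.31 gives roughly 69%, in line with the 69.4% quoted in the abstract). The second applies an odds ratio to a baseline probability; using the overall 17.3% household risk as the baseline for symptomatic infectors is an assumption made purely for illustration.

```python
def percent_lower_odds(odds_ratio):
    """Percent reduction in odds implied by an odds ratio below 1."""
    return (1.0 - odds_ratio) * 100.0


def apply_odds_ratio(baseline_probability, odds_ratio):
    """Convert a probability to odds, scale by the odds ratio, convert back."""
    odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = odds * odds_ratio
    return new_odds / (1.0 + new_odds)


# Rounded odds ratio reported in the text for infectors without symptoms.
reduction = percent_lower_odds(0.31)

# Illustrative only: treats the overall 17.3% household risk as the
# baseline for a symptomatic infector, which the paper does not state.
illustrative_risk = apply_odds_ratio(0.173, 0.31)

print(f"Percent lower odds: {reduction:.1f}%")
print(f"Illustrative per-infector risk: {illustrative_risk:.3f}")
```

Because the baseline risk is well below 50%, the odds ratio and the ratio of probabilities are similar here but not identical, which is why the two quantities are kept separate.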

Discussion
The results presented here place symptomatic household transmission of SARS-CoV-2 in the context of community risk and asymptomatic spread. We show an approximate 1 in 6 risk (17.3%) of being infected by a single SARS-CoV-2 infected household member (Table S3). This contrasts with a 1 in 20 chance (5.1%) of being infected in the community over most of the first epidemic wave in Geneva, a period of roughly 2 months. Despite the high risk of transmission from an infected household member, households in Geneva, as in many cities in high-income nations, are mostly small, limiting opportunities for onward transmission. Thus, less than a quarter of cases could be attributed to transmission between household members. While asymptomatic individuals appear to be less than a third as likely to transmit, they cannot be dismissed as inconsequential to disease spread, and are responsible for one in six within-household transmissions in this study. Our results are suggestive of the dual roles of biology and social behavior in shaping age-specific infection patterns, with the age signature of risk within households indicative of lower biological susceptibility in the very young and elevated susceptibility in the old, while extra-household risk seems more driven by behavior, with working-age adults being at the highest risk.
It has long been thought that asymptomatic individuals are less likely to transmit than symptomatic ones, though studies have recovered similar concentrations of viral RNA from nasopharyngeal samples from these two groups 15. By using serological data, we were able to show that those not reporting symptoms have one-third the odds of transmitting within households as symptomatic ones, similar to a study from Wuhan, China 16, and ultimately caused about 15% of household infections. This reduced transmissibility may be due to reduced duration of viral shedding and reduced ability to mechanically spread virions (e.g., through coughs). We did not assess the role of asymptomatics in community spread, but it is plausible that they may play an even larger role there, as symptomatic individuals are more likely to stay home or take extra precautions to reduce exposures when sick.
Fig. 2 Risk of extra-household transmission and within-household transmission from a single infected household member. a Estimated median probability of extra-household infection from the start of the epidemic in Geneva until the time of the serosurvey by age group and sex. b Estimated median probability of infection from a single infected household member by age group and sex. Dots and bars represent median and 95% credible intervals of the posterior distribution. Probabilities of being infected by sex and age group of the exposed individuals are estimated by a model only including age and sex of the exposed individuals (model 2, orange/green bars; see Table S2). Probabilities of being infected by the age group of the exposed individuals combining males and females (left four gray bars on both panels) are estimated with an age-only model (model 1). The overall probabilities of being infected (rightmost gray bar on both panels) are estimated with the null model (model 0).
As with previous studies of SARS-CoV-2 transmission among household members and other close contacts 2,17,18 , we find evidence supporting a reduced risk of infection from household exposures among young children, and elevated risk of infection among those 65 or older. However, it is important to note that we only find this reduced risk among the youngest children in our study (5-9-year-olds), while 10-19 year olds have a similar risk profile to working age adults. The other PCR-based household study that reported per-exposure transmission did not report susceptibility results from this age group 10 . This is consistent with the hypothesis that young children may be biologically less susceptible to SARS-CoV-2 infection, though heterogeneity in social contact and other behaviors within households cannot be ruled out.
Patterns of extra-household infection suggest social factors dominate this risk, as both young children and older adults are at reduced risk of infection compared to working age adults. As children have returned to schools in Geneva (mid-May 2020), the social factors driving this pattern have likely changed significantly and we may see children become a more significant source of extra-household infections 19 , despite their apparently lower susceptibility. The risk that infected young children pose to their household members is unclear; the sample size was likely too low to detect small to moderate differences in risk. While there are mixed results in the literature on age-specific differences in infectiousness 20 , a large study from Wuhan, China suggested that those less than 20 years old are more likely to infect others than adults 60 years and older, given the same amount of exposure 16 .
We did not find any significant relationship between the age of an infector and probability of transmission (nor did including these terms improve model fit), but children are less often symptomatic 21 and we did find a strong relationship between symptoms and transmission.
Our study has a number of important limitations. Symptoms were self-reported and, given that the times of infection are unknown, they may not necessarily have been a result of the SARS-CoV-2 infection. We cannot exclude recall bias in symptom reports and other self-reported exposures. Further, we looked at only a narrow range of symptoms to increase specificity, which left out more general potentially SARS-CoV-2-related symptoms (e.g., nausea, diarrhea). We detected only eight seropositive children under the age of 10, leading to large uncertainty in age-specific risk estimates for this group. Although extra-household estimates are informed by data from all households, within-household estimates are only informed by data from households with at least one seropositive member (222/2267 households), thus limiting our statistical power. While validation data of the Euroimmun ELISA from across the world have confirmed its high specificity and sensitivity for detecting recent infections [22][23][24], most data are from adults, and it is possible that performance in young children may be different. Most of the participants in the study were recruited after the epidemic peak and it is possible that we did not fully capture all infections in each household due to insufficient time to mount a detectable response. Conversely, we may have also missed infections due to waning of responses. However, antibody responses appear to generally sustain over the first 4 months from infection, the plausible infection time window of participants in this study 25. When conducting stratified analyses including households recruited early and late, we found few qualitative differences in the primary results (Figs. S5 and S7). We included only households where all household members provided blood samples in the main analysis, but sensitivity analyses of all enrolled individuals led to similar primary results (Table S6). Given the cross-sectional nature of our data, all transmission chains within households were equally likely within our modeling framework, which led to larger uncertainty than having prospectively collected data. However, collection of these data over thousands of households can be challenging, and we show that more commonly collected data from serosurveys can be leveraged to refine our understanding of transmission.
Fig. 3 Relative odds of being infected outside the household and from a single infected household member by individual characteristics of the exposed individuals, a age group, b sex, and c potential infectors' symptom status. Odds ratios and credible intervals, shown on the log-scale, are estimates from model 4 (see Table S2).
This study captures infections that occurred during the first wave of the pandemic in Geneva, a period of time when workplaces and schools were largely closed and people's social contacts were greatly reduced. In future phases of this pandemic, when social contact patterns change, the proportion of transmission that occurs between household members, and potentially age- and sex-specific risks, could differ. While we found no evidence in previous analyses of these data for differences in seropositivity by neighborhood wealth or education 26, these and other indicators of wealth might be associated with transmission risk within Geneva or in other populations. Likewise, the general nature of the Geneva population and the control measures in place may limit the generalizability of our estimates of absolute risk of infection, attributable fraction, and extra-household risks. For example, the increasing importance of household transmission with increasing household size (Fig. S8) suggests household transmission would be far more important in settings with larger households. However, we believe our estimates of relative risks by age and symptom status within households, which are likely more biologically driven, should be generalizable to most settings, as should our general observations about how social and biological factors influence different types of transmission.
Our study highlights how biological and social factors might combine to shape the risk of SARS-CoV-2 infection. While we expect some differences across settings, we believe that the general trend in per-exposure infection risk by age and sex and the increased infectiousness of symptomatic individuals are fundamental attributes of this pandemic. These differences have important implications for guiding patient care and public health policy. For example, increased susceptibility of the oldest individuals suggests that rapid and aggressive measures are needed to protect them as soon as there is any possibility that SARS-CoV-2 was introduced into their living environment. At the population level, quantifying the infectiousness of asymptomatics can help us understand the extent to which the pandemic is driven by asymptomatic infections. Our study provides a model for using cross-sectional serologic surveys to assess the relative contribution of household and community transmission. As countries continue to alter quarantine and self-isolation policies, disentangling the contribution of household and community transmission can help evaluate the success of these intervention strategies. Continued serological and virologic monitoring of diverse populations with detailed analyses like those presented here is critical to the continued evidence-based response to this pandemic.

Methods
Study design, participants, and procedures. The SEROCoV-POP study is a cross-sectional population-based survey of former participants of an annual survey of individuals 20-74 years old representative of the population of Geneva (Canton), Switzerland. The enrollment into the study occurred from April through June 2020 during the first wave of the SARS-CoV-2 pandemic in Geneva. First wave lockdown measures (including school closures) started in mid-March and largely ended by the end of May. The full survey protocol is available online and a detailed description of the design and seroprevalence results were previously published 11,26 .
The SEROCoV-POP study invited all 10,587 participants of the previous annual surveys to participate in the study through email or post. Participants were invited to bring all members of their household aged 5 years and older to join the study. After providing informed written consent, participants either filled out a questionnaire online, in the days before their visit, or at the time of their visit at one of two enrollment locations (the main canton hospital and one satellite location) within Geneva. The questionnaire included questions about participants' demographics, household composition, symptoms since January 2020, details on the frequency of extra-household contacts and reduction in social interaction since the start of the pandemic. Only participants 14 years and older were asked about their frequency of extra-household contacts and changes in behavior. Despite this age cutoff, we use more standard age cutoffs (10-19 years) in our analysis for comparability with other studies 11. We defined symptom presentation a priori as having reported any of: cough, fever, shortness of breath, or loss of smell or taste since January 2020 (symptoms reported in the 2 weeks prior to testing were excluded in a sensitivity analysis). We collected peripheral venous blood from each consenting participant. Households where all members provided blood samples were included in the present analysis (there was a 100% questionnaire response rate in this group). As blood was not collected from children under 5, all households with children in this age group were excluded. We conducted a sensitivity analysis with all households, regardless of whether all members provided blood samples, effectively treating household members outside the study as a community source of infection. All participants gave written informed consent before participation in the SEROCoV-POP study. For individuals younger than 18 years, parents or a legal representative provided consent.
The study was approved by the Cantonal Research Ethics Commission of Geneva, Switzerland (CER16-363).
Laboratory analysis. We assessed anti-SARS-CoV-2 IgG antibodies in each participant using an ELISA (Euroimmun; Lübeck, Germany #EI 2606-9601 G) targeting the S1 domain of the spike protein of SARS-CoV-2; sera diluted 1:101 were processed on a EuroLabWorkstation ELISA (Euroimmun). An in-house validation study found that the manufacturer's recommended cutoff for positivity (≥1.1) had a specificity of 99% and sensitivity of 93%, based on positive controls tested between 0 and 39 days after symptom onset 24. In our primary analyses we defined seropositivity based on the cutoff recommended by the manufacturer and explored a higher cutoff of 1.5 (>1.5) in sensitivity analyses 24. As the presence of antibodies has been shown to be a reliable marker of past infection, we use the term "infected" to refer to a seropositive individual.
Statistical analyses. We fit chain binomial transmission models to estimate two primary quantities: the average probability of extra-household infection from the start of the epidemic through the time of blood draw across Geneva (referred to also as "community infections" over the first epidemic wave) and the probability of being infected from a single infected household member over the course of their infectious period (referred to as "household exposures"; see supplemental text for model assumptions) 27,28. We assume that serologic status is a perfect marker of having been infected, that individuals cannot get reinfected, and that all individuals were susceptible at the start of the pandemic. When fitting these models we explicitly consider all possible sequences of viral introductions to each household and subsequent transmission events within the household. For example, in a household with 2 seropositive individuals, both could have been infected outside of the household, or one could have been infected outside and then infected one other person within the household. We adapted models to estimate the within-household and extra-household transmission risk according to the characteristics of potential infectees (age, sex, self-reported extra-household contact behavior) and, for within-household risk, those of the potential infectors (symptoms, age). As extra-household contact questions were only asked of those 14 years and older, we compared extra-household transmission by self-reported reduction or frequency in social contacts only for those 20 years and older. We imputed a small number of missing data (1%, 36/3908) related to extra-household contacts among those who were 20 years and older based on household averages (see supplement). We simulated the proportion of infections attributable to extra-household and within-household exposures.
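To make the "all possible sequences" idea concrete, the following Python sketch computes the distribution of the final number of infected members in a household under a simple Reed-Frost style chain binomial. It is a sketch under stated assumptions, not the authors' Stan model: community introductions are assumed to happen once with probability `q` per person, followed by generations of within-household spread with per-infector probability `p`, with no covariates. The function names and this ordering of events are hypothetical simplifications.

```python
from functools import lru_cache
from math import comb


def binom_pmf(k, n, prob):
    """Binomial probability of k successes in n trials."""
    return comb(n, k) * prob**k * (1 - prob) ** (n - k)


def final_size_distribution(n, q, p):
    """Return a list d where d[k] is P(k of n household members infected).

    Assumptions (not taken from the paper): each member is infected in the
    community with probability q, then infections spread in generations,
    each infector infecting each susceptible with probability p.
    """

    @lru_cache(maxsize=None)
    def remaining(s, i):
        # Distribution of susceptibles left at the end, starting from
        # s susceptibles and i newly infected household members.
        if i == 0 or s == 0:
            return {s: 1.0}
        hit = 1 - (1 - p) ** i  # chance a susceptible is infected this generation
        out = {}
        for new in range(s + 1):
            weight = binom_pmf(new, s, hit)
            for left, prob in remaining(s - new, new).items():
                out[left] = out.get(left, 0.0) + weight * prob
        return out

    dist = [0.0] * (n + 1)
    for introductions in range(n + 1):
        weight = binom_pmf(introductions, n, q)
        for left, prob in remaining(n - introductions, introductions).items():
            dist[n - left] += weight * prob
    return dist


# Two-person household using the point estimates reported in the text.
q, p = 0.051, 0.173
dist = final_size_distribution(2, q, p)
print([round(x, 4) for x in dist])
```

For a two-person household the probability that both members are seropositive reduces to q² + 2q(1 − q)p, which matches the two routes described above: both infected outside the household, or one infected outside who then infects the other.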
We built a series of ten models including different combinations of individual-level characteristics (e.g., age, sex, self-reported contacts, symptoms) and compared their fit using the widely applicable information criterion (WAIC) 29. We implemented the models in the Stan probabilistic programming language and used the rstan package (version 2.21.0) to sample from the posterior distribution and analyse outputs 30. We placed weakly informative priors on all parameters, normally distributed on the logit scale with a mean of 0 and a standard deviation of 1.5. We ran four chains of 1,000 iterations each with 250 warm-up iterations and assessed convergence visually and using the Gelman-Rubin convergence statistic (R-hat) 31. All estimates are means of the posterior samples with the 2.5th and 97.5th percentiles of this distribution reported as the 95% credible interval. Full model and inference details are provided in the supplement and the code needed to reproduce the analyses is available at https://github.com/HopkinsIDD/serocovpop-households (https://doi.org/10.5281/zenodo.4740044).
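As a rough illustration of the prior and the summary statistics described above, the Python sketch below maps a normal prior with mean 0 and scale 1.5 on the logit scale back to the probability scale, and computes a mean and percentile-based 95% interval from a set of samples. It is not the authors' rstan code: the helper names are hypothetical, and the random draws are taken from the prior only as a stand-in for posterior samples.

```python
import math
import random


def inv_logit(x):
    """Map a value on the logit scale to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def prior_interval(scale=1.5, z=1.96):
    """Central 95% range of a Normal(0, scale) logit-scale prior, as probabilities."""
    return inv_logit(-z * scale), inv_logit(z * scale)


def percentile(sorted_values, pct):
    """Linear-interpolation percentile of an already sorted list."""
    pos = (len(sorted_values) - 1) * pct / 100.0
    lower = math.floor(pos)
    upper = math.ceil(pos)
    frac = pos - lower
    return sorted_values[lower] * (1 - frac) + sorted_values[upper] * frac


def summarize(samples):
    """Mean with 2.5th and 97.5th percentiles, as used for the reported estimates."""
    ordered = sorted(samples)
    mean = sum(ordered) / len(ordered)
    return mean, percentile(ordered, 2.5), percentile(ordered, 97.5)


prior_lo, prior_hi = prior_interval()

# Stand-in samples: draws from the prior, NOT real posterior output.
rng = random.Random(1)
draws = [inv_logit(rng.gauss(0.0, 1.5)) for _ in range(3000)]
mean, lo, hi = summarize(draws)

print(f"Prior 95% range on the probability scale: {prior_lo:.3f} to {prior_hi:.3f}")
print(f"Summary of stand-in samples: mean={mean:.3f}, 95% interval {lo:.3f} to {hi:.3f}")
```

On the probability scale this prior spans roughly 5% to 95%, which is one way to see why it can be described as weakly informative for infection probabilities.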
Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
Data can be made available upon submission of a data request application to the investigators board via the corresponding author or S.S. (silvia.stringhini@hcuge.ch). Data needed for testing the code can be found at https://github.com/HopkinsIDD/serocovpop-households (https://doi.org/10.5281/zenodo.4740044).