Increased mortality in community-tested cases of SARS-CoV-2 lineage B.1.1.7

SARS-CoV-2 lineage B.1.1.7, a variant first detected in the United Kingdom in September 20201, has spread to multiple countries worldwide. Several studies have established that B.1.1.7 is more transmissible than preexisting variants, but have not identified whether it leads to any change in disease severity2. We analyse a dataset linking 2,245,263 positive SARS-CoV-2 community tests and 17,452 COVID-19 deaths in England from 1 September 2020 to 14 February 2021. For 1,146,534 (51%) of these tests, the presence or absence of B.1.1.7 can be identified because of mutations in this lineage preventing PCR amplification of the spike gene target (S gene target failure, SGTF1). Based on 4,945 deaths with known SGTF status, we estimate that the hazard of death associated with SGTF is 55% (95% CI 39–72%) higher after adjustment for age, sex, ethnicity, deprivation, care home residence, local authority of residence and test date. This corresponds to the absolute risk of death for a 55–69-year-old male increasing from 0.6% to 0.9% (95% CI 0.8–1.0%) within 28 days after a positive test in the community. Correcting for misclassification of SGTF and missingness in SGTF status, we estimate a 61% (42–82%) higher hazard of death associated with B.1.1.7. Our analysis suggests that B.1.1.7 is not only more transmissible than preexisting SARS-CoV-2 variants, but may also cause more severe illness.

In those with SGTF status measured, SGTF prevalence was similar in males and females but lower in the older age groups: 59.0% in 1-34-year-olds compared with 55.4% in those aged 85 and older. In keeping with these age patterns, SGTF prevalence was lower in individuals living in a care or nursing home (54.3%) than those in residential accommodation (58.8%). SGTF prevalence by self-identified ethnicity was 58.0% in the White group, 57.6% in the Asian group, 69.6% in the Black group, and 64.8% in the other, mixed, or unknown ethnicity group. SGTF prevalence was lowest in the most deprived index of multiple deprivation 3 (IMD) quintile (53.9%). The highest prevalences of SGTF over the study period were observed in the East of England (77.5%), South East (77.3%) and London (75.4%) regions, and prevalence of SGTF was lowest in the North East and Yorkshire region (41.2%). The prevalence of SGTF increased steeply over time (Fig. 1a), from 5.8% during 1-21 November 2020 to 94.3% during 24 January-14 February 2021.
Missing SGTF status was strongly associated with age and place of residence. The proportion with SGTF status missing was similar in age groups 1-34 (48.3%), 35-54 (47.8%) and 55-69 (48.2%), and then rose to 54.4% in the 70-84 age group and to 77.7% in the 85 and older age group. SGTF status was missing in 87.9% of tests for individuals living in a care or nursing home, compared to 47.4% of tests among individuals in residential accommodation. This is partly due to more extensive use of lateral flow immunoassay tests in care homes, which do not yield an SGTF reading. Missingness in SGTF status also differed substantially by NHS England region, ranging from 21.2% in the North West to 71.1% in the South West, which is largely explained by proximity to a Lighthouse lab capable of producing an SGTF reading (Extended Data Fig. 1). Missingness also depended on specimen date, with the percentage missing being lower for the earlier specimen dates and highest (54.4%) in the 21-day period that contributed the most tests . There were also minor differences in missingness by ethnicity and IMD. Of the 48.9% of tests with missing SGTF status, 5.1% were inconclusive due to high Ct values and the remaining 43.8% were not assessed for SGTF.
19,615 people in the study sample are known to have died (0.87% of 2,245,263). Crude death rates were substantially higher in the elderly and in those living in a care or nursing home (Supplementary Table 1). The standard definition of a COVID-19 death in England is any death occurring within 28 days of an individual's first positive SARS-CoV-2 test; 17,452 of the observed deaths (89.0%) met this criterion (Fig. 1b). Among those with known SGTF status, the crude COVID-19 death rate was 1.86 deaths per 10,000 person-days of follow-up in the SGTF group, versus 1.42 deaths per 10,000 person-days in the non-SGTF group ( Fig. 1c; Extended Data Table 2). Stratifying by broad age groups and by sex, place of residence, ethnicity, IMD, region, and specimen date, death rates within 28 days of a positive SARS-CoV-2 test were higher among SGTF than non-SGTF cases in 98 of the 104 strata assessed (94%; Figs. 1d-i; see also Supplementary Table 2).

Fig. 1. Descriptive analyses. a
The number of samples with and without SGTF by day from 1 November 2020 to 14 February 2021, the period covered by our main analysis. b Number of deaths within 28 days of positive test by specimen date included in the analysis. c Kaplan-Meier plot showing survival (95% confidence intervals) among individuals tested in the community in England with and without SGTF, in the subset with SGTF measured. Inset shows the full y-axis range. d-i Crude death rates (point estimates and 95% CIs) among SGTF versus non-SGTF cases (in the subset with SGTF measured, n = 1,146,534) for deaths within 28 days of positive test stratified by broad age groups and (d) sex, (e) place of residence, (f) ethnicity, (g) index of multiple deprivation, (h) NHS England region, and (i) specimen date. Horizontal bars show the overall crude death rates (95% CIs) by age group irrespective of SGTF status.

Cox regression analyses
To estimate the effect of SGTF on mortality while controlling for observed confounding (Extended Data Fig. 2), we fitted a series of Cox proportional hazards models 4 to the data. We stratified the analysis by lower tier local authority (LTLA) and specimen date to control for geographical and temporal differences in the baseline hazard-for example, due to changes in hospital pressure during the study period-and used spline terms for age and IMD and fixed effects for sex, ethnicity, and residence type. All models were fitted twice, once using complete cases only, i.e. by simply excluding individuals with missing SGTF status, and once using inverse probability weighting (IPW), i.e. accounting for missingness by upweighting individuals whose characteristics-age, sex, IMD, ethnicity, residence type, NHS England region of  is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) residence and sampling week-are underrepresented among complete cases. This analysis assumes that, holding these characteristics constant, whether an individual dies is independent of missingness in SGTF status 5 .
For the complete-cases analysis, the estimated hazard ratio for SGTF was 1.55 (95% CI 1.39-1.72), indicating that the hazard of death in the 28 days following a positive test is 55% (39-72%) higher for SGTF than for non-SGTF cases.
To assess the model assumption of proportional hazards, we added an interaction term between SGTF and time since positive test. There was strong evidence of non-proportionality of hazards (likelihood ratio test ( ! " = 11) = 0.009; Fig. 2a; Extended Data Fig. 3), with the estimated time-varying hazard ratio increasing over time: 1.14 (0.92-1.40) one day after the positive test, 1.58 (1.42-1.75) on day 14, and 2.24 (1.75-2.87) on day 28. Adding higher-order functions of time into the interaction terms did not significantly improve model fit (likelihood ratio test ( ! " = 3.3) = 0.07). We found no evidence that the effect of SGTF varied by age group ethnicity ( ( % " = 1.2) = 0.75), or residence type ( ( " " = 0.33) = 0.85). We note, however, that the relatively small number of deaths among 1-34-year-olds over the study period (44 deaths) does not permit robust assessment of the impact of SGTF in this age group. Other timecovariate interactions suggested that the delay from positive test to death was slightly shorter among females, care home residents, and the elderly; see Supplementary Note 1 for more details on models with interaction terms.
For IPW analysis, a model to predict missingness is required. We evaluated a series of such models, including a cauchit model, which is a robust alternative to logistic regression suitable for IPW 5 . We selected the cauchit model as it fit well and resulted in less extreme weights than other models (Extended Data Fig. 4). The IPW analysis gave similar results to the completecases analysis, yielding a hazard ratio of 1.58 (1.40-1.78). Like the complete-cases analysis, the IPW analysis recovered an increasing hazard with time since positive test, but the increase was less marked (Fig. 2b) and did not significantly differ from zero (Wald test ( ! " = 1.4) = 0.23).

Misclassification analysis
Prior to the emergence of B.  Fig. 5). Again restricting the analysis to specimens from 1 November 2020 onward, we find a hazard ratio associated with pVOC of 1.58 (1.42-1.76) for the completecases analysis and 1.61 (1.42-1.82) for the IPW analysis ( Fig. 2c-d).     is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted March 5, 2021. ; Table 1. Absolute 28-day mortality risk associated with B.1.1.7, as expressed by case fatality ratio (%) among individuals testing positive in the community. The baseline risk (i.e., for preexisting SARS-CoV-2 variants) is derived using linked deaths within 28 days for all individuals testing positive in the community from 1 August -31 October 2020. Adjusted risks are presented for the SGTF analysis for complete cases and for the misclassification-adjusted (pVOC) IPW analysis, which yielded the lowest and highest mortality estimates, respectively, of the main models assessed (Fig. 2a-

Absolute risks
To put these results into context, we calculated absolute mortality risks by applying hazard ratios for SGTF to the baseline risk of death among individuals tested in the community between August-October 2020 (assumed to be representative of the CFR associated with preexisting variants of SARS-CoV-2; Table 1). For the complete-cases analysis, in females aged 70-84, the estimated risk of death within 28 days of a positive SARS-CoV-2 test increases from 2.9% without SGTF to 4.4% with SGTF (95% CI 4.0-4.9%) and for females 85 or older increases from 13% to 19% (17-21%). For males aged 70-84 the risk of death within 28 days increases from 4.7% to 7.2% (6.4-7.9%) and for males 85 or older it increases from 17% to 25% (23-27%). Estimates based on the IPW analysis corrected for misclassification were marginally higher. These estimates reflect a substantial increase in absolute risk amongst older age groups, but the risk of COVID-19 death following a positive test in the community remains below 1% in most individuals younger than 70 years old. Note that these estimates capture the fatality ratio among people tested in the community, and are thus likely to be higher than the infection fatality ratio, as many infected individuals are never tested.

Further investigations
We conducted a number of sensitivity analyses to verify the robustness of our results. Our main results were largely insensitive to: restriction to death-certificate-confirmed COVID-19 deaths only; any follow-up time of 21 days or longer; coarseness of geographical and temporal stratification; use of linear versus spline terms for age and IMD; analysis start date; follow-up time-covariate interactions; removal of the 10-day death registration cutoff; and restriction of the analysis to individuals with a full 28-day follow-up period (Fig. 2e). Generally, the IPW analysis yielded marginally higher hazard ratios, with greater uncertainty. As a further sensitivity analysis, we adjusted for an indicator in Pillar 2 testing data for whether the subject was tested because of symptoms or due to asymptomatic screening. Although we caution that symptomatic screening status may lie on the causal pathway between SGTF status and death, we found that this adjustment had no effect on the relative hazard of SGTF (1.54 [1.39-1.71], complete-cases analysis).

Discussion
We previously found that B.1.1.7 is substantially more transmissible than preexisting SARS-CoV-2 variants, but could not robustly identify any associated change in disease severity using population-level analysis of early data 2 . This analysis of individual-level data, which controls for factors that could confound the association between B.1.1.7 infection and death, reveals an increase in COVID-19 mortality associated with lineage B.1.1.7. We stratify our analyses by test time and geographical location-mimicking matching on these variables-to account for changes in testing rates and changing pressures on hospital services over time and by region.
Our findings are consistent with earlier reports 7 by ourselves and other groups assessing the risk of death among individuals with SGTF. Crucially, our study is limited to individuals tested in the community. Indicators for B.1.1.7 infection are not currently available for most people who die from COVID-19 in England, as they are tested in the hospital rather than in the community and hospitals do not routinely collect genotypic data. However, this restricted focus allows us to capture the combined effect of an altered risk of hospitalisation given a positive test and an altered risk of death given hospitalisation, while only the latter would be measurable in a study of hospitalised patients only. Unfortunately, we were unable to account for vaccination status in this analysis.
We do not identify any mechanism for increased mortality here. B.1.1.7 infections are associated with higher viral concentrations on nasopharyngeal swabs, as measured by Ct values from PCR testing (Extended Data Fig. 6). Higher viral load could therefore be partly responsible for the observed increase in mortality; this could be assessed using a mediation analysis. Alternatively, changes in test-seeking behaviour could, in principle, explain our results. If B.1.1.7 infections were less likely to cause symptoms, but symptomatic B.1.1.7 cases were more severe, then our study could overestimate changes in the infection fatality rate. However, we find no clear difference in SGTF frequency among community tests relative to a random sample of SARS-CoV-2 infections in the population (Extended Data Fig. 7), suggesting that variant-associated changes in test-seeking propensity do not explain our findings.  (4,945 / 57,750) of the individuals who died from COVID-19 over the study period. This is explained by differing mortality rates among individuals who first test positive in a hospital compared to those who are tested in the community, as the former group are much more likely to have severe illness, as well as by missingness in the SGTF data.

Methods
There was a small amount of missing data for sex (n = 14, <0.01%), age (n = 171, <0.01%), and IMD and regional covariates (n = 3,817, 0.16%). There were no missing specimen dates. Individuals with missing age, sex, or geographical location were excluded. We also excluded individuals from the dataset whose age was recorded as zero, as there were 17,913 age-0 individuals compared to 10,132 age-1 individuals in the dataset, suggesting that many of these . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted March 5, 2021. age-0 individuals may have been miscoded. There was some missing data on ethnicity (n = 47,491, 2%) and we created a category that combines missing values with "Other" and "Mixed". Missing values for residence type (n = 63,905, 3%) were also combined with an "Other" category. The full data set used for the main analysis comprises 2,245,263 individuals, with SGTF status missing or inconclusive for 1,098,729 (48.9%). Missing data on the exposure is addressed in the analysis, described below.
We grouped residence types into three categories: Residential, which included the "Residential dwelling (including houses, flats, sheltered accommodation)" and "House in multiple occupancy (HMO)" groups; Care/Nursing home; and Other/Unknown, which included the "Medical facilities (including hospitals and hospices, and mental health)", "No fixed abode", "Other property classifications", "Overseas address", "Prisons, detention centres, secure units", "Residential institution (including residential education)", and "Undetermined" groups, as well as unspecified residence type. We grouped ethnicities into four categories according to the broad categories used in the 2011 UK Census: Asian, which included the "Bangladeshi (Asian or Asian British)", "Chinese (other ethnic group)", "Indian (Asian or Asian British)", "Pakistani (Asian or Asian British)", and "Any other Asian background" groups; Black, which included the "African (Black or Black British)", "Caribbean (Black or Black British)", and "Any other Black background" groups; White, which included the "British (White)", "Irish (White)", and "Any other White background" groups; and Other / Mixed / Unknown, which included the "Any other ethnic group", "White and Asian (Mixed)", "White and Black African (Mixed)", "White and Black Caribbean (Mixed)", "Any other Mixed background", and "Unknown" groups.
Statistical methods -There are several factors that we expect to be associated with both SGTF and with risk of death, thus confounding the association between SGTF and risk of death in those tested. Area of residence and specimen date were expected to be potentially strong confounders. Area of residence is expected to be strongly associated with SGTF status due to different virus variants circulating in different areas, and specimen date because the prevalence of SGTF is known to have greatly increased over time. Area of residence and specimen date are also expected to be associated with risk of death following a test, including due to differential pressure on hospital resources by area and time. The following variables were also identified as potential confounders: sex, age, place of residence (Residential, Care/Nursing home, or Other/Unknown), ethnicity (White, Asian, Black, or Other/Mixed/Unknown), index of multiple deprivation (IMD). The potential confounders are referred to collectively as the covariates. For descriptive analyses, age (in years) was categorised as 1-34, 35-54, 55-69, 70-84, or 85 and older. Descriptive analyses were performed. We tabulated the distribution of the covariates in the whole study sample, the association between each covariate and SGTF status in the subset with SGTF measured, and the association between each covariate and missing data in SGTF status (Extended Data Table 1). The subset with SGTF status measured are referred to as the complete cases. The unadjusted association between SGTF and mortality in the complete cases was assessed using a Kaplan-Meier plot (Fig. 1c), and Kaplan-Meier plots and crude 28day mortality rates are also presented separately according to categories of the covariates . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted March 5, 2021. ; (Extended Data Table 2, Extended Data Fig. 2). Crude overall mortality rates (i.e., not restricted to 28 days after a positive test) were obtained for the whole sample, by SGTF status in the complete cases, and in those with missing SGTF status, according to categories of each covariate (Supplementary Table 1). We also obtained mortality rates by SGTF status (in the complete cases) for categories of each covariate stratified by age group (Fig. 1d-i). Exact Poisson CIs are used for mortality rates, assuming constant rates.
Approximately 49% of individuals in the study sample are missing data on SGTF status, due to their test not being sent to one of the three laboratories utilising the Thermo Fisher TaqPath COVID-19 assay or the test being inconclusive. We performed complete cases analysis, restricted to the subset with SGTF status measured. This complete case analysis assumes that for each analysis, the missing data, in this case missing SGTF status, is independent from the outcome of interest, given the variables included in the models. This is a specific type of missing not at random (MNAR) assumption, as in particular it is allowed to depend on the underlying value of SGTF. We also performed an analysis of the complete cases using inverse probability weights 5 (IPW) to address the missing data on SGTF, under a missing at random (MAR) assumption. In the analysis, each individual with SGTF status measured is weighted by the inverse of their probability of having SGTF status measured based on their covariates. For the IPW, the missingness model estimated the probability of missingness using logistic regression with age (restricted cubic spline), sex, IMD decile (restricted cubic spline), ethnicity, residence type by asymptomatic screening indicator, and NHS region by specimen week as predictors. We also considered a cauchit and a Gosset link for the missingness model, including the same predictors, as this was expected to provide better stability for the weights 5 . The fit of the missingness model was assessed using a Q-Q plot (Extended Data Fig. 4), and Hosmer-Lemeshow and Hinkley tests were used to choose the most appropriate model.
Cox regression 4 was used to estimate the association between SGTF and the hazard of mortality, conditioning on the potential confounders listed above. The analyses described here were applied to the complete cases and using IPW. For IPW analyses, the standard errors (SEs) accounted for the weights, though the fact that the weights were estimated was not accounted for; this results in conservative SEs. The baseline hazard in the Cox model was stratified by both specimen date and LTLA, therefore finely controlling for these variables. The stratification gives a large number of strata matched by specimen date and LTLA. Only those strata that contain individuals who die and individuals who survive contribute to the analysis. The analysis is therefore similar to that which would be performed had we created a matched nested case-control sample. The remaining variables were included as covariates in the model (sex, age, place of residence, ethnicity, IMD decile). Age was included as a restricted cubic spline with 5 knots, and IMD decile was included as a restricted cubic spline with 3 knots. The time origin for the analysis was specimen date and we considered deaths up to 28 days after the specimen date. Individuals who did not die within 28 days were censored at the earlier of 28 days post specimen date and the administrative censoring date, which we chose as the date of the most recent death linkable to SGTF status minus 10 days (i.e., 14 February 2021) in order to minimise any potential bias due to late reporting of deaths. We began by assuming proportionality of hazards for SGTF and the covariates included in the model. The proportional . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted March 5, 2021. ; hazards assumption was assessed by including in the model an interaction between each covariate and time, which was performed separately for SGTF and for each other covariate. Schoenfeld residual plots were also obtained for each covariate (Extended Data Fig. 3). We assessed whether the association between SGTF and the hazard was modified by age, sex, IMD, ethnicity, and place of residence. Models with and without interactions were compared using likelihood ratio tests for the complete cases analyses. For the analysis using IPW we used Wald tests based on robust standard errors 9 .
The analysis assumes that censoring is uninformative, which is plausible as all censoring is administrative.
Misclassification analysis -The exposure of SGTF is subject to misclassification, because a number of minor circulating SARS-CoV-2 lineages in addition to B.1.1.7 are also associated with failure to amplify the spike gene target. Accordingly, a positive test with SGTF is not necessarily indicative of infection with B.1.1.7. A negative test of SGTF is assumed to be indicative of absence of infection with B.1.1.7. Misclassification of an exposure can result in bias in its estimated association with the outcome. We fitted a logistic model to Pillar 2 SGTF frequencies by NHS region to estimate a "background" rate of SGTF in the absence of B.1.1.7, assuming a beta binomial prior. This model is then used to estimate the probability that an individual testing positive with SGTF is infected with B.1.1.7, separately for individuals in each NHS region. These probabilities can then be used in place of the indicator of SGTF exposure in the Cox models. This is the regression calibration approach 6 to correcting for bias due to measurement error in an exposure.
We fitted models accounting for false positives (modelled as regionally-varying background rates of SGTF associated with non-B.1.1.7 variants) to the SGTF data. Our logistic model for B.1.1.7 growth over time is as follows: Here, f(t) is the predicted frequency of B.1.1.7 among positive tests at time t (in days since 1 September 2020) based on the terms slope and intercept; s(t) is the predicted frequency of S gene target failure at time t due to the combination of B.1.1.7 and a background false positive rate falsepos, conc is the "concentration" parameter (= α + β) of a beta distribution with mode s(t); kt is the number of S gene target failures detected at time t; and nt is the total number of tests at time t. All priors above are chosen to be vague, and the truncation of conc to values greater than 2 ensures a unimodal distribution for the proportion of tests that are SGTF. The model above is fitted separately for each NHS England region. Then, pVOC for a test with SGTF . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted March 5, 2021. ; = 1 at time t is equal to ( )/ ( ), and pVOC = 0 for all tests with SGTF = 0. The model was fitted using Markov chain Monte Carlo with 10,000 iterations of burn-in and 5,000 iterations of sampling.
The model above was fitted using the same data source (i.e. SGTF frequencies among Pillar 2 community tests for SARS-CoV-2) as our survival analysis. To verify the robustness of this model, we performed a sensitivity analysis using sequencing data from the COVID-19 UK Genomics Consortium 10 downloaded from the Microreact platform 11 on 11 January 2020 to estimate pVOC. In this alternative analysis we estimated pVOC for each NHS England region and date as the number of samples that were VOC 202012/01 (i.e. lineage B.1.1.7 with mutations ∆69/∆70 and N501Y in Spike) divided by the number of samples that were SGTF (i.e. any lineage with ∆69/∆70, the deletion that causes SGTF) for that NHS England region and date, setting pVOC = 1 for all dates later than 31 December 2020 as there were no sequencing data available past this date, and filling any gaps in the data using linear interpolation. This yielded nearly identical results in our survival analysis compared to using the modelled pVOC described above (Fig. 2e).
Absolute risks -Estimates from the final Cox models were used to obtain estimates of absolute risk of death for 28 and 60 days with SGTF and pVOC. Given the strong influence of age on risk of death, we present absolute risks by sex and age group (1-34, 35-54, 55-69, 70-84, 85+). Absolute risks of death (case fatality rate) within 28 days were estimated by age group and sex using data on individuals tested during August-October 2020; this is referred to as the baseline risk. The absolute risks of death for individuals with SGTF were then estimated as follows. If the baseline absolute risk of death in a given age group is 1 − , then the estimated absolute risk of death with SGTF is 1 − '( , where HR denotes the estimated hazard ratio obtained from the Cox model assuming proportional hazards. Standard errors are obtained via the delta method, and CIs based on normal approximations. Sensitivity analyses -Several sensitivity analyses were performed. After establishing the final model through using the process outlined above, we investigated the impact of using different variables for stratification of the baseline hazard measuring region at a coarser level (UTLA, or NHS England region), as well as coarser test specimen time (week rather than exact date). Adjusting for these variables instead of using stratification was also explored. We also repeated the main analysis restricting data to specimens collected from September onwards, October onwards, November onwards, or December onwards.
To assess the impact of imposing an administrative cutoff to follow-up time of 10 days prior to data extraction, we first reanalysed the data without this cutoff, as well as reanalysing the data restricting the analysis to individuals with at least 28 days' follow-up.
Finally, we adjusted for symptomatic status associated with the test (asymptomatic versus symptomatic), which relates to whether the test was given for asymptomatic screening purposes or on the basis of a request by a (presumed symptomatic) individual, as only symptomatic individuals may request a community SARS-CoV-2 test in England.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted March 5, 2021. ; Extended Data Fig. 1. Missingness in SGTF status and proximity to SGTF-capable Lighthouse lab. The geographical location of the six Lighthouse Labs in the United Kingdom; missingness is higher in the lower-tier local authorities (shaded regions) which are closer to a Lighthouse lab that is not capable of producing an SGTF reading.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted March 5, 2021. ;https://doi.org/10.1101https://doi.org/10. /2021 Extended Data Fig. 2. Kaplan-Meier plots of survival within 60 days of positive test for SGTF versus non-SGTF. Plots are stratified by sex, age group, place of residence, ethnicity, NHS England region, IMD decile (in 5 groups), and specimen date. Note that y-axis ranges differ among panels. These curves show the crude survival within each group (unadjusted for other covariates), and so do not necessarily signify differences in the effect of SGTF on survival for any specific group due to possible confounding. Shaded areas show 95% CIs.  is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted March 5, 2021. Fig. 3. Schoenfeld residuals for survival model by SGTF stratified by LTLA and specimen date. Model uses linear terms for age and IMD and a 28-day followup using complete cases. Schoenfeld residual tests (two-sided) give (a) P = 0.001 for SGTF; (b) P = 0.039 for age; (c) P = 0.101 for sex; (d) P = 0.937 for IMD decile; (e-g) P = 0.969 for ethnicity; (h-i) P = 0.064 for residence type; and P = 0.027 globally. Trend line shows mean and 95% CIs of a loess regression.

Extended Data
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted March 5, 2021. ; Extended Data Fig. 4. Comparison of missingness models. Q-Q plot (left; mean and 95% CIs) and distribution of weights (right) under different missingness models assessed for IPW with (a) a cauchit link, (b) a robit link (Student's t distribution with d.f. = 4), and (c) a logit link. . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted March 5, 2021. ; Extended Data Fig. 5. Misclassification model. For each NHS England region, we fit a betabinomial model (purple, Modelled SGTF) to the observed SGTF frequencies among Pillar 2 tests (black, Observed SGTF), which estimates a constant proportion of "false positive" SGTF samples among non-VOC 202012/01 (i.e., non-B.1.1.7) specimens (orange, Modelled non-VOC SGTF) and a logistically growing proportion of VOC 202012/01 (i.e., B.1.1.7) specimens over time (blue, Modelled VOC). This allows us to model the conditional probability that a specimen with SGTF represents VOC 202012/01 (teal, P(VOC|SGTF)). For our misclassification survival analysis, pVOC = 0 for non-SGTF specimens and pVOC = P(VOC|SGTF) for SGTF specimens. Lines show medians and shaded areas show 95% credible intervals. . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted March 5, 2021 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted March 5, 2021. Extended Data Fig. 7. S gene dropout in community tests relative to a random sample of SARS-CoV-2 infections in the community. Comparison of the proportion of samples with S gene dropout in the Pillar 2 (i.e., community testing) sample used in this analysis compared to Office for National Statistics (ONS) random sampling of the community. This comparison suggests that S gene dropout samples are not overrepresented in testing data relative to the prevalence of S gene dropout in the community, suggesting that the increased hazard of death among positive community tests estimated in this study is not the result of a decrease in the average propensity for test-seeking among individuals infected with B.1.1.7. Point and ranges for ONS data show mean and 95% credible intervals. . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted March 5, 2021. ;https://doi.org/10.1101https://doi.org/10. /2021

Models with interaction terms
In the main text (section "Cox regression analyses"), we briefly describe results obtained from models including interactions between SGTF and other covariates, as well as interactions between other covariates and time since positive test, in our complete-cases analysis.
As stated in the main text, in our analysis of the effect of SGTF on the hazard of death due to COVID-19, we found no evidence that the effect of SGTF varied by age group, sex, IMD, ethnicity, or residence type. In an earlier analysis using data up to 25 January 2021, we did find a marginally significant interaction (likelihood ratio test ( " " = 6.8) = 0.034) between SGTF and residence type indicating a higher hazard of death due to SGTF in care/nursing home residents, but this interaction is no longer present (likelihood ratio test ( " " = 0.33) = 0.85) when analysing the more complete data set up to 14 February 2021.
We also describe in the main text a significant interaction between SGTF and time since positive test (henceforth, "time since positive test" is abbreviated as "time", not to be confused with the date upon which the specimen was taken). In additional analyses, we also found significant interactions between age and time, sex and time, and place of residence and time.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted March 5, 2021. ;https://doi.org/10.1101https://doi.org/10. /2021