Introduction
The SARS-CoV-2 virus, which causes COVID-19, is primarily transmitted through droplets from an infected person, infected surfaces and through small, airborne particles (aerosols) [Reference Oswin, Haddrell, Otero-Fernandez, Mann, Cogan, Hilditch, Tian, Hardy, Hill, Finn, Davidson and Reid1]. Dynamics of transmission are complex, with major differences in transmission rates associated with where contacts were exposed, household size, symptomatic status and symptom type, vaccination status, socioeconomic status, and national public health policy [Reference Li, Xia, Jiang, Liu, Lei, Xia and Wu2–Reference Rahi, Le Pluart, Beaudet, Ismael, Parisey, Poey, Tarhini, Lescure, Yazdanpanah and Deconinck5]. Spread of the virus mainly occurs between people who are in close contact with each other (less than 1 metre apart) and is associated with poorly ventilated settings, crowded indoor spaces and household contact [Reference Leclerc QJ, Knight, Funk and Knight6]. Incubation period for the virus can range from around 2–18 days [Reference Wu, Kang, Guo, Liu, Liu and Liang7] and a person typically becomes infectious around 2 days before symptoms, but this varies according to host susceptibility, vaccination status, and genomic variant [Reference Yang, Gui and Xiong8–Reference Lee, Rozmanowski, Pang, Charlett, Anderson, Hughes, Barnard, Peto, Vipond, Sienkiewicz, Hopkins, Bell, Crook, Gent, Walker, Peto and Eyre10].
After the World Health Organization declared COVID-19 a pandemic in March 2020, guidelines for public health and social measures were issued recommending contact tracing for at least 80% of new cases [11]. Contact tracing is been widely used for preventing and reducing transmission of a number of infectious diseases and data collected from the process can be used to explore epidemiological questions, such as estimating secondary attack rates, routes of transmission, and risk factors for secondary transmission for many diseases [Reference Lipsitch, Cohen, Cooper, Robins, Ma, James, Gopalakrishna, Chew, Tan, Samore, Fisman and Murray12–Reference Allen, Tessier, Turner, Anderson, Blomquist, Simons, Lochen, Jarvis, Groves, Capelastegui, Flannagan, Zaidi, Chen, Rawlinson, Hughes, Chudasama, Nash, Thelwall, Lopez-Bernal, Dabrera, Charlett, Kall and Lamagni14].
The NHS Test and Trace system (NHS T&T) was established in May 2020 in England for contact tracing of COVID-19 cases and their contacts and holds a record of all confirmed cases of COVID-19 (during the period of operation) and their reported contacts, including relevant demographic information and potential exposure events. Routine contact tracing ended in England in February 2022, at which time 15.8 million cases and >31 million contacts had been managed [15].
Although numerous individual studies have sought to understand COVID-19 transmission in specific settings and population groups [Reference Chung, Kim, Kim and Jung16–Reference Petrik and Mease20], few have quantified transmission in the general population at a national level. The unique nature of the NHS T&T dataset presents an opportunity to accurately assess, with adequate power, the impact of clinical and demographic factors surrounding contact events to inform future public health measures for the control of COVID-19. This is particularly important as understanding the characteristics of cases that are associated with transmission events and the features of the events themselves can help determine which measures are most effective in controlling the spread of COVID-19 and provide evidence for their future use.
Methods
Data sources
Contact tracing data
Routine contact tracing took place in England from 28 May 2020 in the form of a national-level system (NHS Test and Trace; NHS T&T). NHS T&T holds records of cases (individuals testing positive for SARS-CoV-2 by PCR subsequently referred for contact tracing) and individuals who they may have been in contact with during their infectious period. Information recorded includes details of activities undertaken in the 7 days prior to symptom onset (or positive test for asymptomatic cases), locations visited and other geographical and demographic information. Individuals who were referred to contact tracing on multiple occasions (as a result of re-infection) were recorded as multiple cases within the dataset. Due to updates made to the data collection for location settings on 23 October 2020, only data from 30 October 2020 to 23 February 2022 (end of routine contact tracing) was included in this analysis.
Potential risk factors for transmission (candidate variables) were determined based on data availability and completeness, as well as those documented elsewhere [Reference Allen, Tessier, Turner, Anderson, Blomquist, Simons, Lochen, Jarvis, Groves, Capelastegui, Flannagan, Zaidi, Chen, Rawlinson, Hughes, Chudasama, Nash, Thelwall, Lopez-Bernal, Dabrera, Charlett, Kall and Lamagni14]. These included demographic characteristics and clinical information for the case, the type of contextual setting where the contact event took place (such as a workplace setting, shop/supermarket, leisure facility, within the household) and proxy measures for behavioural factors, such as the number of times a contact had been named previously during contact tracing (Table 1). Data on both cases and contacts was captured either by self-completion questionnaire or over the phone by NHS T&T staff. For cases who had previously appeared as a case in the dataset (i.e., the new episode represented a reinfection), individuals were assigned a category to indicate whether they had been a case within 90 days of their sampled case record, more than 90 days ago, or both. These categories align with the testing policy at the time of data collection, which recommended no PCR re-testing within 90 days of a positive result. All cases without a symptom onset date recorded were assumed to be asymptomatic.
* N and percentages are not shown for missing data.
a Percentages may not sum due to rounding.
b IMD (Index of Multiple Deprivation) Decile is the official measure of relative deprivation for small areas in England.
The region was allocated according to UKHSA boundaries based on postcode of the contact, which designate seven regions of England with distinct operational responsibilities for health protection [21]. Where contact postcode was not available, postcode of the case was used.
Immunization data
The National Immunization Management Service (NIMS) is a database which holds records of COVID-19 vaccinations administered to individuals [Reference Allen, Tessier, Turner, Anderson, Blomquist, Simons, Lochen, Jarvis, Groves, Capelastegui, Flannagan, Zaidi, Chen, Rawlinson, Hughes, Chudasama, Nash, Thelwall, Lopez-Bernal, Dabrera, Charlett, Kall and Lamagni14], for public health and health service planning purposes. Contact tracing data was linked to data from NIMS using combinations of NHS number, forename, first initial, surname, date of birth and postcode to obtain vaccination status for cases at the time of each contact event.
Index of multiple deprivation
IMD (Index of Multiple Deprivation) Decile is the official measure of relative deprivation for small areas in England. Deprivation decile was assigned according to the postcode of the contact where available, or the exposing case where it was not.
Definitions
Case definition
A case was a laboratory-confirmed case of SARS-CoV-2, as defined in NHS T&T at the time of record. Due to the changing testing requirements in England, this was defined as either: a laboratory PCR (polymerase chain reaction) test, LFD (lateral flow device) test with confirmatory PCR test, or LFD with no negative confirmatory PCR within a specified period.
Contact and contact episode definition
A contact was any individual named by a case as being a ‘close contact’ between two days before the date of the case’s onset of symptoms (or test date, if asymptomatic or not reported) and the date of contact tracing. Each contact event that resulted in contact tracing of an individual taking place was termed a ‘contact episode’ and was linked to the specific reported setting where contact took place.
Secondary attack rate
The secondary attack rate was defined as the proportion of contacts (as defined above) who became cases themselves within a defined time period.
Transmission events
Transmission events were detected by linkage of contacts to a subsequent case record within the system where they were notified to the system as a case. Where a link could not be made, it was assumed no transmission had occurred. A simple, deterministic model was applied to define transmission based on operational public health guidelines at the time of contact tracing: the contact episode must have occurred within the likely incubation period for the linked case (between 2 and 14 days before symptom onset or specimen date, inclusive). Where multiple links were identified, a rules-based approach was taken in identifying the most likely transmission link, with each case having a single link. This methodology is further described in a Technical Briefing published by Public Health England [Reference England22]. Contact episodes occurring in or among residents of care homes were excluded from the analysis prior to sampling.
Sampling and power
Given that the NHS T&T dataset contains over 30 million contact episodes, simple random sampling was used to draw a main dataset of 550 000 of these (defined above). This sample size was based on a desired power of 80%, significance level α of 0.05, a minimum odds ratio (OR) of 1.1 and inflated using a design effect of 2 to account for the most complex hypothesized interaction (age category and deprivation decile) and for 25% missing data. A second random sample of 550 000 episodes (excluding those already sampled) was drawn to test model fit, by predicting the probability of transmission using the final model built using the main initial dataset.
Statistical analysis
As the data was hierarchical in nature, mixed effects (multilevel) logistic regression models were used with random effects of individuals (using a unique identifier applied to all linked records within the dataset) nested within the region of residence, defined a priori. Likelihood ratio testing (LRT) was used to obtain p-values. Adjusted reference categories were selected to align with those used in existing literature [Reference Ekroth, Patrzylas, Turner, Hughes and Anderson23] or the largest group, and ‘prefer not to say’ was treated separately to missing data (i.e., included in the model) due to evidence suggesting that those who opt out of answering demographic questions tend to share characteristics and therefore the data should not be considered missing [Reference Benndorf, Kübler and Normann24, Reference Warner, Kitkowska, Gibbs, Maestre and Blandford25].
All analysis was carried out in R version 4.0 with the following packages: tidyverse, lme4, broom.mixed, DHARMa. The BOBYQA [Reference Powell26] optimiser was specified for lme4 mixed-effects models for computational purposes.
Single variable analysis
Independent associations with transmission were estimated using a mixed effects (multi-level) logistic regression model. Crude associations with transmission were obtained for each variable by adding that variable separately to a null model containing only the random effects. Estimates were exponentiated to produce an OR (with 95% confidence intervals).
The linearity of continuous variables was assessed using a LRT to measure the cumulative effects of adding the square, cube and fourth power of the variable in three comparisons. A LRT was used to compare model fit at each stage. Where non-linearity was indicated, categorical variables were derived and used in place of the continuous variable to aid interpretation and reduce computation time. Age categories were chosen to align with those used elsewhere [Reference Ekroth, Patrzylas, Turner, Hughes and Anderson23]. Categories for other variables were derived from visual inspection of the distribution of the data.
Multivariable mixed effects modelling
All candidate variables highly significant (p < 0.01) in the single variable analysis were considered for inclusion in a multivariable model using a forward-stepwise approach based on hypothesized impact on transmission. Improvement of fit following each variable addition was assessed using a LRT, with variables retained in the model if fit was significantly improved (p < 0.05). The Akaike Information Criterion (AIC) was also examined in each step of model-building as an additional relative measure of improvement in model fit for borderline significant variables. The final model was obtained when all candidate variables and interactions had been tested in the model, at which point final p-values for categories were calculated using a LRT derived from removing each variable from the final model in turn (k − 1). Multicollinearity was checked by comparing the variance inflation factor (VIF) of each independent variable against the dependent variable. Variables with a VIF >10 considered to be highly correlated and the variable with most missing data from a correlated pair was dropped from the model.
Two-way interactions between age/sex variables, age/setting, and symptom/testing route were considered a priori. Each interaction was added separately to the model in addition to the main effects after all individual candidate variables had been tested, and retained if their inclusion corresponded to a significant (p < 0.05) improvement in fit assessed by LRT.
Variables from the single variable analysis which were considered borderline or non-significant at the 1% level were added to the final main effects model in turn, starting with the most complete, to assess whether they improved model fit.
Assessment of fit and model diagnostics
Residuals of the final model were simulated using the DHARMa package [Reference Hartig27] for R to aid interpretation and visually examined using quantile-quantile plots. The distribution of outliers was examined using a one-sample Kolomogorov–Smirnov test. Standardized residuals were plotted against predictors and the within-group residual distributions were visually assessed for uniformity. Model fit was examined using the second random sample dataset and predicting probabilities for transmission for this sample using the final model, comparing this to overall secondary attack rate.
The intraclass correlation coefficient (ICC) was calculated for random effects at a value of 1.0, however random effects were retained in the model regardless of results for discussion purposes.
Supplementary analyses
Two additional analyses were performed to assess (1) the data quality and relative impact of the addition of an ‘episode variant’ variable and (2) the impact of the interval between symptom onset of the exposing case and the contact event on transmission.
Episode variant was defined as the genetic COVID-19 variant of a case assigned by the testing laboratory following genetic sequencing of the case’s results. The aim of this analysis was to assess whether the time variable served as an adequate proxy for transmissibility for circulating variant, as completeness of the sequencing data made its inclusion unlikely.
The time interval (in days) between symptom onset of the exposing case and the contact event was included in the final model (for symptomatic cases only) to assess its impact on transmission.
Results
Study population characteristics
Most of the sampled contact episodes involved a primary case who was symptomatic (n = 470 947; 85.6%), tested via Pillar 2 (community) testing (n = 500 145; 92.3%) and was of White ethnicity (n = 421 495; 79.8%). The primary case was female for 52.6% of sampled contact episodes (n = 288 194) and 63.6% of episodes (n = 349 419) occurred where the case was aged between 19 and 65 years. For around half of contact episodes, the primary case was unvaccinated (n = 270 912; 52.0%).
By month, the highest percentage of recorded episodes occurred in February 2021 (n = 90 730; 16.5%), with successive months showing a decrease over spring and early summer 2021, before increasing again in autumn and winter. In the sampled contact episodes, 99.2% cases and 84.0% contacts completed contact tracing.
Crude secondary attack rates
Overall, 8.7% of contact episodes sampled resulted in the contact latterly becoming a case themselves (Table 1). Among the 550 000 contact episodes sampled, where this information was known, 86.0% of these occurred within a household setting, either as a household member or visitor. The highest secondary attack rate (SAR), 21.3%, was observed among contacts who had been a case at least twice previously (at least once >90 days prior to the contact event and at least once within 90 days of the contact). Higher than average SARs were also observed in household contacts (10.7%), in older age groups (66–79 years, 9.8%; >80 years, 11.2%), those who were contacts of a symptomatic case (9.6%) and those who were contacts during the winter months (range 8.0%–11.3%).
The lowest SAR were observed in April and May 2021 (6.5% and 4.5%), in educational settings (2.8%), healthcare settings (1.5%) and workplaces (3.3%). Personal service (including hairdressers, beauty salons, etc.) settings were also associated with lower-than-average SAR (2.4%). SARs were similar across deprivation decile of residence and ethnic groups.
Multivariable analysis
Model building and diagnostics
Variables described in Table 2 were tested for inclusion in the model as outlined in the methods. Despite initially being non-significant in the single variable analyses, deprivation decile was retained in the model due to significant improvement in model fit. An interaction between symptomatic status and testing pillar also significantly improved fit and was retained in the model.
a aOR = adjusted odds ratio. The model has been adjusted for all variables presented in this table.
b IMD (Index of Multiple Deprivation) Decile is the official measure of relative deprivation for small areas in England.
A total of 439 748 contact episodes with complete data were included in the final model (80.0% of the full sample). The model predicted a mean probability of transmission of 9.8% for the first dataset (used for model building) and 9.6% for the second dataset.
Residual plots were approximately normally distributed but included outliers, which were associated with specific months of the year.
Risk factors for transmission
After adjustment for clinical and personal characteristics (see Table 2), household exposure was estimated to have by far the strongest association with COVID-19 transmission (aOR = 2.64, 95% CI: 1.92–3.63). Of other exposure settings, being a household visitor (aOR = 1.50, CI: 1.09–2.07) or being a contact during travel or commuting (aOR 1.61, CI: 1.08–2.40) were the only settings significantly associated with increased odds of transmission (Table 2). Settings associated with reduced odds of transmission were personal services (hairdressers, beauty salons, etc.) and contact settings where there were ≥3 other recorded contacts in the exposure setting during the contact event of interest (4–9 contacts: aOR = 0.77, CI: 0.75–0.78; ≥10 contacts: aOR = 0.24, CI: 0.23–0.25).
Different times of the year and individual months were associated with higher odds of transmission, in particular the winter months (Table 2). The month with the largest odds of transmission was December (2020: aOR = 1.49, CI: 1.40–1.59; 2021: aOR = 1.34, CI: 1.27–1.42), followed by January (2021: aOR = 1.26).
Being exposed to cases who were older, male and who were unvaccinated were all factors associated with increased odds of transmission (Table 2). With exposing cases aged 19–65 years as the reference group (loosely working age), being exposed to a case aged ≥66 years was associated with significantly increased odds of transmission (66–79 years: aOR = 1.14, 95% CI: 1.09–1.20; ≥80 years: aOR = 1.32, CI: 1.19–1.47). Being a contact of a male case was weakly associated with a higher odds of transmission (aOR = 1.05, CI: 1.03–1.07) and unvaccinated cases were significantly more likely to transmit to contacts than those fully (3 doses) vaccinated (aOR 1.21, 95% CI: 1.17–1.25). No ethnic group was significantly associated with increased odds of transmission.
Contacts who were previously cases themselves were also more likely to become a case again after a contact event, particularly those who had been cases at least twice previously (aOR = 2.39, CI: 2.25–2.54). There was also a substantial increase in transmission odds for those who completed contact tracing (provided complete details into the system) (aOR = 3.19, CI: 3.07–3.31).
Significant effect modification was observed between the testing pillar and the symptomatic status of the exposing case, with the route for testing for the exposing case modifying the effect of the symptomatic status of the case on transmission to their contacts (Table 2). The odds of transmission from a symptomatic case were 71% higher when the exposing case had been tested through community-based testing compared to all other routes (including hospital-based testing and research related testing; aOR = 2.43 vs. 1.40).
Variance components for random effects, the individual identifier variable and the geographical region variable, were of low magnitude (0.00 and 0.03, respectively) but were retained as levels in the model for discussion purposes.
Supplementary analyses
Inclusion of variant in the final mixed effects model did not significantly improve model fit (p > 0.05) and examination of VIFs indicated considerable multicollinearity between time and SARS-CoV-2 variant. Data availability for variant was also low (<25% of cases). As the inclusion of month/year as a time variable significantly improved fit of the model (p < 0.01), variant was not included in the final model.
The timing of the exposure event relative to symptom onset of the exposing case had substantial impact on the odds of onwards transmission (Supplementary Table 2). This variable was categorized due to its non-linearity and to allow convergence of the model, given the number of variables included. Contact events occurring 6 or more days post-symptom onset were associated with the smallest odds of transmission (aOR = 0.54, CI 0.38–0.77). Contact 1–2 days pre-onset and contact 3–5 days post-onset showed the greatest odds for transmission (aOR range = 1.23–1.24, CI range = 1.16–1.33). Full results tables for the supplementary analyses can be found in the Supplementary Material.
Discussion
Despite the volume of accumulated COVID-19 research, there remains little evidence of comparative transmission risk across different exposure settings, and even less with a fully adjusted analysis, considering other potential confounding factors. Here, we have exploited the availability of a large, nationally standardized dataset collected for routine contact tracing to obtain robust, independent estimates of COVID-19 transmission risk for different settings in England from October 2020 to February 2022. These estimates are adjusted for temporal changes (including a proxy effect for changing dominant variant), and account for several important individual-level factors. Crucially, linkage to individual vaccination records has enabled estimates of transmission risk to be adjusted for the immunization status of the exposing case.
After adjustment for clinical and personal characteristics, and consistent with other studies [Reference Madewell, Yang, Longini, Halloran and Dean17], household settings were more strongly associated with transmission than any other modifiable factor, both in the context of exposure between residents and from visitors although household secondary attack rate was lower than reported elsewhere [Reference Madewell, Yang, Longini, Halloran and Dean17]. Throughout the pandemic period covered by this analysis, public health policies were informed by modelling studies and largely focused on the prevention of transmission outside households, particularly for large gatherings and where high-risk behaviour was likely [Reference Zhu, Tan, Wang, Guo, Shi and Li28–Reference Althouse, Wenger, Miller, Scarpino, Allard, Hebert-Dufresne and Hu30]. Existing studies analyze the spread of disease between household units, with considerable evidence of high-rate transmission occurring at specific events [Reference Hamner, Dubbel, Capron, Ross, Jordan, Lee, Lynn, Ball, Narwal, Russell, Patrick and Leibrand31–Reference Pauser, Schwarz, Morgan, Jantsch and Brem33], however, our analysis indicates that the overall contribution of transmission at these settings was likely much lower compared to the extent at which transmission occurred between household members. Whilst it is true that the public health messaging at the time may have reduced risk for non-household settings relative to households, leading to perceived increased ‘risk’ of transmission, results from a recent study quantifying transmission risk using COVID-19 app-based contact tracing suggests households accounted for 40% of total transmissions, yet only 6% of contacts [Reference Ferretti, Wymant, Petrie, Tsallis, Kendall, Ledda, Di Lauro, Fowler, Di Francia, Panovska-Griffiths, Abeler-Dorner, Charalambides, Briers and Fraser34]. These findings have implications for public health policy particularly as the emphasis of non-pharmaceutical interventions globally, largely in the early stages of the pandemic, centred around transmission outside of the home, resulting in border controls, closure of schools and workplaces and many national lockdowns. There may be a place for well-communicated, effective guidance specific to household settings to reduce transmission of COVID-19 and other seasonal respiratory viruses should future large-scale outbreaks arise.
Whilst age and sex of the exposing case impacted transmission risk, possibly due to involvement with personal care or differing contact patterns with others, we did not find ethnicity to be significantly associated. In the UK in particular, ethnic group was considered a risk factor for transmission at various stages of the pandemic, hypothesized to be due to differences in household structures characteristic of some cultures and ethnic groups (larger or multi-generational households) [Reference Mathur, Rentsch, Morton, Hulme, Schultze, MacKenna, Eggo, Bhaskaran, Wong, Williamson, Forbes, Wing, McDonald, Bates, Bacon, Walker, Evans, Inglesby, Mehrkar, Curtis, DeVito, Croker, Drysdale, Cockburn, Parry, Hester, Harper, Douglas, Tomlinson, Evans, Grieve, Harrison, Rowan, Khunti, Chaturvedi, Smeeth, Goldacre and Open35]. A similar finding was not demonstrated here, however, it was not possible to include indicators of household size, property type or occupation of household members. The lack of variance explained at an individual-level may demonstrate the importance of contact event characteristics over individual characteristics, for example, a greater risk of transmission due to closer or more sustained contact within some settings, rather than an intrinsically greater risk of transmission from individual age groups or ethnicities.
Additionally, the increased risk of transmission observed in unvaccinated cases reinforces the role of individual health factors in outbreak control policies. Symptomatic cases were significantly associated with increased transmission to their contacts, which supports the abundance of existing literature in this area [Reference Yang, Gui and Xiong8, Reference Slifka, Messer and Amanna36, Reference Tong, Shi, Zhang and Shi37]. We began to explore this further in our supplementary analysis demonstrating a specific window of time from symptom onset to contact associated with higher transmission risk, building on other work which has used contact tracing data linked to genomic sequencing data to explore the role of specific COVID-19 variants and viral load in transmission [Reference Lee, Rozmanowski, Pang, Charlett, Anderson, Hughes, Barnard, Peto, Vipond, Sienkiewicz, Hopkins, Bell, Crook, Gent, Walker, Peto and Eyre10]. Further work is still needed to explore this in relation to newer and more established COVID-19 genetic variants and their respective viral loads and incubation periods. The interaction between symptomatic status and testing pillar is expected, given pillar 2 testing criteria in England required a case to be symptomatic. Being a contact previously appearing as a case within the dataset, specifically as a case more than 90 days and 90 or less days ago from the contact event, was associated with higher odds of transmission, potentially due to the role of this variable as a proxy indicator for high levels of social mixing or poor immune response to the virus.
Limitations and biases
Data on many factors known to be important for SARS-CoV-2 transmission were not captured during routine contact tracing (such as household size, precise proximity, and duration of exposure). While proxy variables were used where possible, the impact of this is uncertain. The genomic variant of SARS-CoV-2 from the exposing case could not be modelled due to data quality issues, however, inclusion of month-time was an adequate proxy for the comparative transmissibility of variants.
The nature and structure of the dataset introduces bias; notably, those who provided more complete details into the system were more likely to be successfully linked to their own records and transmission identified. Similarly, settings where individuals are familiar to their contacts (households, workplaces) are likely to result in more complete contact tracing information and successful linkage. The deterministic detection of transmission events also prioritized contact occurring in households, meaning if there were multiple plausible contact events for a case, events occurring within a household were preferentially attributed to transmission. However, the determination rules were created based on biological plausibility and likelihood and the number of times a contact had previously been named in the dataset was adjusted for in the analysis.
Conclusion
During the COVID-19 pandemic in England between May 2020 and February 2022, household exposure was the most important risk factor for COVID-19 transmission. Symptomatic status and vaccination status were also highly important. The prominence of the household setting in COVID-19 transmission highlights a clear need for pragmatic, well-communicated guidance on effective ways to reduce transmission within the home, however, the relative insignificance of other exposure settings in the role of transmission is more difficult to interpret. Linking large, routine datasets to generate empirical evidence for disease transmission can provide valuable insights into infectious disease epidemiology which can be used to inform policy and control programmes. Factoring this into the design of such datasets could provide powerful potential to address evidence gaps where very large sample sizes are required.
Supplementary material
The supplementary material for this article can be found at http://doi.org/10.1017/S0950268824001043.
Data availability statement
Analyses were limited to the secondary use of datasets routinely collected as part of the public health function of the UKHSA under regulation 3 of the United Kingdom 2010 health services act and therefore ethical approval was not deemed necessary. For the same reason, data cannot be made publicly available. Applications for requests to access relevant anonymised data included in this study should be submitted to the UKHSA Office for Data Release (https://www.gov.uk/government/publications/accessing-ukhsa-protected-data/accessing-ukhsa-protected-data) and may be subject to legal restrictions.
Acknowledgements
The authors thank Louise Coole, Aryan Nikhab, Chris Nugent, and Simon Packer for their comments and feedback during various stages of this work. We also thank all those members of the UKHSA contact tracing data team for their contributions to the management of contact tracing data.
Author contributions
Data curation: C.R., C.T., C.C.; Methodology: C.R., C.T., G.J.H., N.V., C.C., H.L.M.; Validation: C.R., C.T., N.V.; Writing – review & editing: C.R., C.T., C.A., G.J.H., N.V., C.C., H.L.M.; Formal analysis: C.T., N.V., H.L.M.; Conceptualization: C.A., G.J.H.; Investigation: C.A., G.J.H., C.C., H.L.M.; Supervision: C.A., G.J.H.; Writing – original draft: G.J.H., H.L.M.
Funding statement
The work was entirely funded by the UK Health Security Agency (formerly Public Health England) as part of statutory responsibilities. No external funding was received.
Competing interest
The authors declare no competing interests.