Spatiotemporal droplet dispersion measurements demonstrate face masks reduce risks from singing

COVID-19 has restricted singing in communal worship. We sought to understand variations in droplet transmission and the impact of wearing face masks. Using rapid laser planar imaging, we measured droplets while participants exhaled, said ‘hello’ or ‘snake’, sang a note or ‘Happy Birthday’, with and without surgical face masks. We measured mean velocity magnitude (MVM), time averaged droplet number (TADN) and maximum droplet number (MDN). Multilevel regression models were used. In 20 participants, sound intensity was 71 dB for speaking and 85 dB for singing (p < 0.001). MVM was similar for all tasks with no clear hierarchy between vocal tasks or people and > 85% reduction wearing face masks. Droplet transmission varied widely, particularly for singing. Masks decreased TADN by 99% (p < 0.001) and MDN by 98% (p < 0.001) for singing and 86–97% for other tasks. Masks reduced variance by up to 48%. When wearing a mask, neither singing task transmitted more droplets than exhaling. In conclusion, wide variation exists for droplet production. This significantly reduced when wearing face masks. Singing during religious worship wearing a face mask appears as safe as exhaling or talking. This has implications for UK public health guidance during the COVID-19 pandemic.

www.nature.com/scientificreports/ while 'coarse aerosols' comprising droplets that are < 100 μm when produced, dehydrate, remain suspended for prolonged periods and travel long distances in the air 15 . They are therefore aerodynamically similar to respirable aerosols, particularly once their reduction in size due to evaporation of water on exhalation from the humid respiratory tract is accounted for [14][15][16] . In terms of viral transmission risk, most expiratory droplets are less than 1 μm in diameter, but speech and singing both produce additional particles which peak in size at 3.5-5 μm 12,13,17 . Droplets with initial diameters of less than 1-3 μm are unlikely to contain significant viral load when compared to the coarser sized particles produced during vocalisation 18,19 . A further consideration is the ventilation characteristics of the space. Singing could potentially lead to heavier breathing and greater production of plosive sounds such as "p", both of which are associated with conical, jet-like flows 20 . Even with social distancing, poor ventilation could lead to a high risk of viral infection after as little as 8 min of contact 21 .
In contrast, surgical face masks can successfully block shedding of coronavirus and other seasonal viruses where the droplet particles are more than 5 μm in diameter 22 . Thus, quantifying droplet generation and reducing transmission in the 1-5 μm range is likely to be of high importance. In this paper, the term droplets will be used generically and refer to both respirable and coarse liquid aerosol particles. Droplets may dehydrate to create solid particles.
Several factors make singing higher risk for transmission of SARS-CoV-2 and other airborne viruses compared to normal speech. These include higher frequencies, continuous voicing and more articulated consonants 23 . Mitigation factors have been studied to make singing safer. Echternach et al. investigated dispersion dynamics of aerosols in 10 professional singers and recommended up to 2.5 m social distance to persons in front and 1.5 m to the side to reduce aerosol droplet spread 24 . Loud singing with a face mask reduced the number of aerosol droplets to a level similar to normal talking, although this was not statistically significant 23 . Interestingly, PHE guidelines suggest that the evidence for face coverings in singing is uncertain, but do state 'their use might be considered as additional precautionary mitigation, where this is practicable' 7 . Single use surgical face masks can capture coarse and fine respiratory aerosol droplets of sizes as small as 1-5 μm 22 and it is possible to make reusable face masks with similar efficacy, although the number of layers within a mask also can influence the number of droplets penetrating through the mask and subsequently breaking down into smaller sized droplets 18,25 .
Planar laser imaging (PLI) can capture images of droplet transmission at speeds of up to 3000 frames per second (fps). Previous studies of respiratory droplet transmission have either examined coughing 26 , have been undertaken at low time resolution (such as 125 fps 23 ) or have not demonstrated statistically significant differences between people in the spatio-temporal evolution of droplets when a verbal task is performed 23 . This is obviously important for understanding disease transmission risks. Furthermore, such high-speed imaging techniques have not been exploited to thoroughly investigate the effect of wearing a face mask during singing to the best of our knowledge. We sought to explore all these issues in our study, using PLI with high-speed image capture in a large cohort of volunteers while speaking or singing with and without face masks 27,28 .

Aims
The aims were to investigate the differences in droplet transmission between a variety of vocal tasks including singing, and to offer insights on the effect of wearing a face mask on droplet transmission.
There were four specific objectives:
1. To understand the differences in droplet transmission between different vocal tasks.
2. To examine inter-participant variability, with particular interest in singing.
3. To explore the relative difference in the number of droplets for singing compared to speaking and exhaling when wearing a face mask.
4. Finally, to explore the concept of 'Super Emitters' by analysing whether any participants transmitted many more droplets than others for individual tasks.

Methods
Study population. The CONFESS study (COvid aNd FacE maSkS) was designed as a cross-sectional study to assess the safety of singing in religious worship during the COVID-19 pandemic. Participation was voluntary. Participants were recruited via social media, traditional media outlets including BBC News and targeting of religious groups 29 . Participants gave informed consent and enrolled in the study on the dedicated website (https://www.confess-study.co.uk/) and completed a study questionnaire between September and October 2020. A subgroup of 20 people were invited to participate in experiments at the Mechanical Engineering Department in University College London, London, UK. This group gave additional informed consent and was chosen to represent a wide range of demographics including age, sex, racial background and body habitus. Experiments took place in October 2020 but were curtailed when the second UK-wide lockdown was announced at 3 days' notice 30 .
Experimental design. Participants completed vocal tasks in a specifically designed apparatus (Fig. 1 and Supplementary Fig. 1). All methods were carried out in accordance with relevant guidelines and regulations, including institutional and nationally mandated COVID-19 and laser safety precautions. Vocal tasks were completed without a face mask and subsequently while wearing a type IIR surgical face mask (OPROtec, Hemel Hempstead, UK). In all cases only a single performance of each task was recorded per participant (e.g., a single utterance of 'hello'), due to time constraints. These tasks were: exhaling normally, saying 'hello', saying 'snake', singing a note, and singing the first two lines of 'Happy Birthday'. These words were chosen as they included a combination of voiceless fricatives such as /h/ and /s/, which are consonants produced by forcing air through a channel created by certain movements of the articulators such as lips, tongue, teeth or palate as well as the voiced consonants /p/ and /b/ which are produced with vibration in the vocal folds 26 . To simulate real life, participants were instructed to say the words as if speaking to somebody in a real conversation and to sing in their typical voice. They were specifically told 'to sing in a key and volume you find pleasurable and comfortable'. A sound level meter (Sauter SU130, Balingen, Germany) was placed 300 mm forward of the participant's mouth to capture the far field sound intensity.
Droplet detection and validation. Droplets were detected using a laser planar illuminated imaging technique, PLI (Fig. 1). In brief, participants stood in a laser safe booth with a head brace to minimise movement (Supplementary Fig. 1). A 0.5 mm thick laser sheet was produced in front of the participant transverse to the exhaled air flow direction using a Coherent MX SLM (1 W 514 nm) continuous laser with LASERPULSE light arm and sheet optics (TSI, Shoreview, Minnesota, USA). Two Phantom VEO 710 cameras (1280 × 800 pixels) (Vision Research, Wayne, New Jersey, USA) fitted with NIKKOR 50 mm lenses (Nikon, Tokyo, Japan) were used to capture high-speed photographs of droplets as participants completed the vocal tasks. To avoid the laser shining directly onto the participants' faces, a 25-30 mm gap in front of the mouth was not illuminated (Fig. 1a). Images were captured for 2 s at 3000 frames per second (fps) for all tasks except singing 'Happy Birthday', where the frame rate was reduced to 1000 fps because the longer expected task duration would otherwise have required more data storage capacity than was available. Exposure time was kept constant for all tasks at 0.3 ms.
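The capture settings above imply some simple timing arithmetic, sketched below as an illustration only. The function names are hypothetical and not part of the study's software; the numbers are those quoted in the text.

```python
# Timing arithmetic for the high-speed capture settings quoted in the text.
# The helper names are illustrative assumptions, not the authors' code.

def frame_count(duration_s: float, fps: int) -> int:
    """Number of frames recorded by one camera for one task."""
    return round(duration_s * fps)

def frame_interval_ms(fps: int) -> float:
    """Time between successive frames, in milliseconds."""
    return 1000.0 / fps

EXPOSURE_MS = 0.3  # exposure time, constant for all tasks

frames_fast = frame_count(2.0, 3000)      # 2 s at 3000 fps
interval_fast = frame_interval_ms(3000)   # frame spacing at 3000 fps
interval_slow = frame_interval_ms(1000)   # frame spacing at 1000 fps

# The exposure must fit inside one frame interval at the faster rate.
exposure_fits = EXPOSURE_MS < interval_fast

print(frames_fast, round(interval_fast, 3), interval_slow, exposure_fits)
```

At 3000 fps the frame interval is only slightly longer than the 0.3 ms exposure, which is consistent with the exposure time being held constant across both frame rates.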
We performed a validation to ensure that the laser could illuminate the droplets generated by a nebuliser, which are typically 0.3-4 µm, as measured by an aerodynamic particle sizer 11 . We used an Omron NE-C28P medical compressor nebuliser (Omron, Kyoto, Japan), which produces droplets with a mass median aerodynamic diameter (MMAD) of 3 µm. Our imaging system readily detected the emitted droplets from the nebuliser. Therefore, we are confident we can detect particles of at least 3 µm, i.e. within the respiratory aerosol mode, although a precise lower limit of detection is hard to define 12,23 . For emphasis, our study was not designed to size droplets being emitted; rather we wanted to measure the relative amounts of droplets emitted by participants during the different tasks.

Image analyses.
Once images were captured, they were entered into an in-house detection script written in MATLAB software version 2020a 31 ; this provided positional and spatial concentrations of illuminated droplets. Droplets were imaged in an image window of 170 mm × 110 mm × 0.5 mm, which was fixed for all participants and tasks. In all cases measurements were taken from an analysis window which was a 25 × 100 mm area located 35-60 mm from the mouth. We employed Particle Tracking Velocimetry (PTV) to track individual droplets. We also used INSIGHT 4G Particle Imaging Velocimetry (PIV) software 33 . This measured the distance travelled by an individual droplet across two serial images. PTV accuracy is dependent on a higher number (hundreds) of images containing a significant number of particles. For PIV the concentration of droplets must be higher for valid measurements to be attained.
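As a rough illustration of how a displacement between two serial images becomes a speed, the sketch below converts pixels per frame to metres per second. It assumes, purely for illustration, that the 170 mm image-window width maps onto the full 1280-pixel sensor width; the actual calibration is not given in the text, and the function names are hypothetical. It is not a reimplementation of the PTV or PIV software.

```python
# Convert a droplet displacement between two serial images into a speed.
# ASSUMPTION: the 170 mm window width spans the full 1280-pixel sensor
# width. The real spatial calibration is not stated in the text.

WINDOW_WIDTH_MM = 170.0
SENSOR_WIDTH_PX = 1280
MM_PER_PX = WINDOW_WIDTH_MM / SENSOR_WIDTH_PX

def speed_m_per_s(displacement_px: float, fps: int) -> float:
    """Speed implied by a displacement (in pixels) over one frame interval."""
    displacement_m = displacement_px * MM_PER_PX / 1000.0
    return displacement_m * fps

def displacement_px(speed: float, fps: int) -> float:
    """Inverse: pixels travelled per frame at a given speed in m/s."""
    return speed * 1000.0 / fps / MM_PER_PX

# Illustrative value: a droplet moving at 1 m/s, imaged at 3000 fps.
px_per_frame = displacement_px(1.0, 3000)
print(round(MM_PER_PX, 4), round(px_per_frame, 2))
```

Under this assumed calibration a droplet moving at about 1 m/s shifts by only a few pixels per frame at 3000 fps, which suggests why many consecutive frames are needed for reliable tracking.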
We used the following measures to quantify both the numbers and velocity of droplets detected when participants were performing vocal tasks:
• Mean velocity magnitude (MVM): This provides a measure of the speed of droplets at a fixed point of 30 mm from the mouth, with droplets having transited through a mask if worn, using both PTV and PIV methods during the most significant vocal event, which was defined as the longest sequence of images with the highest total number of droplets. Images were only analysed when droplets were detected for at least 200 continuous frames (approximately 0.067 s) to ensure the PTV tracking software did not generate spurious results.
• Time averaged droplet number (TADN): the sum of droplets over the duration of the task divided by the time taken. Significant portions of each recording showed no droplet transmission; these time portions were removed to improve statistical accuracy. Where minimal droplets were exhaled (e.g. participants 2 and 5 for 'snake' in Fig. 2B), an arbitrary minimum of 500 ms was chosen to ensure the entire task was assessed, and where a face mask was worn with no transmission of droplets, the identical time segment was chosen to compare droplet transmission between wearing and not wearing the mask.
Data analyses. Statistical analyses were performed using R software 4.0.4 34 . We used descriptive statistics for demographic data. Normally distributed data, such as sound intensity, were evaluated using Student's t tests. Figures were created using the ggplot2 package 35 . Data was collected serially over time for each participant both with and without a mask, giving multiple droplet measurements for each subject. As no droplets were observed in a large proportion of individual measurements, the data was heavily skewed and could not be transformed to a normally distributed scale. As a result, the analysis approach used for TADN was to consider the number of droplets as a count variable.
The data was assumed to follow distributions commonly used for count data, specifically the negative binomial distribution. This was preferred to the Poisson distribution, also used for this type of data, due to the large variation in counts.
To allow for the repeat measurements from the same subjects, both over time, and with/without a mask, all analysis was performed using multilevel (mixed) regression models. Two level models were used with individual droplet measurements nested within participants. Specifically, multilevel negative binomial regression was used for the analyses.
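The preference for a negative binomial over a Poisson model can be illustrated with a simple overdispersion check. The counts below are hypothetical (mostly zeros with occasional bursts) and are not study data; the method-of-moments estimate is a swapped-in simplification, not the multilevel regression used in the analysis.

```python
from statistics import mean, variance

# HYPOTHETICAL droplet counts: many zeros with a few large bursts.
# These values are invented for illustration and are not study data.
counts = [0, 0, 0, 0, 0, 0, 0, 0, 12, 35, 0, 0, 3, 0, 0, 60, 0, 0, 0, 1]

m = mean(counts)
v = variance(counts)  # sample variance

# For Poisson data the variance is expected to be close to the mean,
# so a ratio far above 1 indicates overdispersion.
dispersion_ratio = v / m

# Method-of-moments estimate of the negative binomial size parameter k,
# from Var = mean + mean**2 / k. Only defined when v > m.
k = m ** 2 / (v - m)

print(round(dispersion_ratio, 1), round(k, 3))
```

A small value of k corresponds to strong overdispersion, which is the situation described above where counts vary widely and many measurements are zero.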
For TADN, median and inter-quartile range (IQR) were analysed using the Wilcoxon signed-rank test.
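The TADN definition given under Image analyses can be sketched as follows. This is a minimal sketch under stated assumptions: empty time is removed by trimming leading and trailing frames without droplets, the 500 ms minimum is applied as a floor on the duration, and the maximum droplet number (MDN) is taken to be the largest single-frame count. None of these choices is confirmed as the in-house implementation, and the per-frame counts are hypothetical.

```python
def tadn(counts, fps, min_duration_s=0.5):
    """Sum of per-frame droplet counts divided by the active duration.

    ASSUMPTION: 'active' means the span from the first to the last frame
    containing droplets, with a floor of min_duration_s.
    """
    active = [i for i, c in enumerate(counts) if c > 0]
    if not active:
        return 0.0
    n_frames = active[-1] - active[0] + 1
    duration_s = max(n_frames / fps, min_duration_s)
    return sum(counts) / duration_s

def mdn(counts):
    """ASSUMPTION: MDN is the largest droplet count in any single frame."""
    return max(counts) if counts else 0

# HYPOTHETICAL recording at 3000 fps: 1 s of silence, 0.5 s with
# 5 droplets visible in every frame, then 0.5 s of silence.
recording = [0] * 3000 + [5] * 1500 + [0] * 1500

print(tadn(recording, 3000), mdn(recording))
```

Because each droplet is counted in every frame in which it is visible, this measure depends on droplet speed as well as droplet number, so it is best read as a relative rather than an absolute quantity.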
Ethical approval. Ethical approval for the study was obtained from University College London Ethics Committee (Approval: 14223/002).

Results
Demographics. Twenty participants were included in the final analysis; 13 performed a sequence of 4 tasks: exhaling normally, saying 'hello' and saying 'snake' and singing a note. The remaining 7 participants sang the first two lines of the song 'Happy Birthday' . There was inadequate time to perform the other tasks due to the prolonged data storage process after each task and the impending imposition of the second UK lockdown. This necessitated faster turnaround between participants than was ideal. Two participants accidentally knocked the laser apparatus during a single task. We have excluded results from these tasks in our analyses. Participants had a median age of 42.0 (interquartile range (IQR) = 27.0). 14/20 (70%) were female (Supplementary Table 1). Figure 2B illustrates the wide variation in droplet-time profiles of different vocal tasks. Data from all participants are shown in Supplementary Fig. 2. Mean [± standard deviation (SD)] sound intensity for spoken tasks was 71 (± 5.3) dB and 85 (± 7.4) dB for singing (p < 0.001). Mask wearing did not affect sound intensity (p = N.S.). Typically, 'hello' yielded a sharp peak associated with 'h' , while 'snake' had 2 smaller peaks associated with 's' and 'k' . Singing a note led to droplet transmission over a longer period compared to speaking either 'hello' or 'snake' and participants transmitted more droplets. The variation between the participants is seen clearly as is the egress of air in a small number of participants when exhaling or saying 'hello' whilst wearing a mask.

Vocal tasks.
Mean velocity magnitude (MVM). MVM was measured for each participant for each task. Without face masks, median MVMs were similar across all vocal tasks, ranging from 0.47 m/s for singing a note to 0.78 m/s for exhaling (Fig. 3). When wearing masks, for most participants, there was no leakage at all. There was leakage for 2 participants when exhaling and 1 saying 'hello'. In these cases, wearing a mask reduced the MVM by 85%: from 1.28 m/s to 0.19 m/s for exhaling and from 0.91 m/s to 0.14 m/s when saying 'hello'.
Table 1 shows the TADN emitted for each task. Although the data were not normally distributed, the mean was the preferred summary measure as it better represents the total number of droplets than the median. The TADN was lowest for saying 'snake', then exhaling, then saying 'hello' and then singing a note. Of interest, singing 'Happy Birthday' generated a lower TADN than saying 'hello' or singing a note. Figure 4 shows MDNs which demonstrated the same pattern. Wearing a mask reduced TADN by 86-99% for all tasks except exhaling, where droplets were transmitted through the mask in several cases and TADN fell by 53%. The reductions were statistically significant for all tasks (Table 1). As droplets were travelling more slowly and were therefore visible in more frames, the actual number of droplets was probably even lower. Similarly, the MDN was significantly reduced by 86% for saying 'snake', 96% for saying 'hello' and 98% for singing a note. There was a trend to reduction for exhaling (86%) and singing 'Happy Birthday' (94%) which did not reach statistical significance (Fig. 4).
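The 85% figure for MVM can be checked directly from the velocities quoted above. The sketch below is simple arithmetic on those reported values; the function name is illustrative.

```python
def percent_reduction(without_mask: float, with_mask: float) -> float:
    """Relative reduction, in percent, from the unmasked to the masked value."""
    return 100.0 * (without_mask - with_mask) / without_mask

# MVM values in m/s as quoted in the text for the cases with leakage.
exhale = percent_reduction(1.28, 0.19)
hello = percent_reduction(0.91, 0.14)

print(round(exhale), round(hello))
```

Both cases round to the same percentage, which is why a single figure is quoted for the two tasks.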

Inter participant variability.
We examined the inter-participant variability, with particular interest in singing a note as this generated the largest number of droplets. Multilevel regression analyses were used to extract the between-subject variance when a face covering was and was not used. A summary of these components of variance is presented in Table 2. The raw values are not particularly interpretable, with the focus on the relative size of the variation in one situation compared to the other. Inter-subject variance was considerably lower when a mask was worn compared to when a mask was not worn. Face masks effectively abolished the difference in transmission of droplets between high and low emitters.
Relative risk of singing with a face mask compared to speaking or exhaling. In Table 3 each task is compared to a baseline task, along with a corresponding confidence interval. The note task was chosen as the reference. There was a statistically significant difference between droplet numbers for all tasks. The number of droplets was lowest for the note task and highest for exhaling and saying 'hello'. Most importantly, neither singing a note nor singing 'Happy Birthday' transmitted a higher mean droplet number than exhaling or speaking. The decreased likelihood of transmitting droplets when speaking or exhaling when wearing masks is also demonstrated in Fig. 2B.
Are there 'super emitters'? This objective was examined graphically in Fig. 5. The mean number of droplets per participant for each task was calculated. The objective here was to assess whether some participants always transmitted more droplets than others. Data from the 'Happy Birthday' task was not considered as the 7 participants who did this task did not complete the other tasks. Droplets were transmitted relatively consistently between tasks for all participants but three generated dramatically more droplets for singing a note than the others (see also Fig. 2B). These large variations were abolished when participants wore face masks. There was no consistency in which tasks generated the highest numbers of droplets for individual participants.
Correlation with participant physical characteristics. We also assessed whether height, weight, body mass index (BMI) or ethnic background of subjects affected droplet parameters. No consistent findings were observed.

Discussion
Our data offer novel insights into the spatio-temporal evolution of droplets with and without surgical face masks for 20 subjects who completed an array of verbal tasks including singing. Exhaling produced fewer droplets than saying 'hello' but more than saying 'snake', both at 70 dB. Singing a note produced the most droplets whereas singing 'Happy Birthday' produced a similar number to speaking despite singing tasks being undertaken at 85 dB. It has been noted that people speak louder when wearing masks 36 . We did not observe this, but we did show a very striking variation between individuals, with some producing almost no droplets, and others producing large numbers, particularly when singing. There was also no consistency between participants as to which task generated the most droplets. This confirms previous studies which have shown that different people emit more droplets for different tasks and suggests that emission may be highly person and task specific 11,12 . Crucially, we demonstrated a dramatic reduction in droplet transmission when face masks were worn and inter-individual variations were abolished. Significant transmission of droplets through the face masks occurred only with exhaling and saying 'hello'. Finally, when wearing a face mask, fewer droplets were transmitted when singing compared to exhaling or speaking. We did not investigate the size of droplets, nor the impact of evaporation, which have been researched extensively by others 12,13,15,17 .
Our results are consistent with other recent work. Alsved et al. demonstrated in a cohort of 12 volunteers that normal singing produced more droplets compared to speech using a combination of a sampling funnel fitted around the participant's face and light scattering spectroscopy in a subset of 5 participants 23 . In a UK cohort of 25 professional singers, wide variation was found in droplet transmission across multiple vocal tasks 12 . Although singing produced more droplets, differences seen were more likely to be due to volume rather than task. The same phenomenon was reported by others 11,23 . We did not control for this so the higher droplet numbers we showed with singing, which was also louder, may have been due to either of these phenomena. We found no sex-related differences, in line with Gregson et al. 12 . Our results clearly show that inter-individual variation in droplet transmission is very wide. There are at least 50-fold differences in the maximum and time averaged droplet numbers imaged. Duration of droplet production can range from 10-2000 ms. We were unable to identify any specific characteristics that predicted these variations. Nonetheless our findings may significantly influence future modelling of aerosol transmission and infectivity of SARS-CoV-2 37 .
Our findings also concur with observational real-world data regarding the value of face masks. A recent meta-analysis demonstrated that face masks could lead to a large reduction of risk of SARS-CoV-2 disease transmission (adjusted odds ratio 0.15, 95% CI 0.07-0.34) 38 . It makes sense, therefore, that surgical face masks reduce the transmission of coronaviruses 22 . Laboratory based experiments have similarly demonstrated efficacy, often with at least 80% reduction 36,39,40 . Our data additionally demonstrate that wearing a face mask dramatically reduces the inter-individual variability of droplet transmission while speaking and singing. Asadi et al. demonstrated that wearing a face mask would reduce droplet load from so-called 'super-emitters' and offer some protection against disease spread 36 . Our data would strongly support this supposition.
We did not examine the amount of leakage around the sides of the masks. Viola et al. did this and demonstrated significant leakage jets that may present major hazards 39 . But their study examined coughing where droplets were 50 μm in diameter and peak velocity was 8 m/s despite wearing masks. Verma et al. also explored this for coughs and showed only minor escape around the top of masks and at low velocities 40 . For speaking or singing, droplets are smaller 11 and the velocity without masks in our study was less than 1 m/s. When face masks were worn, transmission velocity was less than 10% of this. Even if there was leakage, it would be at a similar slow velocity which would lead to very slow diffusion of the aerosols away from the speaker or singer 37,40 . SARS-CoV-2 becomes less infectious with time, so even if there is significant mask leakage, the slow diffusion rate will decrease the risk of seeding infection, particularly in a place of worship that stringently enforces social distancing with a short service 10 . But it strengthens the case for using masks which can actively kill virus such as those infused with copper or zinc 41,42 .
Wider implications. Consensus opinion has previously been that it would not be safe for singers to rehearse together unless there was a COVID-19 vaccine available and a 95% effective treatment in place 43 . In addition, singing can be an emotive topic so good quality data is needed. Based on the results of this study, we conclude that by wearing face masks, the risks of disease transmission when singing indoors can be reduced to those of sitting quietly or speaking normally. This makes singing no more likely to transmit virus in a communal worship setting. There are benefits of singing, most notably with mental health. An example is the sound of singing in Italian cities, which was used to boost national morale during the first national lockdown 44 .
Combined with other mitigation strategies currently recommended such as social distancing, increasing ventilation and reducing volume, we believe it is reasonable for congregational singing to return to places of worship which have remained open during lockdowns in the UK 7 . Indeed, our questionnaire of 1000 worshippers also demonstrates a longing to return to singing, even if it means wearing a face mask and worshippers all report that their places of worship strongly enforce all the government guidelines including social distancing 10 . Allowing congregational singing indoors could vastly improve congregants' worshipping experiences, and restore 'a sense of celebration' 8 .

Limitations.
Our study has several limitations. Firstly, it was not possible to determine the exact lower limit of particle diameter that can be detected by our apparatus. However, our validation test using a medical nebuliser demonstrated that our method could detect droplets which are on average 3 µm in size. The range from 3-5 µm is the key droplet size for disease transmission and we detected this. In addition, while it is possible for bright spots in our images to be caused by artefact, such as by droplets clustered together, we are confident that the probability of this is low. This is because the maximum number of droplets was approximately 100 within a measurement volume of 1.25 cm³, giving an overall low droplet concentration. Moreover, we manually inspected all images to check for droplet clustering and we employed thresholding to both blob intensity and size during image analysis to further reduce the risk of erroneous results due to signal noise.
Secondly, our study utilised only surgical type IIR face masks, and hence would not be applicable to all face coverings. In particular, some homemade masks are less effective in reducing droplet transmission compared to surgical face masks 25,40,45,46 . Congregants in places of worship may therefore also need to be provided with additional guidance regarding the type of face coverings to wear. Finally, we only assessed individuals singing in a controlled laboratory. There are additional considerations in a real world setting such as in a place of worship which need to be considered, including ventilation and spacing of congregants.
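The measurement volume quoted in the first limitation follows from the analysis window (25 × 100 mm) and the 0.5 mm laser sheet thickness given in the Methods. The sketch below reproduces that arithmetic and the droplet concentration it implies; the variable names are illustrative.

```python
# Analysis window dimensions and laser sheet thickness, from the Methods.
WIDTH_MM = 25.0
HEIGHT_MM = 100.0
SHEET_THICKNESS_MM = 0.5

# 1 cm^3 = 1000 mm^3
volume_cm3 = WIDTH_MM * HEIGHT_MM * SHEET_THICKNESS_MM / 1000.0

# Approximate maximum number of droplets seen in the volume at once.
MAX_DROPLETS = 100
concentration_per_cm3 = MAX_DROPLETS / volume_cm3

print(volume_cm3, concentration_per_cm3)
```

A peak of roughly 80 droplets per cubic centimetre supports the argument that droplets are sparse enough for clustering artefacts to be unlikely.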

Conclusions
Our work explored how the aerosol plume evolves as a function of space and time and looked at the efficiency of masks with vocal activities. Using high-resolution imaging, we demonstrated the wide variation in droplet transmission with different vocal tasks. Face masks eliminate this variation and are efficacious in reducing droplet spread when singing by reducing transmission as well as velocity of droplets when egress occurs. Face masks could potentially be used alongside other COVID-19 mitigation measures to allow for singing indoors. Our results add to the evidence that supports relaxation of guidance regarding indoor singing.