key: cord-0716702-xe163r4t
authors: Abkarian, Manouk; Mendez, Simon; Xue, Nan; Yang, Fan; Stone, Howard A.
title: Speech can produce jet-like transport relevant to asymptomatic spreading of virus
date: 2020-06-18
journal: Proceedings of the National Academy of Sciences of the United States of America
DOI: 10.1073/pnas.2012156117
sha: 967f8f472279da545d772cc12d8d04e0679e743c
doc_id: 716702
cord_uid: xe163r4t

Many scientific reports document that asymptomatic and presymptomatic individuals contribute to the spread of COVID-19, probably during conversations in social interactions. Droplet emission occurs during speech, yet few studies document the flow to provide the transport mechanism. This lack of understanding prevents informed public health guidance for risk reduction and mitigation strategies, e.g. the "six-foot rule". Here we analyze flows during breathing and speaking, including phonetic features, using order-of-magnitude estimates, numerical simulations, and laboratory experiments. We document the spatio-temporal structure of the expelled air flow. Phonetic characteristics of plosive sounds like 'P' lead to enhanced directed transport, including jet-like flows that entrain the surrounding air. We highlight three distinct temporal scaling laws for the transport distance of exhaled material including (i) transport over a short distance (< 0.5 m) in a fraction of a second, with large angular variations due to the complexity of speech, (ii) a longer distance, approximately 1 m, where directed transport is driven by individual vortical puffs corresponding to plosive sounds, and (iii) a distance out to about 2 m, or even further, where sequential plosives in a sentence, corresponding effectively to a train of puffs, create conical, jet-like flows. The latter dictates the long-time transport in a conversation. We believe that this work will inform thinking about the role of ventilation, aerosol transport in disease transmission for humans and other animals, and yield a better understanding of linguistic aerodynamics, i.e., aerophonetics.

The rapid spread of COVID-19, the disease caused by the virus SARS-CoV-2, highlights the lack of guidelines and mitigation strategies for reducing the impact of airborne viruses in the absence of a vaccine. The inherent structural features of the air flows created by exhalation and inhalation during speech or simple breathing could be a potent yet, until recently, unsuspected transport mechanism for pathogen transmission. This important topic surrounding viral transmission has largely been absent from the fluid mechanics and transport phenomena literature, and even absent more generally from quantitative studies of virus transport in the public health realm. We take steps toward quantifying fluid dynamic characteristics of this transmission pathway, which in the case of COVID-19, has been suggested to be associated with asymptomatic and presymptomatic carriers during relatively close social interactions, like breathing, speaking, laughing and singing. We focus on identifying and quantifying the complex flows associated with breathing and speaking; important areas for future research are indicated also. We recognize that much remains to be done, including integrating the findings and ideas here with potential mitigation strategies.

There are many recent news articles reporting on the possibility of virus transmission during everyday social interactions. For example, documented cases include parties at homes, lunches at restaurants [1], side-by-side work in relatively confined spaces [2], choir practice in a small room [3], fitness classes [4], a small number of people in a face-to-face meeting [5], etc. Also, an editorial in the New England Journal of Medicine summarizes differences between SARS-CoV-1, which is primarily transmitted from symptomatic individuals by respiratory droplets after virus replication in the lower respiratory tract, and SARS-CoV-2, for which viral replication and shedding apparently occur most in the upper respiratory tract and do so even for asymptomatic individuals [6]. These differences were suggested to be at least one reason why public health measures that were successful for SARS-CoV-1 have been much less effective for SARS-CoV-2.

Much has been written over many decades about droplet shedding and transport during sneezing and coughing [7] [8] [9] [10] [11] . There remain open questions about the long-range transport of droplet nuclei or aerosols resulting from droplet evaporation [12] , which is important to understand virus transmission from symptomatic individuals in all airborne respiratory diseases. In addition, researchers in the last decades have shown that droplet emission also occurs during speech [8, 9, 13, 14] , yet there are few quantitative studies of the corresponding breathing and speaking flows that provide the transport mechanism for such aerosols. For example, experiments and numerical simulations, based on scale models involving mannequins in rooms, have been used to study droplet transport and potential infection risk, e.g. [15] [16] [17] , including large-scale flow visualization studies of model out-flows [18] [19] [20] and the influence of ventilation strategies [21] .

In this paper, we take first steps towards characterizing the fluid dynamics of speech. For example, questions that motivate our paper include how does an asymptomatic or a presymptomatic individual affect their surroundings by breathing, speaking, laughing or singing? What are the corresponding spatio-temporal features that quantify these changes and how do they affect the transport of exhaled material? Is there a better position or orientation to adopt when in a social interaction at a cafe, party, or workplace to minimize potential risk associated with the exhaled air from a speaker nearby?

We will illustrate that there is a characteristic, time-varying structure to the expelled air associated with conversations. Phonetic characteristics of plosive sounds like 'P' lead to significantly enhanced directed transport, including jet-like flows that entrain the surrounding air. We will show that the transport distance of exhaled material versus time, in the form of three distinct scaling laws, represents the typical structure of the flow, including (i) a short (< 0.5 m) distance, with large angular variations, where the complexity of language is evident and responsible for material transport in a fraction of a second, (ii) a longer distance, out to approximately 1 m, where directed transport is driven by individual vortical puffs corresponding roughly to individual plosive sounds, and (iii) a distance out to about 2 m, or even further, where spoken sentences with plosives, corresponding effectively to a train of puffs, create conical, jet-like flows. The latter dictates the long-time transport in a conversation. Inevitably, there are other complex features, including phonetic structures and the ambient flow, e.g. ventilation, that hopefully will motivate many future studies.

Breathing and speaking are part of our everyday activities. We utilize both our mouth and nose. We focus on the dynamics of in-flow and out-flow from the mouth since we believe that they are more directed towards a potential facing interlocutor, and we show how some of the features change between breathing and speaking, and are influenced by distinct features of speech, with consequences for transport of exhaled material.

The typical human adult has a head with approximate radius 7 cm. We may define the characteristic length scale of the mouth, whose shape is approximately elliptical, with the radius $a$ of a circle having the same surface area. Measurements show that the average mouth opening areas are approximately 1.2 cm$^2$ for breathing and 1.8 cm$^2$ (with peak values of the order of 5.0 cm$^2$) for speaking [22]. For an order-of-magnitude estimate of the Reynolds numbers, $a$ = 1 cm is chosen. It is perhaps surprising to many that typical air flow speeds are $u \approx$ 0.5-2 m/s (volumetric flow rates ≈ 0.2-0.7 L/s) when breathing and $u \approx$ 1-5 m/s (volumetric flow rates ≈ 0.3-1.6 L/s) when speaking; see Table I. When breathing, exhalation and inhalation occur approximately evenly over a cycle with period about 3-5 seconds [22, 29], while during speaking the exhalation period is generally lengthened so that two-thirds or even more than four-fifths of the time may be spent in exhalation.

The local fluid mechanics of exhaled and inhaled flows of speed $u$ are characterized by Reynolds numbers $\mathrm{Re} = 2ua/\nu$ (the kinematic viscosity of air is $\nu \approx 1.5 \times 10^{-5}$ m$^2$/s), which have typical magnitudes $\mathrm{Re} = O(7 \times 10^2 - 3 \times 10^3)$ when breathing and $\mathrm{Re} = O(1 \times 10^3 - 7 \times 10^3)$ when speaking. Inertial effects are expected to dominate these flows, which will also generally be time dependent and turbulent, as discussed below.
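As a quick check of these orders of magnitude, the short sketch below evaluates Re = 2ua/ν for the speed ranges quoted above, with a = 1 cm and ν = 1.5 × 10⁻⁵ m²/s. The function and variable names are illustrative choices, not part of the original study.

```python
# Order-of-magnitude Reynolds numbers, Re = 2*u*a/nu, using values quoted in the text.
NU_AIR = 1.5e-5   # kinematic viscosity of air, m^2/s
A_MOUTH = 0.01    # characteristic mouth radius a, m (the 1 cm estimate)


def reynolds(u, a=A_MOUTH, nu=NU_AIR):
    """Reynolds number based on the mouth diameter 2a and exit speed u (m/s)."""
    return 2.0 * u * a / nu


# Speed ranges (m/s) quoted for breathing and speaking.
SPEED_RANGES = {"breathing": (0.5, 2.0), "speaking": (1.0, 5.0)}

for activity, (u_low, u_high) in SPEED_RANGES.items():
    print(f"{activity}: Re ~ {reynolds(u_low):.0f} to {reynolds(u_high):.0f}")
```

The printed ranges can be compared directly with the magnitudes stated in the text; changing `A_MOUTH` shows how sensitive the estimate is to the assumed mouth opening.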

We characterize first the nature of breathing and blowing flows (Fig. 1) . We set up a laboratory experiment with a laser sheet (1 m × 2 m × 3 mm), where no light hits the speaking subject, who sits adjacent to the sheet. A fog machine generates a mist of microscopic aqueous droplets whose large-scale motions are observed with a high-speed camera oriented perpendicular to the sheet. We obtain the velocity field of exhalation (both during breathing and speaking) by observing how the air stream drags and deforms the cloud in the sheet of light using correlation image velocimetry (see typical images in Fig. 1A and C, with details in Materials and Methods).

The flows are qualitatively similar during breathing or strong blowing (Fig. 1A and C), though the velocity magnitudes can be quite different (Fig. 1B and D). For instance, typical velocities observed in the air flow while breathing with a slightly open mouth (∼ 1 cm × 2 cm) remain of the order of 0.3 m/s to 1 m/s as visible in Fig. 1B (see Movie S1 in Supplementary Information (SI)), while velocities can be as high as a few meters per second in the blowing stream (Fig. 1D) (see Movie S2 in SI). Most significantly, a jet-like, conical structure is visible for the two different situations as depicted by the white lines in Fig. 1A and C, with a cone angle $2\alpha \approx 20^\circ$. We can expect stronger propagation when breathing after exercising, as the volumetric flow rates are increased, which could make breathing in such a case closer to blowing. These observations call for comparison with the more complex situation relevant for pathogen transport, which is the case of speaking, where aerosols are produced during speech [13, 14]. Next, though, we comment on a fundamental asymmetry of exhalation and inhalation.

At these Reynolds numbers, we expect exhalation and inhalation to be asymmetric. A reader may be aware that one extinguishes a candle by blowing, but it is not possible to do so by inhalation (Fig. 1E ), which is a characteristic of the flows for breathing and speaking. Long exhalation should produce starting jet-like flows propagating away from the individual over a significant distance of the order of a meter (e.g. Fig. 1A-D) , while inhalation is more uniform and draws the air inward from all around the mouth (Fig. 1F) ; it is this asymmetry that explains the phenomenon related to extinguishing a candle (Fig. 1E) . These out-flows are in fact responsible for transporting large droplets and aerosols away from the speaker.

For such inertially-dominated flows, a continuous or long out-flow should be similar to an ordinary jet [30], and during the initial instants over a time $T$ the propagation distance, while smaller than the naive estimate $L = uT = O(1)$ m (see below), is still larger than the typical size of the head (e.g. Fig. 1). Moreover, since $L \gg a$, it follows that, in ordinary circumstances, one breathes in little of what is breathed out. Wearing a mask (as recommended as a mitigation strategy for COVID-19) should be expected to produce more symmetric flow patterns during exhalation and inhalation, localizing air flow around the face.

Flows exiting from an orifice are well-known to produce vortices, even in the absence of coughing, and these drive the transport about the head, as evident in Fig. 1. Speaking introduces two further differences: (i) the typical time of inhalation is about 1/4-1/2 of the exhalation time [29] and (ii) language includes rapid pressure and flow rate variations associated with sound production (plosives, fricatives, etc.), as previously characterized acoustically by linguists [26]. We also note that the stop consonants, or what linguists refer to as plosive consonants, such as 'P', 'B', 'K', ..., have been demonstrated recently to produce more droplets [14]. In these cases, the vocal tract is blocked temporarily either with the lips ('P', 'B') or with the tongue tip ('T', 'D') or body ('K', 'G'), so that the pressure builds up slightly and then is released rapidly, producing the characteristic burst of air of these sounds; in contrast, fricatives are produced by partial occlusion impeding but not blocking air flow from the vocal tract [31].

We now visualize the flow during speaking, which appears different from breathing, for instance when saying a sentence like 'We will beat the corona virus', as shown in Fig. 2A (and visible in Movie S3 of SI). A color code illustrates the average speeds (averaged over the time to say the phrase), but note that these are not representative of the true instantaneous velocities, which in the remainder of this section were estimated from the movies in the SI. Over the approximately 2.5 s to say the sentence, the air flow is more jerky and changes direction depending on the sound emitted. In this particular case, the sentence contains starting vowels (in 'We' and 'will') and pulmonic consonants such as fricatives ('V' and 'S' in 'virus') and plosives ('B' and 'K' in 'beat' and 'corona'). Three different directions are revealed when averaging the velocity field over the time to say the sentence in Fig. 2A: 'We will beat' being slightly up and to the front with a typical velocity of about 5-8 cm/s, and 'the corona' being directed downward between −40° and −50° with higher velocities of almost 8-12 cm/s while saying the two syllables 'coro'. Finally, the short air puff associated with 'virus' is directed upward at about 50° with speeds of 5-7 cm/s. We believe that an interlocutor and potential receiver of the exhaled material will be most exposed after a few seconds by the horizontally directed part of the flow, whose velocity reaches, in this case, the ambient circulation speed at about half a meter at most. Next, we illustrate a sentence of the same duration, about 2.5 s, that repeats the starting fricative 'S' many times, as in 'Sing a song of six pence' [25], with only one starting bilabial plosive sound 'P' in the last word: most of the air puffs produced are emitted downward at an estimated angle of −50° from the horizontal (and become visible in this sequence only when the air flow hits a nearby table and crosses the laser sheet, see Fig. 2B and Movie S4 in SI). However, a distinct, directed air puff appears in front of the speaker when 'pence' is pronounced (Fig. 2B), which propagates forward at initially high speeds of about 1.4 m/s as visible in Movie S4, but decelerates rapidly to ≈ 1 m/s at half a meter distance from the mouth; the puff has a speed of 30 cm/s at about 0.8 m (see Movie S4).

FIG. 2: Mean velocity field produced when speaking three different sentences. A color code illustrates the average speeds but note that single images of the magnitude of speeds are not representative of the true instantaneous velocities, which were estimated from the movies in the SI. (A) 'We will beat the corona virus', which is a mixture of vowels, fricatives and plosives. (B) 'Sing a song of six pence' (SSSP) [25], mainly composed of the fricative 'S' except the last word that starts with 'P'. (C) The distance travelled by the extremity of the air puff as a function of time when saying 'pence' at the end of SSSP for three different runs. (D) 'Peter Piper picked a peck' (PPPP) [25], which is mainly composed of many plosives 'P'.

These images of typical speech raise the question of the dynamics of individual puffs. In Fig. 2C we report the distance $L$ travelled by the air puff as a function of time $t$ when pronouncing 'pence'. The data demonstrate that starting plosive sounds like 'P' induce a starting jet flow, which grows initially for very short timescales of under 10-100 ms as $t^{1/2}$, but rapidly transitions to a slower movement characterized by a $t^{1/4}$ response, typical of puffs [32] and vortex rings [33]. In fact, when looking at the flow, a vortex ring stabilizes the transport over a distance of almost a meter. This transition between two different dynamics, ending with the dynamics of an isolated puff, is also measured in coughs [10].

In contrast, when we speak a sentence with many 'P' sounds, such as 'Peter Piper picked a peck' (PPPP) [25], as illustrated in Fig. 2D, the distribution of the average velocity field approaches that of a conical jet with average velocities of tens of cm/s and over long distances of about a meter. Peak velocities are seen at the emission of the sound 'P' with values close to 1.2-1.5 m/s (Movie S5 in SI). This more directed flow situation shares features of breathing and blowing and thus material will be transported faster and further than individual puffs. But, unlike breathing, we believe that this distinct feature of language is more likely to be important for virus transmission since droplet production has been linked to the types of sounds [14].

It should be evident that language is complicated ( Fig. 2A, B) . Given the possibility of asymptomatic transmission of virus by aerosols during speech, we have focused on the phrases in language, those usually containing plosives, that produce directed transport in the form of approximately conical turbulent jets (Fig. 2D , and also see Figs. 4 and 6 below).

In addition, to see that thermal effects are small until the jet speeds are reduced to closer to ambient speeds, we can estimate a Richardson number, which compares buoyancy to inertia,

$$\mathrm{Ri} = \frac{g\,(\Delta\rho/\rho)\,\Delta z}{(\Delta u)^2}.$$

For a 15 °C temperature change in air, $\Delta\rho/\rho \approx 0.05$, so with $\Delta u \approx 0.5$ m/s and a length scale of, say, $\Delta z \approx 0.1$ m (which is relatively large), we find $\mathrm{Ri} \approx 0.2 < 1$. The thermal effects should be expected to be important at longer distances where the jet speed is reduced (usually where the ventilation may also matter) or if a mask is used, which decreases the flow speed substantially.
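The estimate above can be reproduced with a few lines of Python. This is a minimal sketch that assumes the usual bulk form of the Richardson number, Ri = g (Δρ/ρ) Δz / (Δu)², with g = 9.81 m/s²; that form and the function name are assumptions made here because they recover the quoted value Ri ≈ 0.2.

```python
# Bulk Richardson number: ratio of buoyancy to inertial effects (assumed form).
G = 9.81  # gravitational acceleration, m/s^2


def richardson(drho_over_rho, dz, du, g=G):
    """Ri = g * (drho/rho) * dz / du**2 for a density contrast over a height dz."""
    return g * drho_over_rho * dz / du ** 2


# Values quoted in the text: 15 degree C change, du ~ 0.5 m/s, dz ~ 0.1 m.
ri = richardson(drho_over_rho=0.05, dz=0.1, du=0.5)
print(f"Ri ~ {ri:.2f}")

# Slower flow (e.g. far from the mouth): buoyancy becomes comparable to inertia.
print(f"Ri at du = 0.2 m/s ~ {richardson(0.05, 0.1, 0.2):.2f}")
```

Because Ri scales as 1/(Δu)², halving the jet speed quadruples the relative importance of buoyancy, which is why thermal effects are expected to matter mainly at longer distances.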

We document the distinct role of the individual plosives in the phrase 'Peter Piper picked a peck' (PPPP) with the time-lapse images displayed in Fig. 3A (see also Movie S5 in SI). By performing correlation image velocimetry to calculate the vorticity field ω = ∇ ∧ u, where u is the in-plane velocity field, as shown in Fig. 3B, we could follow the vortical structures created by the pronunciation of 'P's in PPPP. Vortices shed from the mouth are clearly visible, interact, and survive downstream where they easily reach the meter scale. The transition from puff-like dynamics associated with single plosives to the development of turbulent jet-like flow during longer sentences seems to be associated with the sequential accumulation of 'puff-packets' pushing air exhaled from the mouth. We will explore this transition in more detail using the numerical simulations below.

To assist with the interpretation of the experimental results just presented, and the numerical results we will report below, for completeness we summarize a few results of well-known mathematical models.

In a high-Reynolds-number steady turbulent jet, it is of interest to characterize the volume flux, linear momentum transport and kinetic energy transported by the jet, as well as the entrainment of the surrounding air that dilutes the jet [34]. These properties also help to understand the fluid dynamics of breathing and speaking. There are at least three significant conclusions that characterize the flow: (i) Denoting the direction of the jet as $x$, the typical axial speed of the jet as $v(x)$, and its cross-sectional area as $A(x)$, in a steady jet issuing into an environment at a constant pressure, the flux of linear momentum is constant, or $v^2 A$ = constant. If the exit flow near the mouth is characterized by a speed $v_0$, volumetric flow rate $Q_0$ and area $A_0$, we conclude that $v(x)/v_0 = (A_0/A(x))^{1/2} < 1$. For a conical jet-like configuration of angle $\alpha$ (Fig. 1), beyond the mouth $A(x) \propto (\alpha x)^2$. (ii) The corresponding volume flux is $Q = vA$, so that the out-flow leads to a volume flux $Q/Q_0 = (A(x)/A_0)^{1/2} > 1$, i.e., there is entrainment of the surrounding air into the jet, which is an important feature of mixing of the surroundings. (iii) Any material expelled from the mouth with concentration $c_0$ is reduced in concentration as the jet evolves, with $c(x)/c_0 = Q_0/Q(x)$. Since the jets are approximately conical, the above results predict that the characteristic quantities vary with distance as

$$\frac{v(x)}{v_0} \propto \frac{a}{\alpha x}, \qquad \frac{Q(x)}{Q_0} \propto \frac{\alpha x}{a}, \qquad \frac{c(x)}{c_0} \propto \frac{a}{\alpha x}. \tag{1}$$

Although these arguments are based on the assumption of a steady jet, we shall now see that they apply approximately to the unsteady features of speaking on the time scale of many cycles and far enough from the mouth or exit of an orifice.
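The three conclusions for a steady conical jet can be checked numerically. The sketch below assumes a circular mouth of radius a, a small half angle so that the jet radius is about αx, and order-one prefactors set to 1; these choices and all names are illustrative assumptions rather than the authors' model.

```python
import math


def steady_jet_ratios(x, a=0.01, alpha_deg=10.0):
    """Ratios v/v0, Q/Q0 and c/c0 at axial distance x (m) for an idealized conical jet.

    Assumes A(x)/A0 = (alpha*x/a)**2 beyond the mouth, clamped to 1 close to it.
    """
    alpha = math.radians(alpha_deg)
    area_ratio = max(1.0, (alpha * x / a) ** 2)   # A(x)/A0
    v_ratio = area_ratio ** -0.5                  # from v**2 * A = constant
    q_ratio = v_ratio * area_ratio                # Q = v*A, grows by entrainment
    c_ratio = 1.0 / q_ratio                       # exhaled material is diluted
    return {"area": area_ratio, "v": v_ratio, "Q": q_ratio, "c": c_ratio}


for x in (0.25, 0.5, 1.0, 2.0):
    r = steady_jet_ratios(x)
    print(f"x = {x:4.2f} m: v/v0 = {r['v']:.3f}, Q/Q0 = {r['Q']:5.1f}, c/c0 = {r['c']:.3f}")
```

In this idealization the speed and the concentration decay together as 1/x, while the entrained volume flux grows linearly with x.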

A jet formed by the sudden injection of momentum out of an orifice is referred to as a starting jet. Such flows reach a self-preserving behavior some distance downstream of the source, where the penetration distance grows over time like $L \propto t^{1/2}$ [32, 35]; see also equation (2) below.

On the other hand, a rapid release of air, or puff, injects a finite linear momentum into the fluid, e.g. Fig. 2. For the inertially dominated flows of interest here, the linear momentum of the puff is conserved, so that the distance travelled is $L \propto t^{1/4}$ [32, 35], similar to interrupted jets, i.e., starting jets when the flow is suddenly stopped. However, during breathing or speaking, the interrupted jet and the puffs are released one after the other and interact with each other in front of the source, as illustrated by Fig. 3. The jet is neither continuous as in starting jets nor isolated as in classical puffs. What, then, are the dynamics of such a "train of puffs"? In the next section, we use numerical simulations to investigate the dynamics of puff trains and quantify their growth in space and time.
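One simple way to tell a starting jet (exponent 1/2) from a puff (exponent 1/4) in measured L(t) data is a straight-line fit in log-log coordinates. The sketch below uses synthetic data with made-up prefactors purely for illustration; it is not the fitting procedure used in the paper.

```python
import math


def fit_exponent(times, distances):
    """Least-squares slope of log(L) versus log(t), i.e. n in L ~ t**n."""
    xs = [math.log(t) for t in times]
    ys = [math.log(d) for d in distances]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Synthetic front positions (hypothetical prefactors, times in seconds).
times = [0.05 * k for k in range(1, 21)]
starting_jet = [0.8 * t ** 0.5 for t in times]
puff = [0.6 * t ** 0.25 for t in times]

print(f"starting jet exponent: {fit_exponent(times, starting_jet):.3f}")
print(f"puff exponent:         {fit_exponent(times, puff):.3f}")
```

With real data the fit window matters: a single fit across the transition between the two regimes would return an intermediate exponent, so each regime should be fitted separately.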

To explore quantitatively the various flows we have introduced above, we report 3-D simulations of the incompressible Navier-Stokes equations (the flow speeds are much smaller than the speed of sound). To highlight the dynamics of breathing and speaking, simulations are driven by representative time-periodic flow rate variations [25] from an elliptical orifice comparable to a large open mouth (of radii 1 cm × 1.5 cm). Speaking produces relatively high-frequency changes to the volume flow rate (or fluid speed) during exhalation, though the variations are much smaller than sound frequencies; we do not study the initial formation of the sounds of speech at the glottis [36] . Furthermore, as we have seen above, natural plosive sounds also create special characteristic features that we investigate. Nevertheless, it has to be stressed that the simulations are a model and lack the phonetic complexity introduced by the tongue and the cavity of the mouth, yielding flows directed in front of the mouth only.

We contrast four situations with comparable period and given volumes exhaled and inhaled, with zero net out-flow over one cycle (Fig. 4A-D): (i) normal breathing with a 4-second period split into intervals of exhalation (2.4 s) and inhalation (1.6 s); (ii) a breathing-like signal but with a (slow) speaking-like distribution of exhalation (2.8 s) and inhalation (1.2 s); (iii) a spoken phrase, 'Sing a song of six pence' [25]; and (iv) a phrase with many plosive sounds, 'Peter Piper picked a peck' [25]. We ran both 1-cycle simulations, using the flow rate profiles over a single period followed by no further out-flow, to quantify a single "atom" of breathing and speaking, and multi-cycle simulations over many periods (or cycles) to understand how the local environment around an individual is established and changes in time. The different cases are listed in Table S1.

The results of simulations of these different flow rate profiles are shown in Fig. 4E-H for an exhaled volume of 0.75 L per breath. To visualize the flow, tracers injected at the in-flow are shown, color-coded by the residence time of the tracers. For every case, a conical jet flow is produced, with similar cone angles as well, which is reminiscent of typical features of turbulent jets studied in laboratory experiments and many applications, e.g. [34, 37]; see also Figs. 1-2. Qualitatively, we observe that breathing produces a jet with an axial flow comparable to speaking, which some may find surprising. Jet lengths in particular are very similar, despite a factor of 2.6 in the peak flow rate of cases P75 and C75 for instance (see Table S1). The phrase with plosives produces qualitatively a rougher jet (Fig. 4G) due to the ejection of vortex rings away from the main jet and vortex interactions. Speaking jets (P75 and S75) yield the largest cone angles and consequently an axial extent somewhat reduced compared to breathing (B75 and C75). Short high-speed puffs associated with speaking thus seem to increase the jet entrainment, but do not enhance the long-range transport in the axial direction.

For all cases, even those with complex phonetic characteristics, we observe that the resulting jets display many of the features of a turbulent jet, which leads to transverse spreading and mixing of the exhaled contents with the environment. These features actually build up over the continual cycles of exhalation and inhalation in both breathing and speaking. Particle residence times (Fig. 4E-H) notably show the progressive formation of the jet. However, a striking feature is the absence of an obvious signature of the flow pulsation in the far field. From the global point of view, all computed jets, whatever the details of the in-flow signal, are similar to steady turbulent jets away from the immediate vicinity of the mouth.

We quantified the cone half angle $\alpha$ by determining the angle inside of which reside 90% of the exhaled tracer particles (Fig. 5A). The included angles differed from case to case, but were of the order of 10-14° (see Table S1). The typical jet lengths, $L(t)$, were also calculated based on the criterion that 90% of the tracers are located upstream of $x = L$ at time $t$. Raw data of $L(t)$ are presented in the SI, Fig. S2, and the jet angles are reported in Table S1. First, higher mean flow rates (exit speeds) produce longer lengths, as expected. For a given exhaled volume per cycle, different types of exhalation produce comparable jet lengths, as suggested by the qualitative analysis of Fig. 4E-H. Modulation of the in-flow signal (cases P and S) systematically tends to increase the lateral growth of the jet, increasing the jet angle and decreasing the jet length.
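The two 90% criteria described above can be expressed as simple quantile computations on tracer positions. The following sketch is one possible reading, assuming the jet axis is the x axis with the mouth at the origin; the tracer cloud is synthetic and all function names are hypothetical, so the printed values say nothing about the actual simulations.

```python
import math
import random


def quantile_index(n, fraction):
    """0-based index of the smallest sorted value that bounds `fraction` of n samples."""
    return max(1, math.ceil(fraction * n - 1e-9)) - 1


def jet_length(xs, fraction=0.9):
    """Axial distance L such that `fraction` of the tracers satisfy x <= L."""
    ordered = sorted(xs)
    return ordered[quantile_index(len(ordered), fraction)]


def cone_half_angle(points, fraction=0.9):
    """Half angle (radians) of the cone about the x axis containing `fraction` of tracers."""
    angles = sorted(math.atan2(math.hypot(y, z), x) for x, y, z in points)
    return angles[quantile_index(len(angles), fraction)]


# Synthetic tracer cloud, illustrative only (not simulation output).
random.seed(0)
tracers = []
for _ in range(5000):
    x = random.uniform(0.05, 1.5)
    spread = 0.1 * x  # lateral spread grows linearly with distance (conical cloud)
    tracers.append((x, random.gauss(0.0, spread), random.gauss(0.0, spread)))

print(f"jet length L   = {jet_length([p[0] for p in tracers]):.2f} m")
print(f"half angle     = {math.degrees(cone_half_angle(tracers)):.1f} degrees")
```

Applying `jet_length` to snapshots at successive times would give a curve L(t) of the kind discussed in the text.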

We ran the multi-cycle simulations over many periods to quantify the development of the transient velocity field. In order to filter the turbulent fluctuations that prevent direct comparisons of the velocity fields as a function of time, we performed time averages over each period (see Fig. 5B) to produce an approximate profile for the distribution of axial speeds in the exhaled jet. In the far field, though time varying, breathing and speaking may be viewed as periodic processes where the time scales are much longer than an individual period. Moreover, we have already explained that inhalation has little effect on exhalation because of the differences expected of high-Reynolds-number motions. Indeed, when we plot the axial speed as a function of axial distance we find that for each period of exhalation, the axial velocity falls along the curve $v(x) \propto x^{-1}$ for both speech and breathing, shown, respectively, in Fig. 5C and D. Not only does the head of the jet evolve as that of a starting jet, but the whole flow downstream of a certain distance from the mouth behaves similarly to a steady turbulent jet. This is particularly striking as the near-mouth flow is laminar and completely different from a steady jet (Fig. 5C-D). Thus, at the Reynolds numbers characteristic of breathing and speaking, a train of puffs transitions to a turbulent, jet-like flow that dominates the transport associated with breathing and speaking.

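For growing jets at constant angle, we can estimate the spreading of the cloud with time. The time $t$ it takes to reach an axial distance $L$ from the orifice, or the mouth, is estimated by

$$\frac{dL}{dt} = v(L) \propto \frac{v_0 a}{\alpha L},$$

or (using $a$ to make the equation non-dimensional)

$$\frac{L}{a} \propto \left(\frac{v_0 t}{\alpha a}\right)^{1/2}. \tag{2}$$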

The scaling from this equation is that expected for starting jets [32]. The theoretical prediction for the length of the exhaled air column for a starting jet (Eq. 2) is then compared to the non-dimensional numerical data in Fig. 5E. The scaling captures quantitatively the trends provided that $v_0$ is defined as the average speed at the orifice exit (mouth) during exhalation. The peak velocity is not relevant: strikingly, the details of the flow rate signal do not impact the scaling, but only influence the spreading angle of the jet. In addition, for 1-cycle simulations (1P75 and 1S50), we recover that the whole exhaled material acts as a unique large puff [32], and $L \propto t^{1/4}$ is obtained, which is consistent with the experiments (Fig. 2C).
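To illustrate how the square-root growth follows from a front speed that decays as 1/L, the sketch below integrates dL/dt = v0 a / (α L) numerically and compares the result with the closed form L = (L0² + 2 v0 a t / α)^(1/2). The order-one prefactor set to 1 and the starting position L0 = a/α (where the cone radius equals the mouth radius) are assumptions made here for illustration only.

```python
import math


def jet_front_closed_form(t, v0, a, alpha, L0):
    """Exact solution of dL/dt = v0*a/(alpha*L) with L(0) = L0."""
    return math.sqrt(L0 ** 2 + 2.0 * v0 * a * t / alpha)


def jet_front_euler(t_end, v0, a, alpha, L0, dt=1e-3):
    """Forward-Euler integration of the same equation, as a numerical cross-check."""
    L = L0
    for _ in range(int(round(t_end / dt))):
        L += dt * v0 * a / (alpha * L)
    return L


v0 = 1.0                    # mean exit speed during exhalation, m/s (illustrative)
a = 0.01                    # mouth radius, m
alpha = math.radians(10.0)  # cone half angle
L0 = a / alpha              # assumed starting position of the front

for t in (1.0, 5.0, 10.0, 30.0):
    exact = jet_front_closed_form(t, v0, a, alpha, L0)
    numeric = jet_front_euler(t, v0, a, alpha, L0)
    print(f"t = {t:5.1f} s: L = {exact:.2f} m (closed form), {numeric:.2f} m (Euler)")
```

At long times the starting position becomes irrelevant and quadrupling the elapsed time doubles the distance, which is the signature of the t^(1/2) law.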

These results allow the quantification of the concentration of exhaled material in the far field. From the previous results, we expect that the concentration field of the exhaled cloud is quasi-steady and falls off with distance, $c(x)/c_0 \propto a/(\alpha x)$. Note that for $a$ = 1 cm, $\alpha$ = 10°, and $L$ = 2 m (the six-foot rule), then for directed jets the concentration of any exhaled material has fallen off by a factor of $a/(\alpha L) \approx 0.03$. Typical dilution levels of 0.04-0.05 have been found in the different simulations at 1.5 m, which is consistent with this estimate. It is evident that this result is not an especially large dilution and the concentration is much larger than might be estimated based on a model of diffusion from a sphere.
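The dilution factor quoted above is easy to reproduce. The snippet below evaluates a/(αL) with the angle converted to radians and the proportionality constant taken as 1, which is an assumption; the function name is illustrative.

```python
import math


def dilution(L, a=0.01, alpha_deg=10.0):
    """Estimated c(L)/c0 = a / (alpha * L), with alpha in radians and unit prefactor."""
    return a / (math.radians(alpha_deg) * L)


for distance in (1.0, 1.5, 2.0):
    print(f"L = {distance:.1f} m: c/c0 ~ {dilution(distance):.3f}")
```

The value at 1.5 m is slightly below the 0.04-0.05 range reported for the simulations, which is within what one would expect from an order-of-magnitude estimate with an unknown prefactor.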

To complement the numerical simulations and to further characterize the propagation of the exhaled jets we placed a laser sheet perpendicular to a speaker (Fig. 6 inset). We measured the time $t$ for the laser sheet to be visibly disturbed when placed a distance $L$ in front of the speaker, who said the sentence 'Peter Piper picked a peck' (PPPP) $N$ times. The data of $L(t)$ (circles), including breathing (diamonds), along with the background flow (crosses) and SSSP (triangles), are shown in Fig. 6. For the plosive phrase, for all $N$, we observe good agreement with the prediction $L \propto t^{1/2}$ (the solid curve) obtained by representing the far field of the out-flow from speech as a steady turbulent jet. We note that the data for SSSP at long times deviate from the theory, perhaps because of intermittency introduced by only an occasional plosive. The prefactor of the fit of 0.48 obtained for the PPPP data together with breathing data compares well to the scaling law given in Eq. 2: considering a mouth on average opened at $a$ = 1-2 cm while saying PPPP, a typical air speed $v_0 \sim$ 1. [...] transport of the puff during spreading and deceleration (e.g. $L$ > 2 m). The ambient flow introduces uncertainty to these experiments of about 20% (Fig. 6). On the other hand, the existence of an ambient flow is ubiquitous and our results can provide a means to estimate the cross-over between speech-dominated transport and ventilation-dominated transport. Though we do not pursue the topic here, the effect of the ambient flow is an interesting and important problem for further investigation.

We believe that this work is one of the first to quantify the fluid dynamics of the environment about the head of a person while breathing or speaking. Some features are relatively easy to understand, such as the natural asymmetry of exhalation and inhalation, which contributes to the "cloud" of exhaled air being continually pushed away as it mixes with the environment. Taken together, our results have identified three typical regions of transport associated with conversations (i.e., a series of sentences) that contain plosives: (i) Less than about 50 cm from the speaker, exhaled material is delivered in a fraction of a second with flows directed upwards (about 40° from the horizontal), downwards (about 40° from the horizontal) and directly in front (especially the bilabial plosives), where the latter regime obeys a $t^{1/2}$ starting-jet power law; (ii) out to about 1 m, longer, though slower, transport occurs driven by individual vortical puffs created by syllables with single plosives, where the time variation follows a $t^{1/4}$ power law; (iii) finally, out to about 2 m, or even further, due to an accumulation of puffs, the exhaled material decelerates to about a few cm/s and becomes susceptible to the ambient circulation (in our ventilated lab). In this last regime, we discovered that the series of puffs, from plosives in a spoken sentence, produces a conical, jet-like flow, again similar to a starting jet, with a $t^{1/2}$ power law.

In the absence of significant ventilation currents, or air motions driven by other speakers, we have seen that the exhaled cloud will often be largely in front of the speaker, with a modest angle as shown in this paper. The "puffs" associated with individual breaths or sounds have distinct dynamics: in the very early-time formation phase the distance scales with $t^{1/2}$, after which the puff advances a downstream distance that varies with $t^{1/4}$; these dynamics are common to starting jets of all types (e.g. [32]), including coughs [10]. However, speech is similar to a train of puffs, effectively generating a continuous turbulent jet, which mimics many of the features of exhalation in breathing and speaking, where the local exhaled cloud increases in size approximately as $L \propto (v_0 t)^{1/2}$; both longer times and larger flow velocity (or increased breathed volume in the case of exercise, for instance) increase the affected environmental volume. Moreover, the droplet emission rate (number of droplets per time) increases with louder speech [13]. With social situations in mind, in hindsight, it should perhaps not be surprising that droplet and aerosol generation, and possible virus transmission, are enhanced during rapid and excited speech during parties, singing events, etc. [3, 4]. The results presented in this paper do not account for some real features, e.g. movement of the head or trunk of the speaker and the influence of background motions of the air due to ventilation. There is obviously much to be done to quantify the many details and nuances, especially as the different sounds in speech produce vortical structures of different strengths that influence the spread (axial and transverse) of the exhaled jet.
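The deceleration implied by these power laws can be made explicit with a short worked step. The general relation below follows from differentiating any power law $L = K t^{n}$. The numerical evaluation is only a rough consistency check: it assumes that the front reaches 2 m at about 30 s, a pair of values borrowed from the discussion of dilution and treated here as a point on the $L(t)$ curve.

```latex
\begin{align}
  L(t) = K\,t^{n}
  \quad\Longrightarrow\quad
  \frac{dL}{dt} &= n K\,t^{n-1} = n\,\frac{L}{t}, \\
  % jet-like regime, n = 1/2, with the assumed point (L, t) = (2 m, 30 s)
  \left.\frac{dL}{dt}\right|_{n=1/2} &\approx \frac{1}{2}\cdot\frac{2\ \mathrm{m}}{30\ \mathrm{s}}
  \approx 3\ \mathrm{cm/s}.
\end{align}
```

Under that assumption the front speed is a few cm/s, the same order as the deceleration quoted for the far region, and for a given $L/t$ the puff regime ($n = 1/4$) advances half as fast as the jet regime ($n = 1/2$).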

The authors are not trained in public health, nor do we have professional standing in the public health arena, so we should be cautious in the conclusions drawn from our results regarding social distancing guidelines. Nevertheless, there are general results that can be extracted from this work. Our results show that airflow speeds at distances of 1-2 m from a speaker are typically tens of centimeters per second. This means that the ambient air current may be dominant at such distances from a speaker, which makes the definition of guidelines difficult. When thinking about quantitative features to discuss social distancing guidelines (six feet, approximately 2 m, in the United States or 1 m in the World Health Organization's interim guidance published on June 5, 2020 [38]), both spatial and temporal characteristics matter, e.g., during conversations, the time spent in front of a speaker and the distance from the speaker are needed to define an estimate for the dose of virus received; the dose is proportional to the concentration of aerosol at that distance. Based on the experimental and numerical results reported in this paper, exhaled materials reach 0.5-1 m in a second during normal breathing and speaking, and in fractions of a second in the case of plosive consonants (Figs. 1-3). If one is directly in the path of the speaker, then at 2 m and within about 30 seconds, the exhaled materials are diluted to about 3% of their initial value. However, more extended discussions, and meetings in confined spaces, mean that the local environment will potentially contain exhaled air over a significantly longer distance. It follows that in conversations longer than 30 seconds it is better, in our opinion but based on the results in this paper, to move beyond 2 m of separation; standing to the side of a speaker, e.g., outside of a cone of 40-50 degrees (half angle), further reduces possible inhaled aerosol.
Most significantly, our results illustrate that 2 m, or six feet, does not represent a "wall", but rather that behavior can help minimize risk by increasing separation distances and adjusting relative position for longer conversations when masks are not used.
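The geometric part of this suggestion, standing outside a cone of 40-50 degrees half angle, can be written as a simple check. The sketch below is purely illustrative: it assumes a speaker at the origin facing the +x direction, a listener in the same horizontal plane, and a default half angle of 45 degrees chosen as the midpoint of the quoted range. The function names are hypothetical, and the check says nothing about actual exposure.

```python
import math

def off_axis_angle_deg(x, y):
    """Angle (degrees) between the speaker's facing direction (+x) and a listener at (x, y)."""
    return math.degrees(math.atan2(abs(y), x))

def inside_cone(x, y, half_angle_deg=45.0):
    """True if the listener at (x, y), in meters, lies within the assumed cone in front of the speaker."""
    return off_axis_angle_deg(x, y) <= half_angle_deg

# Listener directly in front at 2 m, and listener 1 m ahead but 2 m to the side.
for position in [(2.0, 0.0), (1.0, 2.0)]:
    separation = math.hypot(*position)
    print(position, f"separation = {separation:.2f} m", "inside cone:", inside_cone(*position))
```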

We have provided a quantitative framework to describe a fundamental mechanism of transport that can be generalized to many pathogens. Obviously, much remains to be done to understand the fluid mechanics associated with simple human activities, i.e., breathing and speaking. Similar ideas apply to other animals, though the scales differ between a bat, a bird, or a cow. Furthermore, many pathogens might have adapted to use the respiratory systems of humans and other mammals as an efficient transport mechanism. Our work will help to better understand virus transmission in mammals, which can have catastrophic consequences in nature or affect the food supply. Building on the understanding of the fluid dynamics of viral and pathogen transmission, we believe it will be possible to design potential mitigation strategies, in addition to masks and vague social distancing rules, and to link to poorly understood issues of viral dose [39], to better manage societal interactions prior to the introduction of a vaccine. We invite researchers to combine the full aerodynamics of sound production, including the different phonetic characteristics, and even sound generation in animals, with droplet formation from saliva and mucus to better understand and describe how airborne pathogen biology is adapted for this mode of transport and transmission.

Short-range airborne transmission of expiratory droplets between two people

Proceedings of Interspeech

The Oxford Handbook of Innovation

Elements of Human Voice

Turbulent Jets and Plumes -A Lagrangian Approach

Advice on the use of masks in the context of COVID-19: Interim guidance

Springer Series: Lecture Notes in Applied and Computational Mechanics

Large Eddy Simulation for Incompressible Flows: An Introduction

We thank the NSF for support via the RAPID grant CBET 2029370 (Program Manager is Ron Joslin). M.A. thanks the IRN "Physics of Living Systems" (CNRS/INSERM) for travel support, as well as K. Meersohn for pointing out the importance of plosives in almost all languages of the world. S.M. thanks V. Moureau and G. Lartigue (CORIA, UMR 6614) and the SUCCESS scientific group for providing YALES2, which served as a basis for the development of YALES2BIO. Simulations with YALES2BIO were performed using HPC resources from GENCI-CINES (Grant No. A006 and A0080307194) and from the platform MESO@LR. S.M. acknowledges the LabEx Numev (convention ANR-10-LABX-0020) for support for the development of YALES2BIO. We thank A. Smits for loaning the fog machine and P. Bourrianne and J. Nunes for help measuring flow rates during breathing.

Due to difficulties imposed by the pandemic, only one subject could enter the lab and participate in the experiments. The subject volunteered for the study, is male and 44 years old, with no known physical conditions. The study was approved by the Princeton University IRB (protocol # 12834). The subject provided informed consent.

In the laboratory experiments, a point-wise laser light (wavelength λ = 532 nm, 1 W power, DPSS DMPV-532-1, Del Mar Photonics) passes through a concave cylindrical lens (focal length F = −3.91 mm) and spreads to form a laser sheet about 2 m in length and 1 m in height. The mean thickness of the laser sheet is approximately 3 mm. To maintain safe use, the laser light shines from above so that no light hits the speaker, who sat adjacent to the sheet. Laser safety glasses were worn by the speaker. The flow is seeded by a fog machine (Mister Kool by American DJ), which uses a water-based juice (Swamp Juice by Froggys Fog) and generates droplets with diameters of about one micrometer. The fog can last for tens of minutes and no notable sedimentation of the droplets is observed throughout the course of the experiments. Therefore, the droplets can track the local flow, effectively as passive tracers. Images are captured via a high-speed camera (v7.3, Phantom) with frame rate f = 300 fps (frames per second). However, we note that there is inevitable background flow in the experiments due to the droplet emission by the fog machine, as well as the natural ventilation in the room. Specifically, the background flow is of order 1 cm/s and moves from the left to the right in the experiments reported in the main text (e.g., Fig. 1) and only slightly enhances the propagation of the jets. Although we do not pursue it here, the effect of the background flow due to ventilation on the transport of the out-flows from breathing and speaking is an interesting and fundamental problem for future investigations. A similar setup is used when speaking a distance L in front of a laser sheet to determine the axial structure of the out-flows, e.g., the measurement presented in Fig. 6. The laser sheet is perpendicular to the flow and the camera is perpendicular to the laser sheet.
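The statement that micrometer-sized fog droplets act as passive tracers can be checked with an order-of-magnitude estimate of their settling speed. The sketch below uses the Stokes drag law for a small sphere; the droplet density (that of water), the air density and viscosity at room conditions, and the neglect of slip corrections are assumptions of this example rather than values reported for the experiments.

```python
def stokes_settling_speed(diameter_m, rho_droplet=1000.0, rho_air=1.2,
                          mu_air=1.8e-5, g=9.81):
    """Terminal settling speed (m/s) of a small sphere in Stokes flow.

    Default property values are assumed room-condition values, not measured ones.
    """
    return (rho_droplet - rho_air) * g * diameter_m**2 / (18.0 * mu_air)

v_settle = stokes_settling_speed(1e-6)   # droplet diameter of about one micrometer
background_flow = 1e-2                    # background flow of order 1 cm/s, in m/s

print(f"settling speed ~ {v_settle:.1e} m/s")
print(f"ratio to background flow ~ {v_settle / background_flow:.1e}")
print(f"fall distance over 20 minutes ~ {100 * v_settle * 20 * 60:.1f} cm")
```

With these assumed values the settling speed is a few hundredths of a millimeter per second, several hundred times smaller than the background flow, which is consistent with the absence of notable sedimentation over tens of minutes.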

In order to quantify the structure of the jets from breathing and speaking, the seeded image sequences captured on video are processed using PIVlab [40]. The cross-correlation method is applied to the image sequences to measure the local velocities in the particle image velocimetry (PIV) analyses. Square interrogation windows of 16 pixels × 16 pixels (approximately 2 cm × 2 cm) with an overlap step of 50% (8 pixels, 1 cm) are used to obtain the velocities, e.g. those presented in Fig. 1B.
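The idea behind the cross-correlation step can be sketched for a single interrogation window. The example below is not PIVlab and does not reproduce its algorithm (it has no sub-pixel peak fitting, window deformation, or multi-pass refinement); it is a minimal NumPy illustration with a synthetic window. It assumes that consecutive frames at 300 fps are correlated directly and that 16 pixels correspond to 2 cm, and the function name is hypothetical.

```python
import numpy as np

CM_PER_PIXEL = 2.0 / 16.0   # 16-pixel window is approximately 2 cm wide
FPS = 300.0                 # camera frame rate

def window_displacement(win_a, win_b):
    """Integer-pixel shift (dy, dx) of win_b relative to win_a from circular cross-correlation."""
    a = win_a - win_a.mean()
    b = win_b - win_b.mean()
    # Correlation theorem: the peak of ifft(conj(A) * B) sits at the displacement.
    corr = np.fft.ifft2(np.conj(np.fft.fft2(a)) * np.fft.fft2(b)).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Map peak indices from [0, n) to signed shifts in (-n/2, n/2].
    return tuple(int(p) if p <= n // 2 else int(p) - n for p, n in zip(peak, corr.shape))

# Synthetic window: random "particle" pattern shifted by a known amount.
rng = np.random.default_rng(1)
frame_a = rng.random((16, 16))
frame_b = np.roll(frame_a, shift=(2, -3), axis=(0, 1))

dy, dx = window_displacement(frame_a, frame_b)
vy, vx = dy * CM_PER_PIXEL * FPS, dx * CM_PER_PIXEL * FPS
print(f"shift = ({dy}, {dx}) px, velocity = ({vy:.1f}, {vx:.1f}) cm/s")
```

With this assumed pixel scale, a one-pixel shift between consecutive frames corresponds to 37.5 cm/s, so speeds of order 1 m/s produce shifts of only a few pixels within a 16-pixel window.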

The computations are performed with the in-house flow solver YALES2BIO [41-45] (https://imag.umontpellier.fr/~yales2bio/). These are large eddy simulations [46], which are well suited to study transport in turbulent flows, in particular in the context of speech production [35]. In addition, they are well adapted to intermittent/transitional regimes [41, 42]. The spatially filtered, incompressible form of the Navier-Stokes equations is solved. The so-called sigma model [47] is used to treat the effect of the numerically unresolved scales on the resolved scales. Particles are injected into the flow to characterize the jets issuing from the orifice (mouth). They are perfect Lagrangian tracers displaced at the local fluid velocity, and do not affect the flow. In the simulations, buoyancy effects are not considered; the temperature, density and dynamic viscosity are constant. The geometry of the model of the mouth remains constant over time and does not depend on the type of in-flow signal (breathing or speaking). The mouth opening is an ellipse of semi-axes 1.0 cm and 1.5 cm, which corresponds to the upper limit of the range of mouth surface area observed during speaking [22]. Simulations are performed with different flow rate signals at the in-flow, as detailed in Fig. 4. The in-flow signal is perfectly periodic with a fixed cycle duration of 4.0 s for all cases reported in this paper. More details about the physical model, the numerics and the simulations are provided in the SI. Note that we report simulations of turbulent transient flows. Only ensemble averaging could yield results specific to each case and quantify small differences. However, we use simulations to establish trends which are common to the different cases. In the SI, the question of the reproducibility of the results and the influence of the definition of jet characteristics are