Ionized carbon as a tracer of the assembly of interstellar clouds

Molecular hydrogen clouds are a key component of the interstellar medium because they are the birthplaces for stars. They are embedded in atomic gas that pervades the interstellar space. However, the details of how molecular clouds assemble from and interact with the atomic gas are still largely unknown. As a result of new observations of the 158 μm line of ionized carbon [CII] in the Cygnus region within the FEEDBACK program on SOFIA (Stratospheric Observatory for Infrared Astronomy), we present compelling evidence that [CII] unveils dynamic interactions between cloud ensembles. This process is neither a head-on collision of fully molecular clouds nor a gentle merging of only atomic clouds. Moreover, we demonstrate that the dense molecular clouds associated with the DR21 and W75N star-forming regions and a cloud at higher velocity are embedded in atomic gas, and all components interact over a large range of velocities (roughly 20 km s−1). The atomic gas has a density of around 100 cm−3 and a temperature of roughly 100 K. We conclude that the [CII] 158 μm line is an excellent tracer to witness the processes involved in cloud interactions and anticipate further detections of this phenomenon in other regions. Mapping the 158 μm line of ionized carbon within the Cygnus region with the SOFIA observatory provides evidence for dynamic interactions between molecular clouds and their atomic envelopes, which trace out the assembly process of cloud complexes.

Molecular clouds are a crucial component of the interstellar medium (ISM) of galaxies as they are the birth sites of stars and planetary systems.However, the processes by which these clouds are assembled from the large atomic hydrogen (H I) reservoir in galaxies is still not well understood.Some models are based on a subtle equilibrium between gravity, turbulence and magnetic fields [e.g.1].An external increase of pressure or turbulence due to stellar feedback or spiral arm density waves then randomly triggers a quasi-static, slow build-up of density, leading to the formation of pockets of gas of molecular hydrogen (H 2 ).Other models [e.g . 2] propose that cloud formation is more dynamic and driven by large scale motions in the galaxy, but still closely linked to the local transition from 1 arXiv:2302.09266v1[astro-ph.GA] 18 Feb 2023 warm (T∼5000 K), tenuous, mostly atomic gas to dense, cooler (T 100 K), partly molecular gas.In this simple two-phase model of the ISM, only the warm and cold neutral medium (WNM and CNM, respectively) are thermally stable.Gas at intermediate temperatures is not in equilibrium and, depending on its density, will either cool down and become denser and fully molecular or heat up to join the WNM.In addition, stellar feedback effects such as radiation, winds, and supernova explosions generate turbulence and complicate the picture.It is thus challenging to find the right observational tracers for both, the dynamic interaction between gas flows and the thermal and chemical transitions between WNM and CNM.
In simulations, dynamic cloud formation scenarios are idealized by low velocity ( 10 km s −1 ) converging flows [e.g.3,4,5,6] which convert diffuse H I gas into dense H 2 gas.A recent study [7] showed that only flows with hydrogen densities 100 cm −3 that collide with velocities 20 km s −1 manage to build up massive structures in which stellar proto-clusters can form.In models with even higher density, the gas flows are already molecular before they collide and are then referred to "cloud-cloud collisions" (CCCs) [8,9,10].Observations with velocities 20 km −1 are reported in [11,12].However, these different scenarios result in contrasting observational predictions.Colliding H I flow models [6] anticipate many velocity components in the lines of ionized carbon ([C II]) and much less in the rotational transitions of carbon monoxide (CO).[C II] emission has its origin in the atomic gas and from non-thermal contributions of multiple molecular clump surfaces at different velocities along the line-of-sight, while CO only arises from the molecular component.Cloud-cloud collision simulations [8] produce two main molecular velocity components visible in CO, with a bridge of emission in velocity space in between the two components.
[C II] emission stems mostly from the envelope of the molecular cloud and the surrounding ambient ISM gas that does not participate in the collision [9].
How can these differing views be confronted with observations?The 21cm line of H I can be observed in emission and absorption, but it mostly fills the interstellar space so that velocity information is highly blurred.CO lines are used as a proxy for H 2 in dense, fully molecular clouds.However, because H 2 self-shields more effectively from ultraviolet (UV) photodissociation than CO, there is a gas component that is mostly dark in CO, but bright in H 2 [13].Fortunately, the [C II] fine-structure line at 158 µm is perfectly suited to determine the physical conditions in atomic and CO-dark molecular gas [14,15,16] and is hence an excellent observational tracer for molecular cloud formation frameworks.
We here present observations in the [C II] 158 µm and CO 1→0 spectral lines of Cygnus X, a region with a coherent network of molecular clouds [17], extending over ∼100-200 pc.It includes the massive star-forming regions DR21 and W75N at a distance of 1.5±0.1 and 1.3±0.1 kpc, respectively, determined with maser parallax measurements [18], and the rich Cyg OB2 association with 169 OB stars [19].Massive star formation occurs in parts of the clouds, for example, the young stellar outflow source DR21, the cluster-forming site DR21(OH), and the cluster of early type B stars in W75N [20].Cygnus X is not exceptional in terms of mass [21], but for its stellar content.Using 12 CO 1→0 observations, it was suggested [22,23] that the DR21 molecular cloud, that has a systemic velocity v = −3 km s −1 with respect to the local standard of rest, collides with the W75N molecular cloud (v = 9 km s −1 ).The [C II] observations reported here indicate a different scenario with an interaction between atomic and molecular gas over a large range of velocities (∼20 km s −1 ).
As part of the SOFIA (Stratospheric Observatory for Infrared Astronomy) FEEDBACK legacy program, the Cygnus X region was observed with the heterodyne array receiver upGREAT (Methods) [24] in the 158 µm line of [C II] at an angular resolution of 14.1 , corresponding to 0.1 pc at a distance of 1.4 kpc.This map of ∼950 arcmin 2 (158 pc 2 ) size is, together with the SOFIA map of Orion A [25], a very large, high angular resolution [C II] map showing a very massive star-forming region that extends well into the outskirts of the molecular clouds.It provides data at sub-km s −1 spectral resolution so that, for the first time, the dynamic assemblage of molecular clouds can be traced in detail.
We smoothed the [C II] map to an angular resolution of 30 and re-gridded to 10 .The data were resampled to a velocity resolution of 0.5 km s −1 and have a mean noise temperature of 0.3 K per channel (Methods and Extended Data Fig. 6).We also use 12 CO 1→0 data from the Nobeyama Cygnus survey [26], smoothed to 30 resolution on a 10 grid and a mean noise per 0.5 km s −1 wide channel of 0.6 K, and H I data from the Canadian Galactic Plane survey [27].The H I data have an angular resolution of 1 and a root mean square (r.m.s.) noise of 3 K in a 0.82 km s −1 channel.

Distribution of CO and [C II] emission in Cygnus X
Figure 1 shows the 12 CO 1→0 and [C II] line integrated emission distribution in the three major velocity ranges in Cygnus X [17].These are the DR21 range (−10-4 km s −1 ), the W75N range (4-12 km s −1 ), and emission between 12 and 20 km s −1 that we call the high-velocity (HV) component.The CO and [C II] emission is mostly concentrated in bright photodissociation regions (PDRs) among which the DR21 ridge and the W75N cloud are well-known starforming sites [28,29].The bright [C II] emission features will be discussed elsewhere, here we focus on the low surface brightness [C II] emission in which the molecular clouds are embedded, in particular on areas that appear devoid of CO emission in the W75N and HV velocity ranges.Figure 1e, f clearly shows that in those regions where CO emission is lower than its 3σ noise level of 3.6 K km s −1 , [C II] intensities are typically 5 K km s −1 (3σ = 1.8K km s −1 ).(See Methods and Extended Data Fig. 7 for details).We thus consider these [C II]-bright areas as CO-dark but recognize that there may be faint CO emission below the detection limit.Substantial [C II] emission at locations of CO-dark gas is also seen in individual velocity channels, portrayed by a movie that scans through all velocities.A one channel snapshot at +16.4 km s −1 is shown in Fig. 2. The rather homogeneous [C II] distribution argues against an origin from photoevaporation flows from the surfaces of the molecular clouds that would be more structured and intense toward the clouds.From scatter plots (Figs.3a and b) of [C II] and CO emission, we calculate a mean [C II] intensity of ∼5 K km s −1 in the CO-dark and [C II]-bright regime and derive average [C II] and CO spectra (Fig. 3c and d) from these pixels (which are obviously not the same for each velocity range).In total, 29% (6%) of the area in the HV (W75N) range is CO-dark and [C II]-bright, compared to 63% (88%) for [C II] and CO-bright gas.These values, however, strongly depend on the total area that was mapped and are subject to a selection effect because the Cygnus [C II] mapping focused on the bright PDR regions and less on the cloud outskirts.The [C II] spectra show a velocity bridge of emission between the clouds, that is, DR21 at −3 km s −1 , W75N at 9 km s −1 and the HV cloud at +15 km s −1 .CO is also present, but clearly weaker, in particular in the HV range.The kinematic connection in [C II] becomes particularly evident in three-dimensional (3D) position-velocity (PV) plots displayed for [C II] and CO in Fig. 4. The strongest emission in both tracers is concentrated in the −3 km s −1 component from the DR21 ridge including the DR21 outflow and in the 9 km s −1 component from the W75N cloud.These bright clouds and PDR regions (yellow in the [C II] image) are embedded in a pervasive medium emitting in [C II] (in dark blue) which is not or only little visible in CO.
Summarizing, we conclude that instead of a head-on collision between a −3 and +9 km s −1 molecular cloud [22,23], we witness here an interaction of several partly atomic flows (seen in [C II]) and partly molecular flows (seen in CO).The next section quantifies this scenario by calculating the physical properties of the interacting gas.

Physical properties of the interacting gas
We estimate the density and temperature of the gas detected with [C II] at velocities v > 4 km s −1 using predictions [30] from the PDR toolbox (Methods) for a [C II] line integrated intensity of 5 K km s −1 .From a census of the 169 OB-stars of Cyg OB2, we derive a Habing field of ∼10 G o (Extended Data Fig. 8), where G o is the mean interstellar radiation field.The PDR model (Fig. 5a) indicates hydrogen densities of ∼100 cm −3 , which is typical for diffuse gas at the transition from atomic to molecular.We exclude here the high-density solution (>10 4 cm −3 ) because then, significant CO emission should have been detected -which is not the case.We note that all numbers have an uncertainty mostly because of the adopted value of the FUV field.With the derived densities, we obtain a surface temperature (Fig. 5b) of 115 K for the PDR gas layer.This is an upper limit for the kinetic temperature T kin of the gas, since the temperature drops entering deeper PDR layers.To narrow down T kin , we performed a study of H I self-absorption (HISA) toward DR21 (Methods and Extended Data Figs.9 and 10) and obtained a gas temperature of ∼100 K.We use this value to calculate C + and hydrogen column densities, N(CII) and N(H), respectively (Methods and Extended Data Fig. 11), and give all input values and results in Table 1.N(H) consists of an atomic and molecular part, and the relative fractions are somewhat variable because the formation of H 2 depends on the local radiation field and density, and on turbulent mixing motions [32] that cause large-and smallscale density fluctuations.We estimate (Methods) that ∼23% of the gas in the W75N range and ∼14% in the HV range is molecular.This is qualitatively in good agreement with results from colliding H I flow simulations [6], predicting that about 20% of hydrogen is in the form of H 2 at densities around 100 cm −3 for the initial phases of ) and CO emission (3.6 K km s −1 ) are indicated by a red and blue dashed line, respectively.The upper left area with red pixels indicates the intensity space that is below the CO noise level but bright in [C II], with an average value of 4.6 K km s −1 (standard deviation 1.5 K km s −1 ) and 5.1 K km s −1 (standard deviation 2.2 K km s −1 ) for the W75N and HV velocity ranges, respectively.c, d, The average spectra over all pixels in the map identified as CO-dark and [C II]-bright in the scatter plot in the W75N (c) and HV (d) velocity ranges, indicated by grey vertical lines.The r.m.s.noise of the spectra for both velocity ranges is 0.075 K for [C II] and 0.15 K for CO, respectively.The [C II] and CO line integrated emissions are 4.7 K km s −1 and 2.2 K km s −1 for the W75N range and 5.4 K km s −1 and 1.3 K km s −1 for the HV range, respectively.cloud formation.Our values are also conform with the results of [16] who find that 20% of [C II] comes from the molecular phase.Their simulation set-up represents a section of the Milky Way disc in which turbulence is injected by supernova explosions but the dynamic effect of gas accretion on to the clouds from the larger scale, galactic environment is retained.However, they investigate only the earliest phases of cloud formation with an UV field of 1.7 G • and lower temperatures of ∼50 K.The masses (Methods) contained in the atomic gas are 7800 M sun for the W75N range and 9900 M sun for the HV range, respectively.This is an important mass reservoir for building up more molecular clouds, comparable to the fully molecular cloud DR21 (∼15 000 M sun , [29]).The time scale for cloud assembly is given by the relative velocity of the components and their size.The column densities of the W75N and the HV cloud translate into a size of 12 pc for a density of 100 cm −3 , leading to an assembly time of 1.3 Myr based on their separation in velocity space by about 10 km s −1 .In a quasi-static scenario, molecular cloud formation would take much longer, about 10 Myr at a density of 100 cm −3 , based on the formation rate of molecular H 2 of 3 × 10 −17 cm 3 s −1 [33].Faster cloud formation with significant fractions of H 2 can be explained, however,  from colliding flow simulations that temporarily create pockets of gas with higher density [35].

Discussion
The observed levels of [C II] intensity in the W75N and HV velocity ranges (excluding bright local sources such as the W75N protocluster) are consistent with the low FUV illumination of ∼10 G o we estimated from the census of the stars from Cyg OB2 at a distance of ∼1.6 kpc [36].The HV gas can thus not stem from the Cygnus Rift, a dark extinction feature at 600 pc with no notable excitation sources [17].Recently, [37] used the GAIA2 data release in combination with extinction to build 3D maps of the dust in the local arm and surrounding regions and confirmed that there is no active star-formation taking place in the Rift.Maser parallax measurements [18] indicate that W75N (1.3±0.1 kpc) is slightly but clearly in front of DR21 (1.5±0.1 kpc) showing that the densest parts of these two molecular clouds could not have collided head-on.Observed absorption features in 12 CO, HCO + , CH + , SH + toward DR21 [22,28,38] support this 3D view (see Methods for details on the Cygnus X complex).In addition, our HISA study toward DR21 identifies broad absorption in the velocity range −5 to 20 km s −1 (Methods).Accordingly, the atomic clouds at red-shifted velocities with v > 4 km s −1 (W75N, HV) must be located in front of DR21 and the dynamics we traced in [C II] indicates that all three of them are clearly on collisional trajectories.Our scenario of molecular cloud + H I envelopes interaction, visible through [C II], actually implies that the DR21, W75N and HV components are not too far separated but should be located within a similar volume with a radius of presumably 20-50 pc.More precise distance estimates would help to test our view.
The composition of the gas seen in low surface brightness C + emission is about 20% molecular and 80% atomic.These values can be compared with the findings from the GOTC+ survey [14].They observed Galactic sightlines with many bright PDRs along the line-of-sight and derived that ∼47% of [C II] emission arises from PDRs, ∼28% from CO-dark gas, ∼21% from cold atomic gas, and ∼4% from ionized gas.Our observations reveal a large reservoir of CO-dark gas of several thousand solar masses, comparable in mass to the active star forming regions DR21 and W75N, that was previously unrecognized.We show that the DR21, W75N and HV molecular clouds cross each other, interacting mostly through their low-density enveloping atomic gas layers.The collision of small-scale H I flows forms dense, molecular gas in the compressed layer of oblique shocks such as proposed in [39,40] but the residual H I is still there and forms a reservoir from which more mass is accreted.We expect that many of these compressed layers show a flattened sheet-like structure, as increasingly seen in observations [41,42], and work on a study to test this scenario.We note that it depends strongly on the density if the interacting gas is also seen in CO (at the same velocity as the [C II]).For the W75N range, we indeed observe a bridge of emission for CO and [C II] at the same velocities, most likely because the molecular fraction is higher.Other studies [10], only using CO, already showed this molecular interaction, also at high velocities [11].To a certain extent, it is also possible that the reason we detect higher [C II] velocities is that some of the CO is already shocked due to the compression and thus at lower velocities.The pre-shock high-velocity gas, however, is at low densities and only visible in [C II].
Our scenario leads to a continuous assembly of more molecular material on very short time scales over ∼ one million years.It is not likely, but cannot fully be excluded, that the molecular clouds formed in the collision zones of interacting expanding H I shells [43] because there are no clear observational signatures for such H I shells and the relative velocities we observe in Cygnus are too large to be only driven by an expanding bubble.
We note that the magnetic field, and in particular its orientation, can also play an important role in this sort of cloud assembly.[5,44] derived that a general alignment of the collision axis and mean magnetic field direction is necessary for direct cloud formation.Recently, [46] showed that under an inclination angle of 45 o , it is possible to accumulate enough mass allowing the formation of massive clusters with O-type stars.Alternatively the observed interaction between the H I/CO complexes have not formed directly the seeds of the CO clouds.Instead we propose that the crossing of the H I/CO clouds we observe have triggered an acceleration of the concentration of these seeds into the dense and massive star-forming clouds through the scenario proposed for Musca by [40].For the quiescent Musca cloud, [40] found that molecular gas formed preferentially at the convergence point of the magnetic field, bent behind the shock front.For Cygnus X magnetic field measurements would thus be crucial to address this important point.
The implications of such a generalized scenario to explain cloud interactions, and consequently star formation, both in Musca and in Cygnus, are important because they suggest a certain degree of universality of cloud formation as a result of interactions between at least partly diffuse H I clouds.This is in line with the scenario of the interplay of feedback bubbles and gravity leading to the formation of dense structures in the multi-phase ISM [42].The main differences between Musca and Cygnus X are the larger initial densities in Cygnus X and the velocity of the collision with ∼20 km s −1 for Cygnus X and less than 10 km s −1 in Musca.Streams of diffuse gas with relative velocities of less than 10 km s −1 are easily justified by turbulent H I gas in the galaxy since the sound speed in the warm neutral medium is of this order of magnitude.The origin of gas streams of more than 20 km s −1 is more difficult to explain.They can be driven by the complex interplay between gravity and stellar feedback effects, and the thermodynamic response of the multi-phase interstellar medium.In galaxy-wide simulations [45] spiral density waves lead to high-velocity collisions of flows that can form massive OB clusters such as the ones seen in Cygnus X.In any case, our finding will have major ramifications for our understanding of molecular cloud assemblage in the Milky Way and other galaxies.Further observations of the extended [C II] emission in the FEEDBACK sample will reveal if this sort of interaction is common to other giant molecular cloud regions.In the future, the GUSTO and ASTRHOS balloon projects will measure the [C II] emission in the Milky Way and in the Large and Small Magellanic Clouds.A promising other tracer for detecting flows of partly atomic, partly molecular gas is atomic carbon ([C I]).It is predicted to also trace the CO-dark gas component [6] and the [C I] 1-0 line is observable from the ground.The GEco project on the upcoming CCAT-prime/FYST telescope will perform extended surveys in this line [47].

Observations and data reduction
The Cygnus X region was observed during several flights in 11/2019, 2/2021, 11/2021 and 4/2022 for the legacy program FEEDBACK, using the 2.7m telescope onboard the Stratospheric Observatory for Infrared Astronomy (SOFIA).The dual-frequency heterodyne array receiver upGREAT [24] was tuned to the [C II] 157.7 µm line in the Low Frequency Array (LFA) 2×7 pixel array and the [O I] 63 µm line (not shown) was observed in the High Frequency 7 pixel array (HFA).The LFA has an array size of 72.6 , and a pixel spacing of 31.8 .The half-power beam width at 158 µm is 14.1 , determined by the instrument and telescope optics, and confirmed by observations of planets.For each flight series, the main-beam efficiencies η mb for each pixel for the LFA and HFA channels were determined, the average value for the LFA is 0.65 and we use this value here.The entire map region was split into multiple square "tiles" with 435.6 on one side and each square was covered 4 times.The on-the-fly (OTF) scan speed was selected to attain Nyquist sampling of the LFA beam (dump every 5.2 ).The total time for one OTF line was 25.2 s, that is together with the OFF observation, within the measured Allan variance stability time of the system.The first two coverages were done once horizontally and vertically with the array rotated 19 • against the scan direction, so that scans by 7 pixels are equally spaced.The second two coverages were then shifted by 36 in both directions to achieve the best possible coverage for the [O I] line in the LFA array mapping mode.Overall, each tile took ≈50 min.to complete.The horizontal and vertical scanning can cause some striping in the maps.In order to reduce these effects, we applied a Principal Component Analysis (PCA) on the data.In short, our method based on PCA uses the information of systematic variations in the baseline from a large set of observations from the emission-free OFF-position that are regularly taken during each scan.These variations are caused by time-dependent instabilities in the backends, receiver, telescope optics, and atmosphere.We produce an "OFF-OFF" spectra by subtracting subsequent OFF positions from each other and calibrate the data in the same way as the "ON-OFF" spectra that contain the astronomical source emission.We then identify systematic "eigenspectra" in the OFF−OFF spectra that account for all or at least most of the structure in the baseline.Using a linear combination of the strongest components, we reconstruct the ON-OFF spectra with the best-fit coefficients for each component.We scale each component by the coefficients that were found and subtract those from the ON-OFF spectra.This procedure removes the systematic variations found in the OFF-OFF spectra but does not make changes to the astronomical line in terms of intensity, width, position etc. in the ON-OFF spectra.We further improve the data quality by using a sophisticated algorithm to determine a set of spectra that are free of emission from the PCA-corrected spectra for each frontend and SOFIA flight.With this set of emission free spectra we employ a second PCA-correction equal to the one described above, but using now these emission-free spectra to determine the baseline "components".With these components we correct systematic variations that happen on timescales of the individual OTF-dumps, which are much shorter than the 10 s integration time for OFF spectra.The final PCA map was then compared to one obtained by removing a polynomial baseline of order 3, performing difference maps, ratio maps, scatter plots etc. and we found no systematic effects.
We then spatially smoothed the resulting [C II] map to an angular resolution of 30 in a Nyquist sampled 10 grid in order to emphasize the large-scale emission distribution and not focus on small-scale variations.Some striping effects are still visible, but they do not produce systematic effects in the data analysis (scatter plots for example).The map center position is at RA(2000) = 20 h 38 m 39.3 s , Dec(2000) = 42 • 20 39.3 and an emissionfree OFF position at RA(2000) = 20 h 39 m 48.34 s , Dec(2000) = 42 • 57 39.11 was used.A Fast Fourier Transform Spectrometer (FFTS) with 4 GHz instantaneous bandwidth [48] with a velocity resolution of 0.04 km s −1 (from the hardware selected frequency resolution of 0.244 MHz) served as backend.We here employ data resampled to 0.5 km s −1 resolution.For further technical details see [49].All spectra are presented on a main beam brightness temperature scale T mb , i.e., corrected for the main beam efficiency.The r.m.s.noise of this final data set (and the one of CO emission) was then determined by fitting a 0th order baseline to each spectrum, excluding the windows with emission.The noise is homogeneous across the maps and follows approximately a Gaussian distribution (Extended Data Fig. 6) with a peak at 0.34 K for [C II] and 0.64 K for CO, respectively.As a second method, we determined the noise by spatially averaging the pixels in emission free channels and obtained values of 0.31 K for [C II] and 0.62 for CO, respectively.For the line integrated intensities, we calculated the σ noise level for the 8 km s −1 wide velocity integration range by σ = √ (16) × 0.6 × 0.5 = 1.2K km s −1 in which 0.6 K is the noise, 0.5 km s −1 is the channel width and 8 km s −1 correspond to 16 channels.The 3σ CO level is thus 3.6 K km s −1 .The equivalent calculation for [C II] delivers a 3σ level of 1.8 K km s −1 .In addition, we show in Extended Data Fig. 7 a PDF of the [C II] bright, CO-dark intensities together with a PDF of the r.m.s.noise to demonstrate that the observed W75N and HV intensities are clearly offset from the noise.We also note that the smoothing is the same for all data and cannot change any structural behaviour.
A potential problem could be that a part of the emission that we find in the W75N velocity range might stem from a [ 13 C II] hyperfine transition at DR21 source velocities.The [ 13 C II] transition splits into three hyperfine components with a relative strength of s 2→1 = 0.625, s 1→0 = 0.25 and s 1→1 = 0.125 caused by the unpaired spin from the additional neutron.The three satellites are velocity shifted ∆v 2→1 = 11.2 km s −1 , ∆v 1→0 = −65.2km s −1 and ∆v 1→1 = 63.2 km s −1 with respect to the [C II] fine structure line [50].The F 2→1 component of [ 13 C II] emission from the bright PDRs at DR21 velocities (−3 km s −1 ) may thus appear at ∼8 km s −1 , close to the systemic velocity of W75N (∼9 km s −1 ).We estimated this contribution by comparing the [C II] emission in the DR21 velocity range with the emission in the W75N velocity range for the [C II]-bright, CO-dark gas and found no correlation between the two quantities at a DR21 [C II] line strength of ∼15 K km s −1 (note that the average [C II] intensity is 5 K km s −1 ).For optically thin [C II] this translates into an expected [ 13 C II] F 2→1 line strength of 0.16 K km s −1 .Even if optical depth effects may increase this value somewhat, it is so small compared to our noise limit that it certainly does not change our quantitative estimates and simultaneously explains that there is no correlation detected.

Photodissociation region modelling
We employ the observed [C II] intensities using the plane-parallel models provided by the PDR toolbox found at https://dustem.astro.umd.edu[30].In short, these models solve the radiative transfer equation with chemical balance and thermal equilibrium for a plane-parallel PDR layer exposed to a UV radiation field, cosmic-rays, and soft X-rays incident on one side.A given set of gas phase elemental abundances and grain properties is calculated, and the emergent [C II] line integrated intensities as a function of density n and radiation field G o in Habing units are given.The beam filling is assumed to be unity, which we consider a good approach for the W75N and HV emission because the [C II] arises from extended, mostly diffuse gas.We here use the model WK2020 that features some updates of the photorates and dependence with PDR depth, 13 C chemistry and line emission, and O collision rates.However, there is nearly no difference in the [C II] model prediction compared to the 2006 model.The specific parameters for the WK2020 model are listed in Table 1 in [30].Some important values are the cosmic ray ionization rate per H nucleus of 2 10 −16 s −1 and the formation rate of H 2 on dust of 6 10 −17 s −1 .
We focus here on modelling only the [C II] emission though the plane-parallel PDR model also predicts a CO (1→0) brightness.However, this value is very sensitive to the assumed total depth of the cloud.The [C II] intensity and the surface temperature trace surface properties while the CO emission only arises from the layers deeper in the cloud where CO can form, this means those that are not CO-dark.As the toolbox models assume a gas column of visual extinction Av = 7, implying a large fraction of CO-bright gas, while the observed gas rather has a column of 3.8 10 21 cm −2 we expect, and see, a significant over-prediction of the CO 1-0 intensity by the model.
Considering the [C II] intensity, we derive a density of n=100 cm −3 for a FUV field of 10 G • .Taking into account the uncertainties in the FUV field, we assume a field that is double (∼20 G • ) and half (∼5 G • ) of the average value and derive then a density range of ∼40 to ∼400 cm −3 .The corresponding surface temperatures are then ∼200 K for n=40 cm −3 and ∼90 K for n=400 cm −3 , respectively.From our HISA study, however, we obtain a temperature of 90-120 K with an average of 108 K, which points towards a density of ∼100 cm −3 and thus to a FUV field of 10 G • .A significantly lower UV field would push the densities towards higher values of more than 10 3 cm −3 , but we note that the observed [C II] isocontour is a very flat curve for high densities.In any case, we would have detected CO emission in densities above 10 3 cm −3 .An even higher UV field has a smaller impact on the density, but would still move values below ∼40 cm −3 .For simplicity, we will use a common temperature of 100 K and a density of 100 cm −3 , but work out the values for column densities fractions using these extreme limits of density and temperature.

Calculation of physical properties
We assume that the major collision partner for C + is atomic hydrogen, and that the [C II] line is optically thin, so that the line emission must be sub-thermally excited, in view of the low densities.
The total hydrogen column density N(H) = N(HI) + 2 N(H 2 ) is estimated from N(CII), assuming all carbon is in the form of C + and applying the abundance C/H = 1.6×10 −4 [31].For the nominal values of T=100 K and n=100 cm −3 , we obtain N(CII)=0.61 10 18 cm −2 and N(H)=3.78 10 21 cm −3 .This total hydrogen column density corresponds very well to the one estimated from our HISA study, assuming that most of the gas seen in [C II] is atomic.We thus consider a density of ∼100 cm −3 as the most likely value for the atomic gas.Nevertheless, with a lower density of n=40 cm −3 and T=200 K, we calculate N(CII)=0.85 10 18 cm −2 and N(H)=5.31 10 21 cm −2 .With n=400 cm −3 and T=90 K, we derive N(CII)=0.20 10 18 cm −2 and N(H)=1.22 10 21 cm −2 .The mass of the atomic gas is then estimated by with the area A in cm 2 , the mass of hydrogen m H in kg and the C/H abundance 1.6×10 −4 [31].

Determination of the molecular fraction
The diffuse gas at high velocities (>4 km s −1 ) is partly molecular and partly atomic.We here give a rough estimate of the molecular fraction, which is defined [51] by The molecular hydrogen column density N(H 2 ) is derived using the average CO line intensities (I CO ) from the spectra in the W75N velocity range (2.2 K km s −1 ) and the HV range (1.3 K km s −1 ).Both values have an error of 1.2 K km s −1 .With a CO-to-H 2 conversion factor X CO = N(H 2 )/I CO of 2×10 20 cm −2 (K km s −1 ) −1 , commonly used in the literature, we obtain H 2 column densities of 0.44 10 21 cm −2 and 0.26 10 21 cm −2 for the W75N and HV velocity range, respectively.The total hydrogen column density N(H) is derived from the [C II] column densities, given in Table 1, with a value of 3.78 10 21 cm −2 in each velocity range.With these values, the molecular fraction is then calculated to be f (H 2 ) = 0.23 for the W75N velocity range and f (H 2 ) = 0.14 for the HV range, respectively, with an average value of f (H 2 ) = 0.19.The portion of molecular gas is thus nearly twice as much in the W75N velocity range compared to the HV range.We note that the H 2 column densities and molecular fractions are lower limits since we use the canonical value of the X CO factor which is mostly valid for evolved molecular clouds.In case the atomic gas has a lower or higher density because of a different incident FUV field, the molecular fractions change accordingly.We estimate that for n=40 cm −3 , f (H 2 ) = 0.17 for the W75N velocity range and f (H 2 ) = 0.10 for the HV range, respectively.In the high density case with n=400 cm −3 , f (H 2 ) = 0.72 for the W75N velocity range and f (H 2 ) = 0.43 for the HV range, respectively.These values are more extreme and less likely than the fractions we obtained with the nominal values but we note that all observational and model values have their uncertainties.

Evaluation of the FUV field
The FUV field in Cygnus X was derived from a census of the stars of Cyg OB2, using the compilation of Wright et al. [19].They listed 169 OB stars including 52 O-type and 3 Wolf-Rayet stars.To determine the UV-luminosity of the cluster, we assume that the spectral radiance of each star can be represented by a black-body: with the wavelength λ, the speed of light c, the Boltzmann-constant k B , the Planck-constant h and the temperature of each star T .To extract the UV portion of the luminosity L, we integrate the Planck function over the UV range between 910 Å and 2066 Å, which corresponds to a photon energy range of 6 to 13.6 eV.The ratio of the integrated spectral radiance over the UV range to the entire black-body spectrum gives us the UV luminosity: with the Stefan-Boltzmann constant σ.The superposition of the stellar UV flux of all stars considered gives the UV-field at every point (RA, Declination) of the grid: where R i is the radial distance to each star.We assumed the most recent distance estimate, based on GAIA, of 1.6 kpc [36] for every star in the cluster to the observer though there can be line-of-sight distance differences between the individual stars.Extinction of the UV field by gas of the interstellar medium is not taken into account, thus the determined UV field is an upper limit.The resulting FUV field is shown in Extended data Fig. 8 where it becomes obvious that the FUV field varies between ∼5 and 20 G o across the [C II] map.We use a value of 10 G o for our PDR modelling.We note that a closer distance of Cyg OB2 of ∼1.45 kpc [52] would increase the UV field to values between ∼10 and 30 G o with little influence on the PDR modelling.

H I self-absorption
Studying self-absorption of atomic hydrogen toward a strong continuum source is a way to determine the atomic column density in front of the source.We follow the approach of [41] to derive the amount of cold absorbing H I in front of DR21, which emits strongly in the 1.4 GHz continuum.Both, the H I 21cm line data and the continuum, stem from the Canadian Galactic Plane survey [27] and have an angular resolution of 1 .First, the method requires to estimate the H I emission by finding a mostly absorption free position (off) as a reference.For that, 100 H I emission spectra around DR21 within a 20 × 20 square were plotted (Extended Data Fig. 9a) and compared to the average H I spectrum towards DR21, showing absorption across a large velocity range.We are mostly interested in the range v = 4 to 20 km s −1 .All off spectra show a similar line profile with a flat-top shape peaking at a temperature of ∼100 K at the absorption dip of the on spectrum.However, the various small absorption dips in the off spectra indicate that the absorbing hydrogen cloud is also extended.As a best candidate we chose a spectrum that shows  9b in blue, together with a Gaussian fit optimized to fit the wings of the off spectrum.It could represent the H I emission if some emission is still absorbed at our off position.We calculated the velocity resolved optical depth for both choices, the off spectrum and the fitted one.This helps us to assess the uncertainties in the derivation of the column density.The optical depth as a function of velocity v is given by τ HISA (v) = − ln 1 − T on−off (v) T HISA − T off (v) − T cont (9) with the continuum temperature T cont (437 K as an average) for a HISA temperature T HISA .The corresponding column number density of the absorbing H I layer, N HISA , can be determined by integration over τ HISA (v): We integrate over the velocity range of 4 km s −1 to 20 km s −1 .The temperature dependent HISA column number density is shown in Extended data Fig. 11 for both, the off position and the Gaussian fit.It becomes obvious that the differences are small.To further constrain the possible range of HISA column density, we determine the maximum HISA temperature by the temperature at the absorption dip minimum at each spectrum of the H I map: T HISA,max = T on,min + T cont .(11) The resulting temperature map is shown in Extended data Fig. 10 and confirms what we already concluded from Extended data Fig. 9, that is, that the HISA temperature can not be higher than ∼ 100 K, which results in a HISA column density of ∼N HISA ∼ 3.5 • 10 21 cm −2 .For a density of 100 cm −3 , this translates into a H I layer of 11 pc which compares well to other dominant molecular structures in Cygnus X, like for example the DR21 ridge with a size of ∼7 pc.
The Cygnus X star-forming complex and the distance problem The Cygnus X star forming region has for a long time been noticed as an outstanding area because it lies around l = 90 o where the local Galactic arm, the Perseus arm, and the outer Galaxy are found along the same line of sight, covering distances between 1 and 8 kpc.Since radial velocities around the tangent point in Cygnus X are close to zero, they do not provide reliable distances.This is why the Cygnus X region was long proposed to be an accumulation of clouds at various distances along the different spiral arms [53].However, based on CO data and arguments of UV-illumination, [17] proposed that most of the observed velocity components seen in CO were associated and part of a single complex despite the very large differences in velocities (∼ −5 km s −1 to 18 km s −1 ).The main molecular cloud regions are DR21 (around −3 km s −1 ) and W75N (around 9 km s −1 ).At that time, it was not clear how such large relative velocities could co-exist spatially and inside a single event of star formation.The scenario of a single complex for Cygnus X [17] has been confirmed by maser parallax measurements [18], obtaining roughly the same distance for DR21 and W75N despite their velocity difference of ∼ 12 km s −1 .Our study now shows that the W75N and the HV atomic/molecular clouds are located in front of DR21 ([C II] emission and H I absorption) and interact with each other.

Figure 1 :
Figure 1: Velocity integrated CO and [C II] emission in the DR21, W75N and high-velocity range.a-c 12 CO 1→0 maps obtained with the Nobeyama telescope [26] in the three main velocity ranges of the Cygnus X region: −10-4 km s −1 (a), −4-12 km s −1 (b), and 12-20 km s −1 (c).The embedded massive star-forming sites DR21, DR21(OH) and W75N are indicated.The yellow polygon outlines the area mapped in [C II] that is shown in panels d-f.The color wedges give the CO and [C II] intensities in square root values.The resolution of the maps (30 ) is indicated in panels a and d.RA, right acension; Dec, Declination.3

Figure 2 :
Figure 2: Movie-snapshot of [C II] and CO emission.The image displays a single velocity channel (16.4 km s −1 ) of [C II] emission in red (0 to 16 K km s −1 ) and CO emission in blue (0 to 25 K km s −1 ), respectively, plotted as square-root values.Areas that are dark in CO (no or little emission) are bright in [C II] and reveal the large spatial extent of [C II] emission.The full movie with all velocity channels is found at https://astro.uni-koeln.de/stutzki/research/feedback/animations.

Figure 3 :
Figure 3: Scatter-plots and spectra of [C II] and CO 1→0 emission.a, b, The pixel-by-pixel correlations in the W75N (a) and HV (b) velocity ranges.The 3σ noise levels for [C II] emission (1.8 K kms −1) and CO emission (3.6 K km s −1 ) are indicated by a red and blue dashed line, respectively.The upper left area with red pixels indicates the intensity space that is below the CO noise level but bright in [C II], with an average value of 4.6 K km s −1 (standard deviation 1.5 K km s −1 ) and 5.1 K km s −1 (standard deviation 2.2 K km s −1 ) for the W75N and HV velocity ranges, respectively.c, d, The average spectra over all pixels in the map identified as CO-dark and [C II]-bright in the scatter plot in the W75N (c) and HV (d) velocity ranges, indicated by grey vertical lines.The r.m.s.noise of the spectra for both velocity ranges is 0.075 K for [C II] and 0.15 K for CO, respectively.The [C II] and CO line integrated emissions are 4.7 K km s −1 and 2.2 K km s −1 for the W75N range and 5.4 K km s −1 and 1.3 K km s −1 for the HV range, respectively.

Figure 4 :
Figure 4: 3D iso-surface rendering of position-velocity cuts for [C II] and CO 1→0 emission over the entire area observed in [C II].The x-and y-axis are offsets in arcmin from the central map position, the z-axis is velocity in km s −1 .The emission starts at the 5σ level for both tracers.The bright star forming cloud DR21 and other dense molecular clouds are embedded in a large scale cloud structure only visible in [C II] (dark blue).An interactive version of these plots is found at https://astro.uni-koeln.de/stutzki/research/feedback/animations.

Figure 5 :
Figure 5: PDR model results.The panels show the parameter space of hydrogen density n and FUV field calculated from the PDR toolbox.a, The dark blue isocontour of the observed [C II] integrated intensity of 5 K km s −1 and the noise r.m.s. of 0.6 K km s −1 in dashed light blue.The estimated FUV field of 10 G • is indicated by a red line.The dotted red lines indicate approximately a FUV field of double and half the value.b, The isocontours of surface temperature from the PDR model.For an average density of ∼100 cm −3 (blue line) and a FUV-field of ∼10 G • (red line), we obtain a temperature of 115 K.The dashed lines for density and temperature indicate how these values change if the FUV field is higher or lower.

Figure 6 :
Figure 6: Extended Data: R.m.s.noise distribution in 0.01 K bins for all map pixels in [C II] (blue) and CO (orange).The peaks of the Gaussian distributions are at 0.34 K and 0.64 K, respectively.

Figure 7 :
Figure 7: Extended Data: Binned (0.02 K km s −1 for the r.m.s.noise and 0.04 K km s −1 for [C II], respectively) intensity normalized PDF (probability distribution function) for the [C II]-bright, CO-dark map pixels together with the r.m.s.noise PDF.The W75N pixels are indicated in dark grey, the HV range in light grey and the r.m.s. in blue.
The [C II] column density N(CII) is then calculated [51integrated [C II] emission I CII in [K km s −1 ], the kinetic temperature T kin [K], and the de-excitation rate C ul [s −1 ]

Figure 8 :
Figure 8: Extended Data: Far-UV field in the Cygnus region.a The large-scale FUV field on a log-scale in Habing units G o , determined from a census of the O and B-stars from the Cyg OB2 cluster that are indicated in the panel.The grey box outlines the area shown in panel b.The observed [C II] emission is overlaid with black contours corresponding to 50 K km s −1 to 210 K km s −1 by 40 K km s −1 .b The FUV field in the DR21 and W75N region on a linear scale with the [C II] contours identical to the ones in panel a.

Figure 9 :
Figure 9: Extended Data: H I spectra toward DR21.a The H I absorption spectrum is shown in black, the 100 colored spectra represent the H I emission within a square of 20 × 20 around DR21 in a grid of 1 .b The black curve is again the absorption H I spectrum toward DR21, the blue curve is the H I emission spectrum (off), and the orange curve shows the Gaussian fit to the blue off spectrum.

Figure 10 :
Figure 10: Extended Data: Maximum possible HISA temperature around DR21.The strong continuum source DR21 stands out as a red spot with high temperatures of ∼175 K. Overall, the HISA temperature ranges between ∼90 and 120 K with an average of ∼100 K.The overlaid [C II] contours go from black to white with six contour levels corresponding to 50 K km s −1 to 210 K km s −1 by 40 K km s −1 .The blue filled circle indicates the location of the off position.

Figure 11 :
Figure 11: Extended Data: HISA column density as a function of temperature.The solid blue curve shows the results for the off position and the solid orange curve for the Gaussian fit to the off position.The blue and orange dashed lines indicate the column density at 100 K for the off position and the Gaussian fit, respectively.

Table 1 :
[31]ark, [C II]-bright gas physical propertiesAverage of the C + line integrated intensity in CO-dark regions.We call [CII]-bright all regions with [CII] emission above the 3σ noise level.The C + column density N(CII) is calculated with eq. 1 using the C + line integrated intensities.The total hydrogen column density N(H) = N(HI) + 2 N(H 2 ) is estimated from N(CII), applying the abundance C/H = 1.6×10 −4[31].