## Introduction

Dark matter, whose nature remains elusive, and ordinary matter described by the Standard Model of particle physics, are the strongly clustered materials of our Universe, with the latter component referred to as baryonic matter, or more simply baryons, by observational cosmologists1. The question of how well these two components trace one another, across spatial scales and cosmic time, is central to our understanding of the astrophysics that drives galaxy formation and offers clues to the thermal nature of dark matter and other new physics.

Assuming weak-field (Newtonian) gravitational accretion and collisional shocks under the approximation of spherical symmetry, self-similar solutions2 emerge in which both collisionless dark matter and collisional baryonic fluids develop similar radial profiles when expressed in terms of a characteristic physical radius, the turn-around radius3 at which the perturbed Hubble flow is stationary. A key implication of this simple model is that dark matter and baryons exhibit no radial separation. Collapsed structures, referred to as halos, should retain the cosmic mix of these different fluids at all radii.

The most massive dark matter halos host groups and clusters of galaxies. Early X-ray measurements of the hot gas content of clusters upended the standard cold dark matter (CDM) model orthodoxy of a matter-dominated universe4 before observations of Type Ia supernovae ushered in the current ΛCDM model of a universe dominated by a smooth dark energy component5. The argument against a matter-dominated universe relied on a fair sampling hypothesis, namely that the mean baryon fraction within clusters (ratio of baryonic mass to total mass) accurately reflects the cosmic mean baryon fraction. A natural consequence of this hypothesis is that the hot and cold fractions of baryons in clusters should be anti-correlated; at fixed total mass, clusters with more cold baryons should have fewer hot baryons, and vice versa. While this model does not define how baryons are partitioned into these phases, the constancy of the sum implies that a particular system with more hot gas than average must contain a lower stellar mass than average, and vice versa.

However, this simple model ignores important non-spherical and non-gravitational effects such as hierarchical mergers driven by large-scale filaments and the redistribution of energy, momentum and mass (generically termed feedback) by supernovae and active galactic nuclei (AGN). In low mass halos that host only one bright galaxy like the Milky Way, feedback is energetic enough to vent hot gas phase baryons out of these relatively shallow gravitational potentials6. Even smaller halos of dwarf or satellite galaxies suffer severe baryon losses from collective supernova explosions. At the other extreme, the massive halos of rich galaxy clusters have much deeper gravitational potentials that shield them from feedback-driven baryon venting outside of their core regions7. Thus, clusters are likely to be closed baryon boxes, unbiased reservoirs of the cosmic baryon fraction.

Studies of mean trends in gas and stellar mass fraction8,9 support the expectation that massive clusters are more closed than smaller halos, based on the trend of increasing baryon fraction with halo mass. However, measurements of absolute baryon fractions are currently subject to uncertain biases of $${\cal{O}}(10\% )$$ in estimates of total mass, and this systematic uncertainty limits the reliability of comparison with the cosmic mean baryon fraction. We take a complementary approach based on variance about mean behavior, particularly the covariance of hot gas mass and stellar mass conditioned on total mass. This approach is encouraged by recent findings of strongly negative correlation coefficients ($$r \ \lesssim -\!0.5$$) from a pair of complex, multi-fluid cosmological simulations in which this statistic has been measured10,11.

Observational studies have explored baryonic properties conditioned on estimated halo mass, particularly X-ray and thermal Sunyaev-Zel’dovich (SZ) Effect12 signatures from the hot gas phase and optical/infrared properties of galaxies13. While correlations among internal hot gas properties have been measured in a few empirical studies14,15,16,17, the degree of correlation between hot gas and galactic components has not yet been investigated. The minimum requirement for such an analysis is to obtain high quality observations of both stellar and gas properties for a cluster sample with well-defined selection rules and robust estimates of total cluster mass. Currently, these requirements are only fulfilled by the Local Cluster Substructure Survey (LoCuSS), a multi-wavelength survey of the 41 X-ray brightest galaxy clusters at redshifts of 0.15 < z < 0.3.

The LoCuSS sample is selected by applying a redshift-dependent X-ray luminosity (LX) cut to clusters identified in the ROSAT All-sky Survey (RASS) catalog at high galactic latitudes. The multi-wavelength observations used in this study, obtained over the period of a decade (2005–2014) by co-authors, include optical imaging data from the Subaru 8.2-m telescope, infrared data from the 3.8-m United Kingdom Infrared Telescope on Mauna Kea (UKIRT), X-ray observations from the Chandra and XMM-Newton satellites, and millimeter observations of the thermal SZ effect from the Planck satellite and the Sunyaev-Zel’dovich Array (SZA).

Here we report observational detection of anti-correlation between the hot and cold baryon contents of galaxy clusters. The key measurements are a subset of posterior estimates of 36 pairwise correlations among nine cluster properties derived from the LoCuSS observations, most measured within a radial scale defined by the weak-lensing estimate of each system’s mass18. Details of the galaxy cluster sample and posterior measures of the slope, variance, and LX,RASS–property covariance for nine properties are presented in a companion work19. Our detection of anti-correlation supports independent evidence that massive galaxy clusters retain close to the cosmic mix of baryons and dark matter, a finding that can underpin improved cluster cosmology from cross-wavelength sample analysis.

## Results

### Galaxy cluster property covariance

Table 1 lists the observable properties employed in this analysis. We model the data using a likelihood based on log-normal property covariance about mean scaling relations behaving as power-laws in halo mass20. We employ default redshift scaling behaviors21, but this assumption is unimportant due to the narrow redshift range of the sample. We assume that, on average, weak gravitational lensing measurements provide unbiased estimates of true cluster masses with 0.2 fractional scatter. To model the selection effect, we employ the threshold selection condition for LX,RASS emission used to define the LoCuSS cluster sample. X-ray emission offers clearer identification of massive halos, being less prone than cluster properties measured at other wavelengths to confusion from additional halos projected along the line of sight. While imprecise models of sample selection can bias scaling parameter estimates, we show in the Supplementary Information that inferred correlations among property pairs are insensitive to biases in posterior slope and variance estimates (See Supplementary Figs. 1 and 2).

Our analysis takes a hierarchical Bayesian approach that accounts for the effects of the sample selection, measurement error covariance induced by the use of a common sky aperture and other effects, as well as the halo space density as a function of mass for a ΛCDM cosmology. Uninformative priors are used in the regression; all quoted constraints are derived solely from the sample data. Our model simultaneously constrains the population scaling parameters associated with the multi-wavelength ensemble: the slopes, normalizations, and the mass-conditioned property covariance (see Eq. (2) in the “Methods” section). We report pairwise correlation coefficients, i.e., the covariance of a property pair divided by the product of the intrinsic scatters of the two observables, as in Eq. (3).

An analysis of variance must be cognizant of astrophysical sources of scatter extrinsic to the host halos of the cluster sample. In particular, other halos along the line-of-sight will add correlated noise to some of a cluster’s observed properties22. In Supplementary Fig. 3 and Supplementary Table 1, we show that such sources of systematic error, including projection, tend to dilute the magnitude of an intrinsically anti-correlated property pair. We argue that it is conservative, then, to consider the measured magnitude of an anti-correlation between stellar mass and hot gas mass as effectively a lower limit to the underlying halo population value.

Table 2 presents the full property covariance matrix at fixed weak-lensing mass derived from the LoCuSS sample. While we report the entire matrix, our focus is mainly on the last two rows and columns that contain optical properties. The lower triangle elements summarize the correlation coefficients of property pairs while the diagonal elements provide standard deviations of each property. Median values from the Markov chain Monte Carlo (MCMC) chains, $$\sim 10^5$$ in length, are quoted, and uncertainties give 68% confidence limits. As explained below in “Methods”, we impose a minimum value of 0.05 on the intrinsic scatter in the log of K-band luminosity, ln LK,tot, at fixed halo mass when determining these statistics. The upper triangle gives the odds that each element has a sign opposite to that of its median value.
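The summary statistics quoted in Table 2 can be illustrated with a minimal sketch. It assumes that the 68% limits correspond to the 16th and 84th percentiles of the marginalized chain, which is a common convention but is not stated explicitly here. The function name `summarize_chain` and the synthetic chain are hypothetical stand-ins, not LoCuSS outputs.

```python
import numpy as np

def summarize_chain(r_samples):
    """Median, 68% interval, and odds of the opposite sign for one chain."""
    r = np.asarray(r_samples, dtype=float)
    # Assumed convention: 68% limits from the 16th and 84th percentiles.
    lo, med, hi = np.percentile(r, [16.0, 50.0, 84.0])
    # Odds that the coefficient has the sign opposite to its median.
    # Zero is counted as "opposite", matching the "zero or positive"
    # wording used for negative medians.
    if med < 0:
        odds_opposite = np.mean(r >= 0.0)
    else:
        odds_opposite = np.mean(r <= 0.0)
    return {
        "median": med,
        "minus": med - lo,
        "plus": hi - med,
        "odds_opposite": odds_opposite,
    }

# Hypothetical chain of length 10^5 for a single correlation coefficient.
rng = np.random.default_rng(0)
fake_chain = np.clip(rng.normal(-0.5, 0.25, size=100_000), -1.0, 1.0)
summary = summarize_chain(fake_chain)
```

In this layout, `median`, `plus`, and `minus` would populate a lower-triangle entry and `odds_opposite` the matching upper-triangle entry.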

We first note the physically sensible result that the two independently measured properties reflecting a halo’s stellar mass (total K-band luminosity, LK,tot, and optical richness, λ) have a strong positive correlation, $$r = 0.77_{ - 0.27}^{ + 0.16}$$. The probability of this value being negative is very small, 1.4%. Of these two measures, the quantity λ appears noisier, with median intrinsic scatter of 25% compared to only 9% for LK,tot, but this may also reflect the different measurement errors quoted for each property. The fractional statistical uncertainties in λ are a factor ~3 smaller than those in LK,tot. As we show in Supplementary Fig. 2, bias and/or extra noise relative to the underlying halo population statistics will tend to dilute measured (anti-)correlations, and these effects can explain why some galaxy-hot gas property pairs yield weaker evidence of anti-correlation.
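The dilution argument can be made concrete with a simple worked case. The sketch below assumes that the extra noise is Gaussian, independent between the two properties, and not captured by the quoted measurement errors, so that it inflates each variance without adding covariance. Under those assumptions the recovered coefficient is $$r_{\mathrm{obs}} = r\,\sigma_a\sigma_b/\sqrt{(\sigma_a^2 + \epsilon_a^2)(\sigma_b^2 + \epsilon_b^2)}$$. The function name and the numbers are illustrative only and do not reproduce the Supplementary analysis.

```python
import numpy as np

def diluted_correlation(r_true, sigma_a, sigma_b, noise_a, noise_b):
    """Correlation recovered after adding independent, unmodeled noise.

    Assumes the noise terms are uncorrelated with each other and with the
    intrinsic residuals, so only the variances grow.
    """
    return r_true * sigma_a * sigma_b / np.sqrt(
        (sigma_a**2 + noise_a**2) * (sigma_b**2 + noise_b**2)
    )

# Illustrative values: intrinsic scatters of 0.25 and 0.10 with true r = -0.6,
# and unmodeled noise of 0.15 added to the first property only.
r_obs = diluted_correlation(-0.6, 0.25, 0.10, noise_a=0.15, noise_b=0.0)
```

Because the denominator can only grow, the magnitude of `r_obs` never exceeds that of `r_true` in this simplified picture.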

### Property covariance between hot and cold baryons

In the bottom two rows of Table 2, the elements linking galaxy and hot gas properties are mainly negative. Figure 1 highlights the correlation coefficients between the galaxy measures and two key measures of hot gas: the core-excised X-ray luminosity and the derived gas mass. Boxes show inner quartiles (25–75-percentile) and whiskers encompass the inner 95% of the marginalized posterior distributions. All pairs tend to be negative, as anticipated by the correlations between hot gas mass and stellar mass seen in hydrodynamical simulations10,11, shown as background bands in Fig. 1. The consistency in sign of hot-cold phase covariance elements between observed proxy measures and their simulation-derived counterparts is an encouraging sign of fidelity in the sophisticated astrophysical treatments employed to model the coupled evolution of multiple baryon components at sub-cluster scales in modern cosmological simulations23. A consistent feature of such simulations is that the mean baryon fraction measured within the characteristic R500 length scale used in this work approaches the cosmic value as system mass increases. This aspect, along with a reduction in the population variance in baryon component mass fractions, supports the fair sampling hypothesis and allows the most massive clusters to serve as cosmic distance rulers24.

### Property covariance between hot gas observables

Figure 2 shows posterior constraints of correlations among different hot gas properties. Our results (shaded bands) of mainly strong positive correlations are consistent with both hydrodynamic simulation expectations25,26 and previous empirical measurements16,17,27. Our constraints are broader in scope, covering a larger number of measurements of the intracluster medium (ICM), including YX, YSZA, and YPl, and are in most cases more precise than the few existing estimates.

### Uncertainties and systematics

Due to the modest sample size, the uncertainty on any individual correlation coefficient remains large. Examining the upper triangle of Table 2, we find that the pairing of LK,tot and LX,ce is the strongest indicator of anti-correlation, with only a 1.8% chance of being zero or positive. As noted in Fig. 3 of the “Methods” section, the odds of a positive LK,tot–LX,ce correlation drop below one percent if the intrinsic scatter in LK,tot at fixed halo mass is larger than 0.07. For the λ measure of stellar mass, the evidence is somewhat weaker, with a 9% chance that it correlates positively with LX,ce at fixed halo mass.

It has previously been argued28,29 that the K-band integrated light is a more accurate indicator of a cluster’s total stellar mass than the number of optically-selected galaxies, i.e., optical richness. Our results appear to reinforce this finding, as the anti-correlations for LK,tot and X-ray properties are systematically more negative than those inferred using λ. But, as noted above, underestimation of the statistical uncertainty in λ could also play a role in diluting λ correlations30 (see Supplementary Figs. 2 and 3).

The findings of anti-correlation using core-excised X-ray luminosity are reinforced by the derived gas mass, Mgas. Again, the infrared light provides a tighter constraint, with only a 6% chance of being zero or positive, while the odds rise to 30% when using λ. In the companion paper, we note that the slope of the Mgas scaling with halo mass is ~2.5σ lower than values derived by previous studies based solely on X-ray observations and also slopes inferred from modern hydrodynamic simulations. A bias in slope could dilute the anti-correlation signal and explain why LX,ce provides more significant evidence of anti-correlation (Supplementary Fig. 1).

## Discussion

Property covariance has been forecast to significantly improve the power of joint, multi-wavelength survey analysis, especially in the case of anti-correlated properties31. This work helps set the stage for such analysis by providing initial estimates of stellar and hot gas covariance and refined estimates of a larger number of property correlations. While statistical errors in our correlation estimates are currently large, the coming decades will see an explosion of multi-wavelength cluster data from wide-area surveys such as the Large Synoptic Survey Telescope (LSST), Euclid, eROSITA, and the Stage-4 ground-based cosmic microwave background experiment (CMB-S4). These upcoming samples will allow a better understanding of the physics and feedback effects that regulate the ICM along with improved cosmological constraints from joint sample analysis.

For example, the application of cluster gas fraction as a standard ruler32 could benefit from the inclusion of stellar mass estimates. Just as lightcurve shape and color corrections are used to improve the quality of Type Ia supernovae distances, so could stellar mass measurements be used to derive a lower scatter distance proxy than provided by the gas fraction alone.

On the computational side, the sensitivity of cluster property population statistics to modeling treatments will become more apparent as computational advances in multi-phase plasma astrophysics enable refinements of processes at sub-resolution scales. Synthetic observations of halo populations produced along past lightcones under model-specific conditions, when mapped through survey-specific observational filters, offer a pathway for likelihood testing of increasingly sensitive, multi-wavelength observational surveys. A new era has arrived for studies of clusters of galaxies as a population, one in which astrophysics-dependent population statistics realized from simulations are tested against corresponding multi-wavelength, empirical data, with outcomes driving improvements in next generation models. Our results reinforce the discovery power of applying population statistical analysis to galaxy cluster samples with complete, uniform multi-wavelength observations that probe hot and cold phase baryons and total mass.

## Methods

### Cosmology and notation

We assume a universe with dimensionless energy densities at the current time in total matter (baryons plus dark matter) Ωm = 0.3 and vacuum energy ΩΛ = 0.7, with Hubble constant H0 = 70 km s−1 Mpc−1. The Hubble expansion rate is normalized via $$E(z) \equiv H(z){\mathrm{/}}H_0 = \sqrt {{\mathrm{\Omega }}_{\mathrm{m}}(1 + z)^3 + {\mathrm{\Omega }}_{\mathrm{\Lambda }}}$$. For the halo population, we employ a mass scale convention, M500, defined as the mass within a sphere of radius r500 within which the mean enclosed density is 500ρcrit(z), where $$\rho_{\mathrm{crit}}(z) = 3H(z)^2/(8\pi G)$$ is the critical density of the universe1. Unless stated otherwise, the weak-lensing determined radius, r500,WL, defines the aperture within which integrated observable properties are derived.
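As a worked illustration of these definitions, the following sketch evaluates E(z), the critical density, and the r500 radius implied by a given M500. The function names and the unit choices (masses in solar masses, lengths in Mpc, and the corresponding value of G) are assumptions made for this example.

```python
import numpy as np

OMEGA_M = 0.3    # total matter density parameter
OMEGA_L = 0.7    # vacuum energy density parameter
H0 = 70.0        # Hubble constant in km/s/Mpc
G = 4.30091e-9   # gravitational constant in Mpc (km/s)^2 / Msun

def E(z):
    """Dimensionless expansion rate H(z)/H0 for flat LambdaCDM."""
    return np.sqrt(OMEGA_M * (1.0 + z) ** 3 + OMEGA_L)

def rho_crit(z):
    """Critical density 3 H(z)^2 / (8 pi G) in Msun / Mpc^3."""
    H = H0 * E(z)
    return 3.0 * H**2 / (8.0 * np.pi * G)

def r500(m500, z):
    """Radius in Mpc enclosing a mean density of 500 rho_crit(z)."""
    return (3.0 * m500 / (4.0 * np.pi * 500.0 * rho_crit(z))) ** (1.0 / 3.0)

# Example: a 10^15 Msun halo at z = 0.2, within the LoCuSS redshift range.
radius_mpc = r500(1.0e15, 0.2)
```

Since ρcrit(z) scales as E(z)², the same M500 corresponds to a smaller r500 at higher redshift.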

### A multi-wavelength vector of observables

Here we describe the data vector employed in this study.

We study 41 X-ray bright clusters of the LoCuSS sample derived from RASS33,34,35. The sample is selected by redshift-dependent thresholds of X-ray luminosity, $$L_{X,\mathrm{RASS}}\,E^{-1}(z) > 4.4 \times 10^{44}\ \mathrm{erg\,s^{-1}}$$ for clusters between 0.15 < z < 0.24 and $$L_{X,\mathrm{RASS}}\,E^{-1}(z) > 7.0 \times 10^{44}\ \mathrm{erg\,s^{-1}}$$ for 0.24 < z < 0.30. For each cluster nine additional properties, listed in Table 1, have been measured. Details are provided in the companion paper19. The sample is complete in most, but not all, properties, as detailed below. The integrated observables can be grouped into three distinct sets: (i) a weak-lensing mass estimate of total system mass; (ii) quantities associated with the hot intracluster gas, and; (iii) quantities associated with stellar properties. We briefly describe each set as follows.
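The luminosity threshold can be written as a short selection function. This is a sketch of the stated cut only: it does not include the galactic latitude requirement or any other catalog-level criteria, and the function name `passes_locuss_cut` is hypothetical. The text does not say which bin a cluster at exactly z = 0.24 belongs to, so the strict inequalities are kept as written.

```python
import math

def E(z, omega_m=0.3, omega_l=0.7):
    """Dimensionless expansion rate for the adopted flat LambdaCDM model."""
    return math.sqrt(omega_m * (1.0 + z) ** 3 + omega_l)

def passes_locuss_cut(lx_rass, z):
    """Apply the redshift-dependent threshold on L_X,RASS / E(z).

    lx_rass is the RASS X-ray luminosity in erg/s.
    """
    scaled = lx_rass / E(z)
    if 0.15 < z < 0.24:
        return scaled > 4.4e44
    if 0.24 < z < 0.30:
        return scaled > 7.0e44
    # Outside the survey redshift range (or exactly on an unspecified edge).
    return False

# The same luminosity passes in the low-redshift bin but not the high one.
low_bin = passes_locuss_cut(6.0e44, 0.20)
high_bin = passes_locuss_cut(6.0e44, 0.27)
```

A hard threshold of this kind is what the regression model treats as the sample truncation.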

Using deep, multi-band optical images from Subaru/Suprime-Cam, a mass estimate for each cluster is derived by fitting the shear signal expected from weak gravitational lensing of a projected Navarro–Frenk–White (NFW, ref. 36) mass density profile to the measured tangential shear pattern18.

Properties of the hot gas content of clusters are mostly observable at X-ray and millimeter wavelengths. We use X-ray measurements of the ICM derived in ref. 37, where the selected sample has been observed with either or both of the Chandra and XMM-Newton X-ray observatories.

To avoid contamination from the complex cool core region, measurements of bolometric luminosity, LX, and gas temperature, TX, are performed in an annulus of [0.15–1]r500,WL. The gas mass, Mgas, is estimated from the observed X-ray emission profile within r500,WL. The SZ effect, caused by the inverse Compton scattering of cosmic microwave background (CMB) photons by hot electrons in the ICM12, is characterized by the parameter Y, which is proportional to the integrated electron thermal energy. The SZ effect from CMB intensity maps, YSZ, is measured via interferometry with SZA and independently with spectral filtering of Planck satellite data. A third estimate of the integrated electron thermal energy, YX, is derived from the X-ray observations as the product of gas mass, Mgas, and temperature, TX. This quantity is measured within its own iteratively-defined r500, as discussed in ref. 19.

We employ two independent measures of the stellar content of clusters, the total near-infrared (NIR) luminosity, LK, and a count of red-sequence galaxies, λ, referred to as optical richness. The NIR luminosity measurements, obtained with the WFCAM instrument on the UKIRT telescope29, determine the background-subtracted light within the weak-lensing estimated radius, r500, of each cluster, as well as LK of the BCG. NIR data are missing for one cluster (Abell 2697). The optical richness, λ, a measure of the number of red-sequence galaxies within the cluster used by the redMaPPer cluster detection algorithm38, is determined for 33 clusters in the overlap region of the LoCuSS sample and the Sloan Digital Sky Survey (SDSS, ref. 39).

### Regression model

We assume a log-normal probability distribution of cluster properties with mean values that scale as a power-law in halo mass and E(z). Because of the narrow redshift range of the LoCuSS sample, we assume standard, self-similar evolution in redshift. We employ a hierarchical Bayesian inference model that accounts for the sample selection truncation, measurement error covariance and intrinsic property covariance. An additional component of this inference model is a prior function on true halo masses derived from the halo mass function in the reference ΛCDM cosmology with σ8 = 0.8. The performance of this method to recover input scaling relations of synthetic, truncated samples is demonstrated in the companion paper19.

The key element of our model is the conditional joint property likelihood20, p(S|Mhalo, z), of a vector of observables, S (elements in Table 1), given the true mass of the halo, Mhalo, at redshift, z. For the LoCuSS sample clusters, we assume that the cluster weak-lensing mass, MWL, is an unbiased measure of Mhalo with 20% fractional scatter. Our method returns posterior estimates of the intercepts, slopes, and intrinsic variance of each property element as a function of the cluster weak-lensing mass, along with the covariance of pairs of observables. The latter is assumed to be independent of mass and redshift within the narrow ranges probed by the LoCuSS sample. Uninformative priors are used in the analysis.

Using natural logarithms of the properties, s = lnS, and mass, μ = lnMhalo, the log-mean scaling of observable a at a fixed redshift is linear

$$\langle s_a{\kern 1pt} |{\kern 1pt} \mu ,z\rangle = \pi _a + \alpha _a\mu ,$$
(1)

in which αa and πa are the slope and normalization of the scaling relation of property a.

For a pair of observables, a and b, the intrinsic property covariance matrix is

$$C_{a,b} = \frac{1}{{N - 1}}\mathop {\sum}\limits_{i = 1}^N \,\delta s_{a,i}\,\delta s_{b,i},$$
(2)

where δsa,i ≡ sa,i − αaμi − πa is the residual deviation from the mean scaling relation and N is the total number of clusters. Finally the property correlation coefficient is

$$r_{a,b} = \frac{{C_{a,b}}}{{\sqrt {C_{a,a}\,C_{b,b}} }}.$$
(3)

This correlation coefficient is the quantity of interest that is studied in this letter. Our method constrains these correlation coefficients and the scaling parameters simultaneously, while including a covariance contribution from the reported measurement errors of the properties. Further details are provided in the companion paper19 that discusses mean scaling behaviors and property variance. This paper presents off-diagonal property covariance terms, except for LX,RASS correlations which are presented in the companion paper.
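To illustrate Eqs. (1) to (3), the sketch below computes residuals about a power-law mean relation and forms the covariance and correlation matrices directly. It is a simplified point estimate, not the hierarchical Bayesian inference used in this work: it ignores measurement errors, sample selection, and the mass function prior. The covariance uses the standard 1/(N − 1) sample normalization, and the correlation coefficient of Eq. (3) does not depend on that normalization. The function name and the synthetic inputs are hypothetical.

```python
import numpy as np

def property_correlation(s, mu, alpha, pi):
    """Covariance and correlation of residuals about the mean scaling.

    s     : (N, P) array of log-properties, s = ln S
    mu    : (N,) array of log-masses, mu = ln M_halo
    alpha : (P,) slopes; pi : (P,) normalizations, as in Eq. (1)
    """
    s = np.asarray(s, dtype=float)
    mu = np.asarray(mu, dtype=float)
    # Residuals delta s_{a,i} = s_{a,i} - alpha_a mu_i - pi_a
    resid = s - (np.asarray(pi) + np.outer(mu, alpha))
    n = resid.shape[0]
    cov = resid.T @ resid / (n - 1)      # Eq. (2), sample normalization
    sig = np.sqrt(np.diag(cov))
    corr = cov / np.outer(sig, sig)      # Eq. (3)
    return cov, corr

# Synthetic population with a true correlation of -0.5 between two properties.
rng = np.random.default_rng(1)
n = 5000
mu = rng.normal(34.5, 0.3, size=n)
alpha = np.array([1.0, 0.8])
pi = np.array([-2.0, 1.5])
cov_true = np.array([[0.1**2, -0.5 * 0.1 * 0.2],
                     [-0.5 * 0.1 * 0.2, 0.2**2]])
s = pi + np.outer(mu, alpha) + rng.multivariate_normal([0.0, 0.0], cov_true, size=n)
cov, corr = property_correlation(s, mu, alpha, pi)
```

With a sample of 41 clusters rather than 5000 synthetic halos, the same estimator would carry much larger uncertainty, which is one reason a full posterior treatment is preferred.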

### Statistical significance and scatter in K-band luminosity

In the companion paper, we show that the posterior constraints on the intrinsic scatter in LK,tot are not bounded from below; values near zero are not only allowed by the data but the modal value of the posterior probability density function (PDF) is zero. The correlation coefficients between LK,tot and other properties vary substantially as the scatter in ln LK,tot drops to very low values. Very small values of this scatter, $$\sigma _{{\mathrm{ln}}L_K|M}$$, are not physically reasonable. Cosmological hydrodynamics simulations have found values of $$\sigma _{{\mathrm{ln}}L_K|M}\ > \ 0.10$$ (ref. 11), 0.32 (ref. 10), or 0.16 (ref. 40); and a recent observational study estimates a value of 0.22 ± 0.04 (ref. 41).

The confidence intervals and statistical significance of the anti-correlation signals reported in the main text employ a lower limit of $$\sigma _{{\mathrm{ln}}L_K|M} = 0.05$$. This choice is a conservative one, two times smaller than the smallest value reported above. We discard any point in the posterior chain with $$\sigma _{{\mathrm{ln}}L_K|M} \ < \ 0.05$$. All numbers reported in the main text are based on this truncated posterior distribution. For the sake of symmetry, we also impose the same limit on the richness scatter, σlnλ|M ≥ 0.05, but this has a much smaller effect as the posterior PDF has very little support in this region.

The statistical significance of the hot-cold baryon phase anti-correlation reported here is sensitive to the choice of minimum value for the stellar mass scatter. Figure 3 illustrates how the odds of a positive correlation between an optical observable and an LX observable change as a function of the imposed minimum value of $$\sigma _{{\mathrm{ln}}M_{{\mathrm{star}}}|M}$$. For a minimum value of 0.1, the odds of a positive correlation between LK and LX at fixed halo mass are 0.006, or roughly three-σ evidence. The odds that both optical measures correlate positively with LX,ce are very small, 0.005 for our fiducial minimum of 0.05 in $$\sigma _{{\mathrm{ln}}M_{{\mathrm{star}}}|M}$$.
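The truncation procedure and the scan over the minimum scatter can be sketched as follows. The chain used here is synthetic and purely illustrative: it is constructed so that the correlation coefficient is poorly constrained when the scatter is small, which mimics the qualitative behavior described above but does not reproduce the LoCuSS posterior. The function name `odds_positive` is hypothetical.

```python
import numpy as np

def odds_positive(r_chain, sigma_chain, sigma_min):
    """Fraction of retained posterior samples with a positive correlation.

    Samples with intrinsic scatter below sigma_min are discarded first.
    """
    keep = sigma_chain >= sigma_min
    if not np.any(keep):
        return np.nan
    return np.mean(r_chain[keep] > 0.0)

# Synthetic joint chain of (scatter, correlation) samples.
rng = np.random.default_rng(2)
n = 100_000
sigma_chain = np.abs(rng.normal(0.0, 0.1, size=n))       # mode at zero scatter
width = 0.15 + 0.3 * np.exp(-sigma_chain / 0.03)          # broader r at low scatter
r_chain = np.clip(rng.normal(-0.5, width), -1.0, 1.0)

# Scan the imposed minimum scatter.
minima = [0.0, 0.05, 0.10]
odds = [odds_positive(r_chain, sigma_chain, m) for m in minima]
```

In this toy setup the odds of a positive correlation fall as the minimum scatter is raised, because the discarded low-scatter samples are the ones with the least constrained correlation.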