The science driver for massive galaxy spectroscopic surveys is to extract cosmological information from the clustering of galaxies in the past lightcone. Baryon acoustic oscillations (BAO)1, formed in the early Universe through interactions between photons and baryons under pressure and gravity, imprint a distinctive clustering pattern of galaxies at a characteristic comoving scale of about 150 Mpc, which is one of the key probes of dark energy2,3. The increasing size of galaxy redshift surveys over the decade 2000–2010 ultimately led to a 5σ detection of BAO by the Baryon Oscillation Spectroscopic Survey (BOSS)4. This enabled the BAO to be used as an accurate standard ruler to measure the geometry of the Universe and constrain the cosmic expansion history. A compilation of results from the Sloan Digital Sky Survey (SDSS) galaxy survey recently demonstrated the power of this technique5.

The BAO feature is generally blurred by the nonlinear evolution of the Universe reducing its strength as a standard ruler, and various reconstruction methods have been developed to sharpen the BAO peak by undoing the nonlinear evolution of the density field. The commonly used Lagrangian reconstruction, for example, linearises the density field by shifting the galaxies using the displacement field6,7,8, while for the Eulerian reconstruction, manipulation is performed at the field level without moving the galaxies9.

Although originally designed to boost the BAO signal, the reconstruction method can, in principle, also better extract the general cosmological information from the clustering. For example, the redshift space distortions (RSD)10,11,12, which are caused by peculiar motions of galaxies under gravity, can also be better constrained using the reconstructed sample13.

The standard method for BAO reconstruction alters the over-density field so that it is more correlated with the initial linear field14, and the level of mode-coupling can be greatly reduced15. It achieves this by inferring the bulk flows from the observed galaxy field and then removing these displacements from both the galaxy positions and the map of expected density that sets the baseline from which the over-densities are found. The power spectrum of the post-reconstructed sample (\(P_{\rm post}\)) provides additional information for cosmology compared with that of the pre-reconstructed sample (\(P_{\rm pre}\)) because the reconstruction restores the linear signal reduced by the non-linear evolution.

The higher order statistics such as B, the bispectrum, induced by the non-linear evolution are, in turn, reduced. Thus we can extract more cosmological information encoded in the linear density field from Ppost than \({P}_{{{{{{{{\rm{pre}}}}}}}}}\). On the other hand, the non-linear field contains information on small-scale clustering such as galaxy biases, which can provide better constraints on cosmological parameters by breaking the degeneracy between them. For the pre-reconstructed sample, this information can be extracted by combining the power spectrum with higher-order statistics. However, the power spectrum and higher order statistics such as the bispectrum are correlated, reducing our ability to estimate cosmological parameters. If we instead consider the post-reconstructed power spectrum, the covariance between the power spectrum and the bispectrum is reduced and we can extract the information more efficiently. In this work, we show that the same improvement can be achieved by a joint analysis of \({P}_{{{{{{{{\rm{pre}}}}}}}}}\), Ppost and Pcross (the cross-power spectrum between the pre- and post-reconstructed density fields). Due to the restored linear signal in the reconstructed density field, Ppost is decorrelated with \({P}_{{{{{{{{\rm{pre}}}}}}}}}\) on small scales, which are dominated by the non-linear information. On these scales, the combination of \({P}_{{{{{{{{\rm{pre}}}}}}}}}\), Ppost and Pcross has a similar ability to extract cosmological information as the combination of Ppost or \({P}_{{{{{{{{\rm{pre}}}}}}}}}\) with the bispectrum because we are able to use the linear information in Ppost and higher-order information in \({P}_{{{{{{{{\rm{pre}}}}}}}}}\) separately.

Let us rewrite the non-linear over-density field as δ = R + Δ, where R is the over-density field after reconstruction, which is closer to the linear field. It is then straightforward to express \(P_{\rm pre}\) and \(P_{\rm cross}\) in terms of \(P_{\rm RR}\) (\(=P_{\rm post}\)), \(P_{\Delta\Delta}\) (the power spectrum of Δ) and \(P_{{\rm R}\Delta}\) (the cross-power spectrum between R and Δ). Using perturbation theory16, we can show that, at leading order, \(P_{\Delta\Delta}\) contains the integrated contribution from the bispectrum of squeezed-limit triangles, while \(P_{{\rm R}\Delta}\) contains the integrated contribution from the trispectrum (T) of folded/squeezed-limit quadrilaterals (see Supplementary Note 1 for an explanation). In this fashion, when combining \(P_{\rm pre}\) and \(P_{\rm cross}\) with \(P_{\rm post}\), we are essentially adding in higher-order signal, thus naturally gaining information. Note that in order to match the information obtained by adding these two extra statistics, it is not enough to consider the bispectrum signal of the pre-reconstructed field alone; both bispectrum and trispectrum signals are needed. For this reason the information content of \(P_{\rm pre}+B\) is different from that contained in \(P_{\rm pre}+P_{\rm cross}+P_{\rm post}\). However, it is important to note that the higher-order information that reconstruction brings is only a part of the total contained in the full bispectrum and trispectrum data-vectors. This is why a full analysis using P + B + T will always provide more information. Such an analysis is not very practical, though, because of the size of the full data-vector and the computational time typically required to measure B and especially T directly. In this paper we show that \(P_{\rm pre}+P_{\rm cross}+P_{\rm post}\) is an efficient alternative for extracting the relevant information from higher-order statistics for cosmological analyses.
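As a short worked step, the decomposition δ = R + Δ can be expanded term by term. The relations below are a sketch that assumes the cross-spectrum is symmetric, \(P_{{\rm R}\Delta}=P_{\Delta{\rm R}}\), and writes \(\langle\cdot,\cdot\rangle\) for the power-spectrum average of two fields:

$$P_{\rm pre}=\langle {\rm R}+\Delta,{\rm R}+\Delta\rangle =P_{\rm RR}+2P_{{\rm R}\Delta}+P_{\Delta\Delta},\qquad P_{\rm cross}=\langle {\rm R}+\Delta,{\rm R}\rangle =P_{\rm RR}+P_{{\rm R}\Delta},\qquad P_{\rm post}=P_{\rm RR}.$$

Solving these three relations for \(P_{{\rm R}\Delta}\) and \(P_{\Delta\Delta}\) gives \(P_{{\rm R}\Delta}=P_{\rm cross}-P_{\rm post}\) and \(P_{\Delta\Delta}=P_{\rm pre}+P_{\rm post}-2P_{\rm cross}\).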

To demonstrate the power of jointly using density fields before and after the reconstruction, we perform an anisotropic Lagrangian reconstruction (see Methods for details) on each realisation of the Molino galaxy mocks17, which is a large suite of realistic galaxy mocks produced from the Quijote simulations18 at z = 0. We then use these mocks to calculate the data covariance matrix and derivatives numerically for a Fisher matrix analysis19 using the measured multipoles (up to \(\ell = 4\)) of \(P_{\rm pre}\), \(P_{\rm post}\) and \(P_{\rm cross}\) on the parameter set \(\Theta \equiv \{\Omega_{\rm m}, \Omega_{\rm b}, h, n_{\rm s}, \sigma_8, M_\nu, {\bf H}\}\), where \({\bf H}\) denotes the Halo Occupation Distribution (HOD) parameters, i.e., \({\bf H}\equiv \{\log M_{\min},\sigma_{\log M},\log M_{0},\alpha,\log M_{1}\}\)20 (see Methods for details).
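The Fisher projection described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the pipeline used for the Molino mocks: the function name `fisher_matrix`, the toy linear model, and the optional Hartlap correction of the inverse covariance are all introduced here for illustration only.

```python
import numpy as np

def fisher_matrix(derivs, cov, n_mocks=None):
    """Gaussian Fisher matrix F_ij = (dD/dtheta_i)^T C^{-1} (dD/dtheta_j).

    derivs  : (n_params, n_data) derivatives of the mean data vector
    cov     : (n_data, n_data) covariance estimated from mocks
    n_mocks : if given, rescale C^{-1} by the Hartlap factor
              (an assumption of this sketch, not taken from the text)
    """
    cinv = np.linalg.inv(cov)
    if n_mocks is not None:
        n_data = cov.shape[0]
        cinv = cinv * (n_mocks - n_data - 2) / (n_mocks - 1)
    return derivs @ cinv @ derivs.T

# Toy stand-in for the mocks: a linear model D = response @ theta plus noise.
rng = np.random.default_rng(1)
n_mocks, n_data, n_params = 2000, 6, 3
response = rng.normal(size=(n_data, n_params))         # hypothetical dD/dtheta
fiducial_mocks = rng.normal(scale=0.1, size=(n_mocks, n_data))

cov = np.cov(fiducial_mocks, rowvar=False)              # data covariance from mocks
derivs = response.T                                     # exact for a linear model
fisher = fisher_matrix(derivs, cov, n_mocks=n_mocks)

# Marginalised 68% uncertainties are the square roots of diag(F^{-1}).
sigma_marg = np.sqrt(np.diag(np.linalg.inv(fisher)))
```

In a real analysis the data vector would be the concatenated multipoles of the three power spectra, and the derivatives would come from mocks at shifted parameter values rather than from a known response matrix.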

Results and discussion

Panel a in Fig. 1 shows the measured power spectra monopole (the quadrupole and hexadecapole are shown in Supplementary Fig. 1), and we see that \(P_{\rm cross}\) decreases dramatically with scale compared to \(P_{\rm pre}\) and \(P_{\rm post}\). This indicates a decorrelation between \(P_{\rm pre}\) and \(P_{\rm post}\) below quasi-nonlinear scales (\(k \gtrsim 0.1\,h\,{\rm Mpc}^{-1}\)), which is largely due to the difference in infra-red effects contained in density fluctuations before and after the BAO reconstruction21.

Fig. 1: The measured power spectrum monopole.

a The monopole (multiplied by k) of three types of power spectra indicated in the legend, measured from the Molino galaxy mocks; b The rotated power spectra monopole defined in Eq. (1). In both panels, the lines in the centre denote the mean of the mocks and the shades represent the 68% confidence level uncertainty.

From the original data vector \(\{P_{\rm pre},P_{\rm post},P_{\rm cross}\}\), we can construct the linear combinations \(P_{{\rm R}\Delta}\), \(P_{\Delta\Delta}\) and \(P_{\delta\Delta}\), defined as

$$P_{{\rm R}\Delta}=P_{\rm cross}-P_{\rm post}\,;\quad P_{\Delta\Delta}=P_{\rm pre}+P_{\rm post}-2P_{\rm cross}\,;\quad P_{\delta\Delta}=P_{\rm pre}-P_{\rm cross}.$$

Figure 1b shows these power spectra. As discussed above, these power spectra involving Δ contain the information of part of the high-order statistics such as bispectrum and trispectrum.
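The rotation in Eq. (1) is a fixed linear map, so it can be checked on toy fields. The sketch below is illustrative only: `rotate_spectra` and `toy_power` are hypothetical helpers, and the "power" is a plain average of field products over random realisations rather than a measured spectrum.

```python
import numpy as np

def rotate_spectra(p_pre, p_post, p_cross):
    """Rotate {P_pre, P_post, P_cross} into {P_RDelta, P_DeltaDelta, P_deltaDelta}."""
    p_r_delta = p_cross - p_post
    p_delta_delta = p_pre + p_post - 2.0 * p_cross
    p_small_delta_delta = p_pre - p_cross
    return p_r_delta, p_delta_delta, p_small_delta_delta

def toy_power(a, b):
    """Toy 'power spectrum': realisation-averaged product in each bin."""
    return np.mean(a * b, axis=0)

# Toy fields: R is the reconstructed field, Delta the residual, delta = R + Delta.
rng = np.random.default_rng(0)
recon = rng.normal(size=(500, 8))
resid = 0.3 * recon + 0.5 * rng.normal(size=(500, 8))
delta = recon + resid

p_pre = toy_power(delta, delta)
p_post = toy_power(recon, recon)
p_cross = toy_power(delta, recon)

p_r_delta, p_delta_delta, p_small_delta_delta = rotate_spectra(p_pre, p_post, p_cross)
```

Because the map is linear, the rotated spectra reproduce the directly computed averages of R·Δ, Δ·Δ and δ·Δ up to floating-point rounding.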

The derivatives of \(\{P_{\rm pre},P_{\rm post},P_{\rm cross}\}\) with respect to cosmological parameters and HOD parameters are presented in Supplementary Figs. 2–4. We have checked and confirmed the convergence of our Fisher matrix result given the number of mocks available, demonstrating the robustness of our result (see Supplementary Note 2 and Supplementary Fig. 5 for details).

The correlation matrix for the monopole of the power spectrum and bispectrum (only the correlation with the squeezed limit of \(B_0\) is visualised for brevity) is shown in Fig. 2. It is seen that \(P_{0}^{\rm pre}\) highly correlates with \(B_0\), confirming that the bispectrum is induced by nonlinearities. In contrast, \(P_{0}^{\rm post}\) weakly correlates with \(B_0\), or with \(P_{0}^{\rm pre}\) and \(P_{0}^{\rm cross}\) on nonlinear scales (e.g., at \(k \gtrsim 0.2\,h\,{\rm Mpc}^{-1}\)). This, however, does not mean that \(P_{\rm post}\) is irrelevant to the bispectrum; it actually is a mixture of \(P_{\rm pre}\) and certain integrated forms of the bispectrum and trispectrum information9,16. Therefore, by combining \(P_{\rm post}\) with \(P_{\rm pre}\) and \(P_{\rm cross}\), one can in principle decouple the leading contributions in the power spectrum, bispectrum and trispectrum. The integrated form of the bispectrum information dominates \(P_{\Delta\Delta}\), which strongly correlates with \(B_0\), as shown in Supplementary Fig. 6 (see Supplementary Note 3). The fact that \(P_{\rm post}\) barely correlates with \(B_0\) implies that the information content in \(P_{\rm post}\) combined with \(B_0\) may be similar to that in \(P_{\rm post}\) combined with \(P_{\rm pre}\) and \(P_{\rm cross}\), which is confirmed to be the case by the results from the Fisher analysis presented below.
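A correlation matrix of the kind discussed here is obtained by normalising the mock covariance by its diagonal. The following is a minimal sketch with toy data; `correlation_matrix` and the two toy "spectra" are hypothetical stand-ins for the concatenated power spectrum and bispectrum measurements.

```python
import numpy as np

def correlation_matrix(cov):
    """r_ij = C_ij / sqrt(C_ii C_jj) for a covariance matrix C."""
    sigma = np.sqrt(np.diag(cov))
    return cov / np.outer(sigma, sigma)

# Toy mocks: two observables sharing a common fluctuation in each of 4 bins.
rng = np.random.default_rng(2)
n_mocks, n_k = 1000, 4
common = rng.normal(size=(n_mocks, n_k))
obs_a = common + 0.2 * rng.normal(size=(n_mocks, n_k))  # strongly tied to 'common'
obs_b = common + 2.0 * rng.normal(size=(n_mocks, n_k))  # weakly tied to 'common'

data = np.hstack([obs_a, obs_b])                        # concatenated data vector
corr = correlation_matrix(np.cov(data, rowvar=False))
```

The off-diagonal blocks of `corr` play the role of the cross-blocks in Fig. 2: they show how strongly one observable traces the other bin by bin.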

Fig. 2: Part of the correlation matrix between the power spectra and bispectrum.

The correlation matrix for the monopole of three types of power spectra (\(P_{0}^{\rm pre},P_{0}^{\rm post},P_{0}^{\rm cross}\)), and of the bispectrum in the squeezed limit (\(B_{0}^{\rm SL}\)), i.e. \(k_1 = k_2 \gg k_3\), derived from the Molino galaxy mocks. The horizontal and vertical lines separate each block for visualisation. For all blocks, the associated \(k\) or \(k_1\) increases from 0.01 to 0.5 \(h\,{\rm Mpc}^{-1}\), from left to right and from bottom to top.

The cumulative signal-to-noise ratio (SNR) for power spectrum multipoles (up to \(\ell = 4\)), the bispectrum monopole and various data combinations is shown in Supplementary Fig. 7, in which we can see that the joint 2-point statistics, \(P_{\rm all}\), is measured with a greater SNR than \(P_{\rm post}\) or \(P + B_0\), which suggests that \(P_{\rm all}\) can be more informative than \(P_{\rm post}\) or \(P + B_0\) for constraining cosmological parameters.
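For orientation, a cumulative SNR can be computed as below. The definition used, SNR² = Dᵀ C⁻¹ D restricted to bins with k ≤ k_max, is the conventional one and is an assumption here, since the excerpt does not spell it out; the toy spectrum and covariance are invented for illustration.

```python
import numpy as np

def cumulative_snr(k, data, cov, kmax_values):
    """SNR(k_max) = sqrt(D^T C^{-1} D) using only bins with k <= k_max."""
    snr = []
    for kmax in kmax_values:
        sel = k <= kmax
        d = data[sel]
        c = cov[np.ix_(sel, sel)]
        snr.append(np.sqrt(d @ np.linalg.solve(c, d)))
    return np.array(snr)

# Toy data vector: 50 bins of width 0.01 h/Mpc, labelled by their centres.
k = 0.01 * (np.arange(50) + 0.5)
data = 1.0e4 * np.exp(-k / 0.2)
sigma = 0.05 * data                                     # 5% errors (hypothetical)
lag = np.abs(np.subtract.outer(np.arange(50), np.arange(50)))
cov = np.outer(sigma, sigma) * 0.3 ** lag               # neighbouring bins correlated

kmax_values = np.array([0.1, 0.2, 0.5])
snr = cumulative_snr(k, data, cov, kmax_values)
```

The SNR can only grow as more bins are included, so comparing curves for different data combinations at fixed k_max is what carries the information.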

To confirm the constraining power of \(P_{\rm all}\), we then project the information content in the observables onto cosmological parameters using a Fisher matrix approach. Contour plots for \((\log M_{0},\sigma_{8})\) derived from different datasets with two choices of \(k_{\max}\) (the maximal k for the observables used in the analysis) are shown in Fig. 3 (more complete contour plots are shown in Supplementary Figs. 8–12). The smoothing scale is set to be \(S = 10\,h^{-1}\,{\rm Mpc}\) when performing the reconstruction. The degeneracies between parameters using \(P_{\rm pre}\), \(P_{\rm post}\) and \(P_{\rm cross}\) are generally different, because \(P_{\rm pre}\), \(P_{\rm post}\) and \(P_{\rm cross}\) differ to a large extent in terms of nonlinearity on small scales. This is easier to see in Supplementary Fig. 8, in which contours for the same parameters are shown for observables used in several k intervals. The contours derived from \(P_{\rm pre}\) and \(P_{\rm post}\) generally rotate as k increases because of the onset of nonlinear effects, which affect \(P_{\rm pre}\) and \(P_{\rm post}\) at different levels on the same scale. This significantly improves the constraint when these power spectra are combined, labelled as \(P_{\rm all}\), which is tighter than that from the traditional joint power spectrum-bispectrum analysis (\(P_{\rm pre}+B_{0}\)). It is found that \(P_{\rm all}\) can even win against \(P_{\rm post} + B_0\) in some cases, demonstrating the robustness of this method. The contour plots with 1D posterior distributions for all parameters with \(S = 10\) and \(20\,h^{-1}\,{\rm Mpc}\) and \(k_{\max}=0.2\) and \(0.5\,h\,{\rm Mpc}^{-1}\) are shown in Supplementary Figs. 9–12, respectively. In all cases, \(P_{\rm all}\) offers competitive constraints on all parameters, even compared to the joint \(P_{\rm post} + B_0\) analysis.

Fig. 3: The 68% confidence level contour plots on σ8 and \(\log {M}_{0}\) derived from various data combinations.

a, b Show the constraint with \({k}_{\max }=0.2\) and 0.5 hMpc−1, respectively. In each panel, the constraints are from the pre-reconstructed power spectrum (\({P}_{{{{{{{{\rm{pre}}}}}}}}}\)) alone (grey dashed line), post-reconstructed power spectrum (Ppost) alone (dark blue dash-dotted line), cross power spectrum between the pre- and post-reconstructed density fields (Pcross) alone (green dotted line), the combination of pre-, post-reconstructed and cross-power spectra (Pall) (red solid line), the combination of pre-reconstructed power spectrum and bispectrum (\({P}_{{{{{{{{\rm{pre}}}}}}}}}+{B}_{{{{{{{{\rm{0}}}}}}}}}\)) (light blue dash-dot-dotted line), the combination of post-reconstructed power spectrum and bispectrum (Ppost + B0) (grey filled region) and the combination of Pall and bispectrum (Pall + B0) (purple solid line).

To further quantify our results, in Fig. 4 we compare the square root of the Fisher matrix element for each parameter, with and without marginalising over others, derived from Pall and \({P}_{{{{{{{{\rm{pre}}}}}}}}}+{B}_{0}\), respectively, with two choices of \({k}_{\max }\).

Fig. 4: The Figure of Merit (FoM) of each individual parameter derived from all power spectra combined, rescaled by those derived from \(P_{\rm pre}+B_{0}\).

The quantities X in (a) and Y in (b) are defined as the FoM of each individual parameter with all other parameters fixed or marginalised over, respectively. Specifically, \(X\equiv \sqrt{F_{ii}}\) and \(Y\equiv 1/\sqrt{C_{ii}}\), where F is the Fisher matrix and \(C \equiv F^{-1}\). \(D_0\) and \(D_1\) denote \(P_{\rm all}\) and \(P_{\rm pre}+B_{0}\), respectively. The dark and light grey bars in each panel show the cases with \(k_{\max} = 0.2\) and \(0.5\,h\,{\rm Mpc}^{-1}\), respectively. The vertical dashed lines show a full recovery of information from dataset \(P_{\rm pre}+B_{0}\). The smoothing scale is set to be \(10\,h^{-1}\,{\rm Mpc}\) when performing the reconstruction.

For \(k_{\max}=0.2\,h\,{\rm Mpc}^{-1}\), we see that the Fisher information for each parameter (panel a: without marginalising over others) derived from \(P_{\rm all}\) is identical to or even greater than that in \(P_{\rm pre}+B_{0}\). In other words, by combining all power spectra we can efficiently extract the information in \(P_{\rm pre}+B_{0}\). After marginalising over other parameters, panel b shows that the uncertainty on each parameter gets redistributed due to the degeneracy. The ratios for the HOD parameters are all greater than unity, especially for \(\log M_{0}\) and \(\sigma_{\log M}\), demonstrating the power of our method for constraining HOD parameters. The information content for cosmological parameters in \(P_{\rm pre}+B_{0}\) is well recovered by using \(P_{\rm all}\), although the recovery for \(M_\nu\) is relatively worse. The overall trend for the case of \(k_{\max}=0.5\,h\,{\rm Mpc}^{-1}\) is similar, although the advantage of using \(P_{\rm all}\) over \(P_{\rm pre}+B_{0}\) is degraded to some extent. However, \(P_{\rm all}\) is still competitive: it almost fully recovers the information for the HOD parameters in \(P_{\rm pre}+B_{0}\) with or without marginalisation, and largely wins against \(P_{\rm pre}+B_{0}\) after marginalisation. Regarding the cosmological parameters, \(P_{\rm all}\) recovers all information in \(P_{\rm pre}+B_{0}\) before the marginalisation, although the recovery is slightly worse for \(M_\nu\). After marginalisation, when the uncertainties are redistributed, the constraint from \(P_{\rm all}\) is generally worse than that from \(P_{\rm pre}+B_{0}\), especially for \(M_\nu\).

The 68% confidence level constraints on each parameter from fits to various datasets are shown in Table 1. To quantify the information gain, we evaluate the Figure-of-Merit (FoM) defined as \({\left[\det (F)\right]}^{1/(2N_{\rm p})}\), where F denotes the Fisher matrix and \(N_{\rm p}\) is the total number of free parameters. For ease of comparison, for cases with different \(k_{\max}\), we normalise all the quantities using the corresponding one for \(P_{\rm pre}\). As shown, for \(k_{\max}=0.2\,h\,{\rm Mpc}^{-1}\), \(({\rm FoM})_{P_{\rm all}}\) is greater than all others: it is larger than \(({\rm FoM})_{P_{\rm pre}}\) and \(({\rm FoM})_{P_{\rm post}}\) by a factor of 2.7 and 1.7, respectively, and it is even greater than \(({\rm FoM})_{P_{\rm post}+B_{0}}\) by ~13%. For \(k_{\max}=0.5\,h\,{\rm Mpc}^{-1}\), \(({\rm FoM})_{P_{\rm all}}\) is also larger than \(({\rm FoM})_{P_{\rm pre}}\) and \(({\rm FoM})_{P_{\rm post}}\), by a factor of 2.1 and 1.5, respectively, and is the same as \(({\rm FoM})_{P_{\rm pre}+B_{0}}\), but is less than \(({\rm FoM})_{P_{\rm post}+B_{0}}\) by ~10% in this case.
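The summary statistics used in Fig. 4 and Table 1 follow directly from the Fisher matrix. The sketch below evaluates the FoM defined above together with the per-parameter quantities \(X=\sqrt{F_{ii}}\) and \(Y=1/\sqrt{C_{ii}}\); the function names and the 2×2 toy matrices are hypothetical and carry no physical meaning.

```python
import numpy as np

def figure_of_merit(fisher):
    """FoM = det(F)^(1 / (2 N_p)), evaluated via the log-determinant for stability."""
    sign, logdet = np.linalg.slogdet(fisher)
    if sign <= 0:
        raise ValueError("Fisher matrix must be positive definite")
    return np.exp(logdet / (2 * fisher.shape[0]))

def x_unmarginalised(fisher):
    """X_i = sqrt(F_ii): all other parameters held fixed."""
    return np.sqrt(np.diag(fisher))

def y_marginalised(fisher):
    """Y_i = 1 / sqrt(C_ii) with C = F^{-1}: other parameters marginalised over."""
    return 1.0 / np.sqrt(np.diag(np.linalg.inv(fisher)))

# Toy Fisher matrices for two hypothetical datasets D0 and D1.
fisher_d0 = np.array([[4.0, 1.0], [1.0, 9.0]])
fisher_d1 = np.diag([4.0, 9.0])

fom_ratio = figure_of_merit(fisher_d0) / figure_of_merit(fisher_d1)
x_ratio = x_unmarginalised(fisher_d0) / x_unmarginalised(fisher_d1)
y_ratio = y_marginalised(fisher_d0) / y_marginalised(fisher_d1)
```

Parameter degeneracies (off-diagonal Fisher elements) can only lower Y relative to X, which is why the marginalised and unmarginalised panels of Fig. 4 can tell different stories for the same dataset.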

Table 1 A quantification of the information content extracted from various observables measured from the Molino mocks

To highlight the constraining power on cosmological parameters, we also list \({\rm FoM}_{\cos}\), which is the FoM with all HOD parameters fixed. It shows a similar trend to \({\rm FoM}_{\Theta}\): \(P_{\rm all}\) is the most informative data combination for \(k_{\max}=0.2\,h\,{\rm Mpc}^{-1}\), but it is outperformed by \(P_{\rm pre}+B_{0}\) and \(P_{\rm post} + B_0\) by 13% and 30%, respectively, for the case of \(k_{\max}=0.5\,h\,{\rm Mpc}^{-1}\).


As demonstrated in this analysis, a joint analysis using \({P}_{{{{{{{{\rm{pre}}}}}}}}},{P}_{{{{{{{{\rm{post}}}}}}}}}\) and Pcross is an efficient way to extract high-order information from galaxy catalogues, and in some cases, Pall is more informative even than Ppost + B0, which is computationally much more expensive.

In this example, the k-binnings for P and B differ: \(\Delta k(B) = 3k_{\rm f} \approx 0.019\,h\,{\rm Mpc}^{-1} \approx 1.9\,\Delta k(P)\), where \(k_{\rm f}\) denotes the fundamental k mode given the box size of the simulation. We have checked that using a finer k-binning for B only improves the constraints marginally22: the FoM is raised by only ~10% when \(\Delta k(B)\) is reduced from \(3k_{\rm f}\) to \(k_{\rm f}\), which is largely due to the strong mode-coupling in B, as shown in Fig. 2. Such a fine binning is not practical anyway since, for example, using \(\Delta k(B) = k_{\rm f}\) up to \(k = 0.5\,h\,{\rm Mpc}^{-1}\), we end up with more than 50,000 data points to measure for \(B_0\).
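The binning numbers quoted above can be reproduced with a few lines of arithmetic. The box size of 1000 h⁻¹ Mpc is an assumption of this sketch: it is the standard Quijote box and is consistent with the quoted 3k_f ≈ 0.019 h Mpc⁻¹, but it is not stated in the text above.

```python
import numpy as np

box_size = 1000.0                 # h^-1 Mpc; assumed Quijote box size
k_f = 2.0 * np.pi / box_size      # fundamental mode, in h/Mpc

dk_bispectrum = 3.0 * k_f         # bispectrum bin width, Delta k(B) = 3 k_f
dk_power = 0.01                   # power spectrum bin width from Methods, in h/Mpc

ratio = dk_bispectrum / dk_power  # Delta k(B) in units of Delta k(P)
n_bins_fine = int(0.5 / k_f)      # number of k_f-wide bins below k = 0.5 h/Mpc
```

With a bin width of k_f, each side of a bispectrum triangle can take several tens of values below k = 0.5 h Mpc⁻¹, which is what makes the full triangle count so large.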

Note that the Molino mock is produced at z = 0, where the nonlinear effects are the strongest. At higher redshifts, the density fields are more linear and Gaussian, thus we may expect less gain from our method. This can be seen from panel a of Supplementary Fig. 13, in which the correlation between \({P}_{{{{{{{{\rm{pre}}}}}}}}}\) and Ppost at various redshifts is shown. As expected, \({P}_{{{{{{{{\rm{pre}}}}}}}}}\) and Ppost are more correlated at higher redshifts, e.g., the correlation approaches 0.95 at z = 5 around k ~ 0.3 hMpc−1, which implies that almost no information gain can be obtained at such high redshifts. As argued previously, the decorrelation at lower z is due to the fact that \({P}_{{{{{{{{\rm{pre}}}}}}}}}\) and Ppost contain different levels of nonlinearity, as illustrated in panel b, thus are complementary. Also, the Alcock-Paczyński (AP) effect23, which is a geometric distortion due to the discrepancy between the true cosmology and the fiducial one used to convert redshifts to distances, is irrelevant at z = 024. As studied25, the AP effect can make the small-scale bispectrum more informative for constraining the standard ruler than the power spectrum (\({P}_{{{{{{{{\rm{pre}}}}}}}}}\)), thus it is worth revisiting the case in which Ppost and Pcross are added to the analysis.

To further demonstrate the efficacy of our method, we perform another analysis at a higher redshift using P and \(B_0\) from an independent set of mocks: a suite of 4000 high-resolution N-body mocks (\(512^3\) particles in a box with \(500\,h^{-1}\,{\rm Mpc}\) on a side) produced at z = 1.02. This allows us to include the AP effect when performing the BAO and RSD analysis. This test confirms that \(P_{\rm pre}\), \(P_{\rm post}\) and \(P_{\rm cross}\) are complementary for constraining cosmological parameters, and that \(P_{\rm all}\) contains almost all the information in P combined with \(B_0\), which is consistent with our findings from the Molino analysis (see Methods and Supplementary Figs. 16–19 for more details).

Stage-IV redshift surveys including the Dark Energy Spectroscopic Instrument (DESI)26, Euclid27 and the Prime Focus Spectrograph28 will release galaxy maps over a wide range of redshifts with an exquisite precision. As long as the distribution of a tracer in a given redshift range is not too sparse, namely, the number density is not lower than 10−4 h3Mpc−3 so that a reconstruction can be efficiently performed29, the method presented in this work can be directly applied to extract high order statistics for constraining cosmological parameters from 2-point measurements, which is computationally much more efficient to perform. Since the reconstruction will be performed anyway for most ongoing and forthcoming galaxy surveys to improve the BAO signal, our proposed analysis can be performed at almost no additional computational cost.

Additional work is required to build a link between cosmological parameters and the full shape of the power spectra for a likelihood analysis, and this is challenging using perturbation-theory-based models on (quasi-) nonlinear scales, especially for the reconstructed power spectrum and the cross power spectrum. However, model-free approaches, including simulation-based emulation30,31,32, can be used for performing the \(P_{\rm all}\) analysis down to nonlinear scales, in order to extract the cosmological information from the power spectra to the greatest extent. The emulator-based \(P_{\rm all}\) analysis was recently performed and validated33, which supports the idea proposed in this work.


The mock catalogues—Molino galaxy mocks at z = 0

The Molino catalogues17 are a suite of publicly available galaxy mock catalogues that were constructed to quantify the total cosmological information content of different galaxy clustering observables using Fisher matrix forecasting. They are constructed from the Quijote suite of N-body simulations18 using the halo occupation distribution (HOD) framework. HOD provides a statistical prescription for populating dark matter halos with central and satellite galaxies and has been successful in reproducing a wide range of observed galaxy clustering statistics. In particular, the Molino catalogues use the standard HOD model20, which has five free parameters: \(\{\log M_{\min},\sigma_{\log M},\log M_{0},\alpha,\log M_{1}\}\). Molino includes 15,000 galaxy catalogues that are constructed at a fiducial set of cosmological parameters (\(\Omega_{\rm m} = 0.3175\), \(\Omega_{\rm b} = 0.049\), \(h = 0.6711\), \(n_{\rm s} = 0.9624\), \(\sigma_8 = 0.834\), \(M_\nu = 0\)) and HOD parameters (\(\log M_{\min}=13.65\), \(\sigma_{\log M}=0.2\), \(\log M_{0}=14.0\), \(\alpha=1.1\), \(\log M_{1}=14.0\)), which are based on the best-fit HOD parameters for the SDSS \(M_r < -21.5\) and \(-22\) samples20.

The 15,000 Molino mocks for the fiducial cosmology are designed for accurately estimating the covariance matrices of the galaxy clustering observables, including the power spectra and bispectra. In addition, a separate set of Molino mocks is produced for estimating the derivatives with respect to cosmological parameters (including the HOD ones) using the finite difference method (see Supplementary Note 2 for details). For this purpose, 60,000 galaxy mocks are constructed at 24 cosmologies that are slightly different from the fiducial one17.
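As an illustration of the finite difference step, the sketch below estimates a derivative from mocks generated at parameter values shifted up and down by a small step. It assumes a symmetric (central) scheme and noise realisations that are matched between the two shifted sets; both are assumptions of this toy example, and the actual scheme for each parameter is the one described in Supplementary Note 2.

```python
import numpy as np

def central_difference(mocks_plus, mocks_minus, step):
    """dD/dtheta ~ [<D(theta + step)> - <D(theta - step)>] / (2 step)."""
    return (mocks_plus.mean(axis=0) - mocks_minus.mean(axis=0)) / (2.0 * step)

def toy_model(theta, shape):
    """Hypothetical observable that depends quadratically on one parameter."""
    return theta**2 * shape

rng = np.random.default_rng(3)
n_mocks, n_k = 200, 5
shape = np.linspace(1.0, 2.0, n_k)
theta_fid, step = 0.8, 0.02

# Matched noise: the same realisations are used at theta + step and theta - step,
# so the noise cancels in the difference (an assumption of this toy setup).
noise = rng.normal(scale=0.1, size=(n_mocks, n_k))
mocks_plus = toy_model(theta_fid + step, shape) + noise
mocks_minus = toy_model(theta_fid - step, shape) + noise

deriv = central_difference(mocks_plus, mocks_minus, step)
```

For a quadratic dependence the central difference is exact, so `deriv` equals 2 θ_fid times the shape vector; for a general observable the step size trades truncation error against noise.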

Since the data covariance matrices and the derivatives are all evaluated numerically using mocks, it is important to ensure that the result derived from the Fisher matrix approach is robust against numerical issues, as argued in refs. 17,34,35,36. We therefore perform numerical tests to check the dependence of our Fisher matrix calculation on the number of mocks, and find that the marginalised uncertainties of all the parameters concerned are well converged given the number of mocks available. The details are presented in Supplementary Note 2.

The mock catalogues—4000 high-resolution N-body mocks at z = 1.02

To confirm our findings from the Molino mocks, we perform an independent mock test on a suite of 4000 high-resolution N-body simulations with \(512^3\) dark matter particles in an \(L = 500\,h^{-1}\,{\rm Mpc}\) box at z = 1.02 (ref. 13). The fiducial cosmology used for this set of mocks is consistent with the Planck 201537 observations.

The mock catalogues—COLA mocks at multiple redshifts

To investigate how the decorrelation between \(P_{\rm pre}\) and \(P_{\rm post}\) varies with redshift, we perform another set of N-body simulations using the COmoving Lagrangian Acceleration (COLA)38 method with the MG-PICOLA code39. The mocks are run using \(256^3\) dark matter particles in an \(L = 256\,h^{-1}\,{\rm Mpc}\) box, and snapshots at z = 0, 1, 2, 3, 5, 10, 15 are analysed to cover a sufficiently wide range of redshifts. Although the COLA mocks are approximate, their accuracy and reliability have been well demonstrated in the literature38,39,40.

The reconstruction process

An anisotropic reconstruction41 is performed on each realisation of the Molino galaxy mocks with two choices of the smoothing scale, \(S = 10\) and \(20\,h^{-1}\,{\rm Mpc}\) (all results presented in the main text are for the \(10\,h^{-1}\,{\rm Mpc}\) case, while results for \(20\,h^{-1}\,{\rm Mpc}\) are shown in the Supplementary Information). Specifically, the density field is smoothed with a Gaussian kernel, which in Fourier space is \(K(k)=\exp \left[-{(kS)}^{2}/2\right]\). Note that in this procedure the information on scales below the smoothing scale is erased, and there are studies on choosing the proper smoothing scale15. In principle, the smoothing scale can be made sufficiently small to restore more information; for example, no smoothing is needed at all in the nonlinear reconstruction methods14, and we will apply our pipeline to those reconstruction schemes for further investigation. After the smoothing, the displacement vector is solved for using the Zeldovich approximation, i.e., \(\tilde{\bf s}({\bf k})=-\frac{i{\bf k}}{k^{2}}\frac{\delta ({\bf k})}{b_{\rm in}+f_{\rm in}\mu^{2}}K(k)\), where δ denotes the nonlinear redshift-space overdensity, and \(b_{\rm in}\) and \(f_{\rm in}\) are the input linear bias and the logarithmic growth rate for the density field, respectively. Note that \(\{b_{\rm in}, f_{\rm in}\}\) does not have to be identical to the true underlying \(\{b, f\}\) of the density field, thus they are not free parameters to be determined. The post-reconstructed power spectrum for a given \(\{b_{\rm in}, f_{\rm in}\}\) can be modelled using either perturbation theory42 or an emulation approach, as developed in ref. 33. It is true that an inappropriate choice of \(\{b_{\rm in}, f_{\rm in}\}\), e.g., a set that is significantly different from the truth, may affect the efficiency of the BAO reconstruction, but the impact of the choice of \(\{b_{\rm in}, f_{\rm in}\}\) can be well modelled and corrected for, so this process is not expected to generate bias or additional uncertainties.

To demonstrate that the result would not get biased by an inappropriate set of {bin, fin}, ref. 33 uses a significantly wrong set of {bin, fin} for the reconstruction, namely, {bin = 0.9b, fin = 0.7f }, where {b, f } are the true b and f of the density field. This level of deviation from the true value is greater than 3σ level, given the uncertainty of b and f constrained by the BOSS (DR12) survey43. The impact of using such a wrong set of {bin, fin} is corrected for by the properly trained emulator, and as demonstrated in Fig. 6 of ref. 33, using this set of {bin, fin} does not bias, or dilute the final parameter constraint. In summary, it is expected that the choice of {bin, fin} used in this work does not bias the result and a more in-depth assessment on the potential influence of the choice on {bin, fin} is left for a future study on a joint Pall analysis using the actual observational data.

An inverse Fourier transformation on \(\tilde{{{{{{{{\bf{s}}}}}}}}}\) returns the configuration-space displacement field s(x), which is used to move both the galaxies and randoms. We also perform the anisotropic Lagrangian reconstruction15 on each realisation of the N-body mocks, but only with a smoothing scale S = 10 h−1Mpc.
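The smoothing and displacement steps described above can be sketched on a periodic grid as follows. This is a minimal illustration, not the reconstruction code used in this work: the function name, the plane-parallel line of sight along one grid axis, and the toy plane-wave input are assumptions. The expression for \(\tilde{\bf s}({\bf k})\) is applied literally under NumPy's FFT sign convention; the overall sign of the displacement depends on the Fourier convention adopted, so it should be checked against the convention of any real pipeline.

```python
import numpy as np

def zeldovich_displacement(delta_x, box_size, smooth, b_in, f_in, los_axis=2):
    """Displacement field s(x) from s(k) = -i k/k^2 * delta(k) K(k) / (b_in + f_in mu^2).

    delta_x  : (n, n, n) overdensity on a periodic grid
    box_size : box side length; smooth: Gaussian smoothing scale S (same units)
    los_axis : grid axis taken as the line of sight (plane-parallel assumption)
    """
    n = delta_x.shape[0]
    k1d = 2.0 * np.pi * np.fft.fftfreq(n, d=box_size / n)
    kvec = np.stack(np.meshgrid(k1d, k1d, k1d, indexing="ij"))  # shape (3, n, n, n)
    k2 = np.sum(kvec**2, axis=0)
    k2_safe = np.where(k2 > 0.0, k2, 1.0)        # avoid dividing by zero at k = 0
    mu2 = kvec[los_axis] ** 2 / k2_safe
    kernel = np.exp(-0.5 * k2 * smooth**2)       # K(k) = exp[-(kS)^2 / 2]
    amp = np.fft.fftn(delta_x) * kernel / (b_in + f_in * mu2) / k2_safe
    amp[0, 0, 0] = 0.0                           # no displacement from the mean mode
    s_k = -1j * kvec * amp
    return np.fft.ifftn(s_k, axes=(1, 2, 3)).real

# Toy input: a single plane wave along x, perpendicular to the line of sight.
n, box_size, smooth, b_in, f_in = 16, 100.0, 10.0, 2.0, 0.8
x = np.arange(n) * box_size / n
k0 = 2.0 * np.pi * 2 / box_size
wave_amp = 0.1
delta_x = wave_amp * np.cos(k0 * x)[:, None, None] * np.ones((n, n, n))

s_x = zeldovich_displacement(delta_x, box_size, smooth, b_in, f_in)
expected = wave_amp * np.exp(-0.5 * (k0 * smooth) ** 2) / (b_in * k0) * np.sin(k0 * x)
```

For a wave perpendicular to the line of sight μ = 0, so only \(b_{\rm in}\) enters; for a wave along the line of sight the amplitude is reduced by the factor \(b_{\rm in}+f_{\rm in}\) instead.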

Note that the information content in the reconstructed power spectrum is the same no matter whether the RSD is kept or not during the reconstruction process, and we have numerically confirmed this by performing the analysis with the isotropic reconstruction15, in which the RSD is removed using the fiducial f and b used for producing the mocks.

Also note that the BAO reconstruction procedure is not always required for extracting geometric information in the galaxy clustering. For example, when using the information in the linear point44,45,46,47, no reconstruction is required. Also, the estimated α from the traditional BAO methods and from the linear point approach may conceptually differ and a comparison is beyond the scope of this work.

Measurement of the power spectrum multipoles

The multipoles (up to \(\ell = 4\)) of both the pre- and post-reconstructed density fields are measured using an FFT-based estimator48 implemented in nbodykit49. The shot-noise, which reflects the discreteness of the density field, is removed as a constant for the monopole of the auto-power. The k-binning is \(\Delta k = 0.01\,h\,{\rm Mpc}^{-1}\) for both the Molino and N-body mocks.

Care needs to be taken when measuring the cross-power spectrum between the pre- and post-reconstructed density fields, since the raw measurement using the FFT-based estimator is contaminated by a scale-dependent shot-noise: on large scales, the post-reconstructed field resembles the unreconstructed one, making the cross-power spectrum essentially an auto-power, and thus it is subject to a shot-noise component. On small scales, however, the shot-noise largely drops because the two fields effectively decorrelate.

To obtain a measured cross-power spectrum whose mean value reflects the true power spectrum in the data, such that no subtraction of the noise component is required, we adopt the half-sum and half-difference (HS-HD) approach50. We start by randomly dividing the catalogue into two halves, dubbed \(\delta_1\) and \(\delta_2\), whose corresponding reconstructed density fields are \({\rm R}_1\) and \({\rm R}_2\), respectively. We then define the half-sum and half-difference fields as


$${{{{{{{\rm{HS}}}}}}}}\equiv \frac{{\delta }_{1}+{\delta }_{2}}{2},\,\,{{{{{{{\rm{HD}}}}}}}}\equiv \frac{{\delta }_{1}-{\delta }_{2}}{2},$$


$${{{{{{{{\rm{HS}}}}}}}}}^{{{{{{{{\rm{R}}}}}}}}}\equiv \frac{{{{{{{{{\rm{R}}}}}}}}}_{1}+{{{{{{{{\rm{R}}}}}}}}}_{2}}{2},\,\,{{{{{{{{\rm{HD}}}}}}}}}^{{{{{{{{\rm{R}}}}}}}}}\equiv \frac{{{{{{{{{\rm{R}}}}}}}}}_{1}-{{{{{{{{\rm{R}}}}}}}}}_{2}}{2}.$$

Then HS(R) contains both the signal and noise, but HD(R) only contains the noise. Hence the cross-power spectrum estimator is,

$${\hat{P}}_{{{{{{{{\rm{cross}}}}}}}}}=\langle {{{{{{{\rm{HS}}}}}}}},{{{{{{{{\rm{HS}}}}}}}}}^{{{{{{{{\rm{R}}}}}}}}}\rangle -\langle {{{{{{{\rm{HD}}}}}}}},{{{{{{{{\rm{HD}}}}}}}}}^{{{{{{{{\rm{R}}}}}}}}}\rangle =\frac{\langle {\delta }_{1},{{{{{{{{\rm{R}}}}}}}}}_{2}\rangle +\langle {\delta }_{2},{{{{{{{{\rm{R}}}}}}}}}_{1}\rangle }{2}.$$
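The second equality in the estimator above is purely algebraic, so it can be checked numerically on toy fields. The following is a minimal sketch under stated assumptions: the fields are one-dimensional Gaussian random vectors, the function names (`cross_power`, `hs_hd_cross`) are hypothetical, and the "reconstruction" is a simple two-point smoothing used only as a stand-in for the actual reconstruction algorithm.

```python
import numpy as np

def cross_power(a, b):
    """Per-mode cross-power Re[a(k) b*(k)] of two real 1D fields (toy normalisation)."""
    ak, bk = np.fft.rfft(a), np.fft.rfft(b)
    return (ak * np.conj(bk)).real / a.size

def hs_hd_cross(d1, d2, r1, r2):
    """<HS, HS^R> - <HD, HD^R> built from the two half-catalogue fields."""
    hs, hd = 0.5 * (d1 + d2), 0.5 * (d1 - d2)
    hs_r, hd_r = 0.5 * (r1 + r2), 0.5 * (r1 - r2)
    return cross_power(hs, hs_r) - cross_power(hd, hd_r)

rng = np.random.default_rng(0)
n = 256
signal = rng.normal(size=n)          # common underlying field (toy)
d1 = signal + rng.normal(size=n)     # half 1: signal plus independent noise
d2 = signal + rng.normal(size=n)     # half 2: signal plus independent noise

# Hypothetical stand-in for reconstruction: a two-point smoothing of each half.
def smooth(d):
    return 0.5 * (d + np.roll(d, 1))

r1, r2 = smooth(d1), smooth(d2)

lhs = hs_hd_cross(d1, d2, r1, r2)
rhs = 0.5 * (cross_power(d1, r2) + cross_power(d2, r1))
identity_holds = np.allclose(lhs, rhs)
```

Only products between different halves survive in the estimator, which is why noise that is independent between the two halves does not contribute to its mean.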

The scatter of \({\hat{P}}_{{{{{{{{\rm{cross}}}}}}}}}\) around the mean value allows for an estimation of the covariance matrix, which is a 4-point function51, shown in Fig. 2. By comparing \({\hat{P}}_{{{{{{{{\rm{cross}}}}}}}}}\) with that measured without splitting the samples, we can obtain the noise power spectrum, as shown in Supplementary Figs. 14, 15 (for cases with S = 10 and 20 h−1Mpc, respectively), which is evidently scale-dependent. The noise is anisotropic, and thus it affects even the multipoles with ℓ ≠ 0.

Since a change in HOD parameters can result in a change in the number density of the galaxy sample and thus affect the shot-noise, the shot-noise can in principle be used to constrain the HOD parameters. We, therefore, perform an additional Fisher projection with the shot-noise kept in the spectra, and find that the constraints on HOD parameters can be improved in general, but the constraint on cosmological parameters is largely unchanged (see the \({P}_{{{{{{{{\rm{all}}}}}}}}}^{{{{{{{{\rm{SN}}}}}}}}}\) column in Table 1).

Measurement of the bispectrum monopole

We measure the galaxy bispectrum monopole, B0, for all of the mock catalogues using the publicly available pySpectrum package17,34. Galaxy positions are first interpolated onto a grid using a fourth-order interpolation scheme and then Fourier transformed to obtain δ(k). Afterwards B0 is estimated using

$${B}_{0}({k}_{1},{k}_{2},{k}_{3})=\frac{1}{{V}_{B}}\int_{{k}_{1}}{{\rm{d}}}^{3}{q}_{1}\int_{{k}_{2}}{{\rm{d}}}^{3}{q}_{2}\int_{{k}_{3}}{{\rm{d}}}^{3}{q}_{3}\,{\delta }_{{\rm{D}}}({{\bf{q}}}_{123})\,\delta ({{\bf{q}}}_{1})\,\delta ({{\bf{q}}}_{2})\,\delta ({{\bf{q}}}_{3})-{B}_{0}^{{\rm{SN}}},$$

where δD is the Dirac delta function, VB is the normalisation factor proportional to the number of triplets that can be found in the k1, k2, k3 triangle bin and \({B}_{0}^{{{{{{{{\rm{SN}}}}}}}}}\) is the Poisson shot noise correction term. Triangle configurations are defined by k1, k2, k3 and for the Molino mocks, the width of the bins is Δk = 3kf, where kf = 2π/(1000 h−1Mpc) and for the N-body mocks, Δk = 0.02 hMpc−1.

An AP test performed on the Molino mocks

Although the AP effect plays no role for the Molino mock since it is produced at z = 024, we perform a test by isotropically stretching the scales and angles using pairs of AP parameters calculated at a non-zero redshift. This gives us an idea about whether this artificial and exaggerated AP effect can change the main conclusion of this work that the cosmological information content in Pall is almost the same as or more than that in \({P}_{{{{{{{{\rm{pre}}}}}}}}}+{B}_{0}\). In practice, we use the (α∥ and α⊥) pairs computed at zeff = 0.5 and 1.0, respectively, to stretch the wave numbers along and across the line-of-sight direction and repeat the analysis. As shown in the Supplementary Table (see Supplementary Note 3), this added ‘artificial’ AP effect can generally tighten the constraints, but the relative constraints from Pall and \({P}_{{{{{{{{\rm{pre}}}}}}}}}+{B}_{0}\) are largely unchanged, meaning that the main conclusion of this paper remains the same if the AP effect is taken into account.

An AP test on the N-body mocks

We perform an additional Fisher matrix analysis19 on the AP parameters using 4000 realisations of N-body particle mocks produced at z = 1.02 in redshift space. Part of the observables (the power spectrum monopole) are shown in Supplementary Fig. 16. From panel a we see that the amplitude of Pcross decreases dramatically towards small scales, indicating a decorrelation between \({P}_{{{{{{{{\rm{pre}}}}}}}}}\) and Ppost below quasi-nonlinear scales, which is confirmed by the correlation coefficient (the normalised covariance) plotted in panel b. This decorrelation, which is not caused by the shot noise given the negligible noise level in the mocks, is clear evidence of the complementarity among the power spectra.

The cumulative signal-to-noise ratio (SNR) is shown in panel a of Supplementary Fig. 17, in which we see that Pall is more informative than \({P}_{{{{{{{{\rm{pre}}}}}}}}}\) and that P + B0 has slightly higher SNR on small scales.

We first perform an AP test on the isotropic dilation parameter αiso, which is defined as the ratio of the true spherically-averaged scale of the standard ruler to the fiducial one. This dilation parameter depends on cosmological parameters, and can be constrained using the monopole of the power spectrum and bispectrum. The wavenumber k gets dilated by αiso due to the AP effect, thus the observables are,

$$\begin{array}{r}{P}_{0}^{T}({k}^{{\prime} })={A}_{0}\frac{1}{{\alpha }_{{{{{{{{\rm{iso}}}}}}}}}^{3}}{P}_{0}^{T}\left(k/{\alpha }_{{{{{{{{\rm{iso}}}}}}}}}\right)\,\end{array}$$
$$\begin{array}{r}{B}_{0}({k}_{1}^{{\prime} },{k}_{2}^{{\prime} },{k}_{3}^{{\prime} })={A}_{B}\frac{1}{{\alpha }_{{{{{{{{\rm{iso}}}}}}}}}^{6}}{B}_{0}({k}_{1}/{\alpha }_{{{{{{{{\rm{iso}}}}}}}}},{k}_{2}/{\alpha }_{{{{{{{{\rm{iso}}}}}}}}},{k}_{3}/{\alpha }_{{{{{{{{\rm{iso}}}}}}}}}),\end{array}$$

where T denotes the type of P0, namely, \(T=\{{{{{{{{\rm{pre}}}}}}}},{{{{{{{\rm{post}}}}}}}},{{{{{{{\rm{cross}}}}}}}}\}\), and the parameters A0 and AB are used to parameterise the overall amplitudes of power spectrum monopole and bispectrum monopole, respectively. Since the purpose of this test is to study the impact of AP parameters, the relevant parameters are {αiso, \(\ln {A}_{0}\), \(\ln {A}_{B}\)}, and these are free parameters in this calculation. Other parameters are held fixed to avoid confusion. The derivative with respect to αiso is evaluated semi-analytically as

$$\begin{array}{r}\frac{\partial {P}_{0}^{T}}{\partial {\alpha }_{{\rm{iso}}}}=-3{P}_{0}^{T}-\frac{d{P}_{0}^{T}}{d\ln k},\end{array}$$
$$\begin{array}{r}\frac{\partial {B}_{0}}{\partial {\alpha }_{{{{{{{{\rm{iso}}}}}}}}}}=-6{B}_{0}-\left(\frac{\partial {B}_{0}}{\partial \ln {k}_{1}}+\frac{\partial {B}_{0}}{\partial \ln {k}_{2}}+\frac{\partial {B}_{0}}{\partial \ln {k}_{3}}\right).\end{array}$$
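The power spectrum derivative above follows from differentiating the dilated monopole with respect to αiso and evaluating at αiso = 1. As an illustrative check, the sketch below compares the semi-analytic expression with a direct finite difference in αiso. The spectrum `p_toy` is a hypothetical smooth curve with a weak oscillation, not a measured monopole, and the step sizes are arbitrary choices for this toy.

```python
import numpy as np

def p_toy(k):
    """Hypothetical monopole: smooth broadband shape times a weak oscillation."""
    return 1e4 * k / (1.0 + (k / 0.02) ** 2.5) * (1.0 + 0.05 * np.sin(105.0 * k))

def p_dilated(k, alpha, amp=1.0):
    """Dilated monopole, A / alpha^3 * P(k / alpha)."""
    return amp / alpha**3 * p_toy(k / alpha)

def dp_dalpha_semi_analytic(k, eps=1e-5):
    """-3 P - dP/dlnk, with dP/dlnk from a central difference in ln k."""
    dp_dlnk = (p_toy(k * np.exp(eps)) - p_toy(k * np.exp(-eps))) / (2.0 * eps)
    return -3.0 * p_toy(k) - dp_dlnk

k = np.arange(0.005, 0.4, 0.01)   # bin centres for a 0.01 h/Mpc binning (toy)
h = 1e-5
numeric = (p_dilated(k, 1.0 + h) - p_dilated(k, 1.0 - h)) / (2.0 * h)
semi = dp_dalpha_semi_analytic(k)
agree = np.allclose(numeric, semi, rtol=1e-5, atol=1e-5 * np.abs(semi).max())
```

The bispectrum expression has the same structure, with the prefactor exponent 6 in place of 3 and one logarithmic derivative per side of the triangle.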

Then the constraint on αiso is derived after marginalising over the amplitudes A0 and AB, and it is shown in panel b of Supplementary Fig. 17. The FoM of αiso shows step-like features due to the BAO feature, as previously discovered25, and Pall offers the greatest FoM until it is overtaken by P + B0 at \({k}_{\max } \, \gtrsim \, 0.37\,h{{{{{{{{\rm{Mpc}}}}}}}}}^{-1}\).

We use the first three even multipole moments to assemble the two-dimensional power spectrum, i.e.,

$$\begin{array}{r}P(k,\mu )\simeq \mathop{\sum}\limits_{\ell =0,2,4}{P}_{\ell }(k){{{{{{{{\mathcal{L}}}}}}}}}_{\ell }(\mu ),\end{array}$$

The bispectrum is similarly assembled using the first three even multipoles with m = 052, which are the most informative ones53, i.e.,

$$\begin{array}{r}B\left({k}_{1},{k}_{2},{k}_{3},{\mu }_{1},{\mu }_{2}\right)\simeq \mathop{\sum}\limits_{\ell =0,2,4}{B}_{\ell ,m = 0}\left({k}_{1},{k}_{2},{k}_{3}\right){Y}_{\ell ,m = 0}(\theta ,\phi ).\end{array}$$

The wave-number ki and the cosine of the angle to the line-of-sight μi are stretched by two dilation parameters α⊥ and α∥ due to the AP effect54,55,

$$\begin{array}{r}{q}_{i}=\frac{{k}_{i}}{{\alpha }_{\perp }}{\left[1+{\mu }_{i}^{2}\left(\frac{1}{{F}^{2}}-1\right)\right]}^{1/2},\,\,{\nu }_{i}=\frac{{\mu }_{i}}{F}{\left[1+{\mu }_{i}^{2}\left(\frac{1}{{F}^{2}}-1\right)\right]}^{-1/2},\,\,F={\alpha }_{\parallel }/{\alpha }_{\perp }.\end{array}$$
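The coordinate mapping above can be written as a short function. This is a minimal sketch; the name `ap_map` and its argument names are hypothetical and are not taken from any package used in this work.

```python
import numpy as np

def ap_map(k, mu, alpha_perp, alpha_par):
    """Map (k, mu) to (q, nu) for dilation parameters alpha_perp, alpha_par."""
    big_f = alpha_par / alpha_perp
    factor = np.sqrt(1.0 + mu**2 * (1.0 / big_f**2 - 1.0))
    q = k / alpha_perp * factor
    nu = mu / big_f / factor
    return q, nu

# Limiting cases that follow directly from the formulae:
q_id, nu_id = ap_map(0.1, 0.5, 1.0, 1.0)       # no dilation
q_iso, nu_iso = ap_map(0.1, 0.5, 1.02, 1.02)   # isotropic dilation
q_los, nu_los = ap_map(0.1, 1.0, 1.03, 0.97)   # mode along the line of sight
q_tr, nu_tr = ap_map(0.1, 0.0, 1.03, 0.97)     # mode across the line of sight
```

For an isotropic dilation the angle is unchanged and q = k/α, while modes along and across the line of sight are rescaled by 1/α∥ and 1/α⊥, respectively.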

The power spectrum multipoles (the index for the type is omitted for brevity) and bispectrum monopole including the AP effect are respectively given as,

$${P}_{\ell }(k)=\frac{(2\ell +1)}{2{\alpha }_{\perp }^{2}{\alpha }_{\parallel }}\int\nolimits_{-1}^{1}d\mu P\left(q,\nu \right){{{{{{{{\mathcal{L}}}}}}}}}_{\ell }(\mu ),$$
$${B}_{0}({k}_{1},{k}_{2},{k}_{3})=\frac{1}{4\pi {\alpha }_{\perp }^{4}{\alpha }_{\parallel }^{2}}\int\nolimits_{-1}^{1}d{\mu }_{1}\int\nolimits_{0}^{2\pi }d\phi B\left({q}_{1},{q}_{2},{q}_{3},{\nu }_{1},{\nu }_{2}\right).$$
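The multipole integral with the AP effect can be evaluated by Gauss-Legendre quadrature in μ. The sketch below is illustrative only: in place of the two-dimensional spectrum assembled from the measured multipoles, it uses a hypothetical Kaiser-like model (`p_kaiser`, with assumed values b = 2 and f = 0.8) on a toy linear spectrum, so that the result without dilation can be compared with the known analytic multipoles. All function names are assumptions introduced for this example.

```python
import numpy as np
from numpy.polynomial.legendre import Legendre, leggauss

def p_lin_toy(k):
    """Hypothetical smooth linear spectrum (arbitrary normalisation)."""
    return 1e4 * k / (1.0 + (k / 0.02) ** 2.5)

def p_kaiser(k, mu, b=2.0, f=0.8):
    """Stand-in 2D spectrum: Kaiser-like angular dependence on the toy spectrum."""
    return (b + f * mu**2) ** 2 * p_lin_toy(k)

def ap_map(k, mu, alpha_perp, alpha_par):
    """Map (k, mu) to (q, nu) for the given dilation parameters."""
    big_f = alpha_par / alpha_perp
    factor = np.sqrt(1.0 + mu**2 * (1.0 / big_f**2 - 1.0))
    return k / alpha_perp * factor, mu / big_f / factor

def multipole_with_ap(p2d, k, ell, alpha_perp, alpha_par, n_mu=64):
    """P_ell(k) = (2 ell + 1) / (2 a_perp^2 a_par) * int dmu P(q, nu) L_ell(mu)."""
    mu, weights = leggauss(n_mu)                       # nodes and weights on [-1, 1]
    q, nu = ap_map(k[:, None], mu[None, :], alpha_perp, alpha_par)
    integrand = p2d(q, nu) * Legendre.basis(ell)(mu)[None, :]
    prefactor = (2 * ell + 1) / (2.0 * alpha_perp**2 * alpha_par)
    return prefactor * (integrand @ weights)

k = np.arange(0.005, 0.4, 0.01)
p0 = multipole_with_ap(p_kaiser, k, 0, 1.0, 1.0)
p2 = multipole_with_ap(p_kaiser, k, 2, 1.0, 1.0)
p4 = multipole_with_ap(p_kaiser, k, 4, 1.0, 1.0)
p0_ap = multipole_with_ap(p_kaiser, k, 0, 1.02, 0.98)  # with an AP distortion
```

Evaluating the same function at α⊥ ± Δα and α∥ ± Δα gives the inputs needed for numerical derivatives with respect to the dilation parameters.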

The free parameters are \(\{{\alpha }_{\perp },{\alpha }_{\parallel },\ln {A}_{\ell },\ln {A}_{B}\}\), where Aℓ (ℓ = 0, 2, 4) denotes the overall amplitudes of the power spectrum multipoles, and AB is the amplitude of the bispectrum monopole. The derivatives with respect to the parameters α⊥ and α∥ are evaluated numerically by

$$\begin{array}{r}\frac{\partial {{{{{{{\boldsymbol{O}}}}}}}}}{\partial {\alpha }_{i}}=\frac{{{{{{{{\boldsymbol{O}}}}}}}}({\alpha }_{i}^{+})-{{{{{{{\boldsymbol{O}}}}}}}}({\alpha }_{i}^{-})}{2{{\Delta }}{\alpha }_{i}},\end{array}$$

where O ∈ {Pℓ, B0} denotes the observables, and the step size Δαi = 0.01. Then the constraints on α⊥ and α∥ are derived after marginalising over the amplitudes Aℓ and AB.
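The two steps described here, central finite differences followed by marginalisation over the amplitudes, can be illustrated with a small Fisher calculation. This is a sketch under stated assumptions rather than the pipeline used in this work: the observable is a single toy spectrum with parameters (α, ln A), the covariance is assumed diagonal with 5% errors, and all function names are hypothetical. Marginalisation is done by inverting the full Fisher matrix.

```python
import numpy as np

def toy_spectrum(k):
    """Hypothetical smooth spectrum with a weak oscillation; not a measurement."""
    return 1e4 * k / (1.0 + (k / 0.02) ** 2.5) * (1.0 + 0.05 * np.sin(105.0 * k))

def model(theta, k):
    """Toy observable with parameters theta = (alpha, ln A)."""
    alpha, ln_amp = theta
    return np.exp(ln_amp) / alpha**3 * toy_spectrum(k / alpha)

def central_derivative(theta, k, i, step):
    """[O(theta_i + step) - O(theta_i - step)] / (2 step)."""
    up, down = np.array(theta, float), np.array(theta, float)
    up[i] += step
    down[i] -= step
    return (model(up, k) - model(down, k)) / (2.0 * step)

k = np.arange(0.005, 0.4, 0.01)
theta_fid = [1.0, 0.0]
steps = [0.01, 0.01]                               # step size of 0.01, as in the text
cov = np.diag((0.05 * model(theta_fid, k)) ** 2)   # assumed 5% uncorrelated errors

derivs = np.array([central_derivative(theta_fid, k, i, s)
                   for i, s in enumerate(steps)])
fisher = derivs @ np.linalg.inv(cov) @ derivs.T

sigma_marg = np.sqrt(np.diag(np.linalg.inv(fisher)))  # other parameters marginalised
sigma_cond = 1.0 / np.sqrt(np.diag(fisher))           # other parameters held fixed
```

The marginalised error on α is never smaller than the conditional one, which reflects the degeneracy between the dilation and the free amplitude.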

The FoM for α⊥, α∥ is shown in panel c of Supplementary Fig. 17, and it shows a similar trend to FoM(αiso). The contour plot for α⊥, α∥ with \({k}_{\max }=0.4\,h{{{{{{{{\rm{Mpc}}}}}}}}}^{-1}\) is shown in Supplementary Fig. 18, further highlighting the strong constraining power of Pall in comparison to that of P + B0.

A joint BAO and RSD analysis on the N-body mocks

In addition to α⊥, α∥, we add one more parameter to the analysis, which is Δv, the parameter describing the change of velocities along the line of sight. This parameter mimics the change of the linear growth rate on large scales, but it also changes the velocity of particles coherently on small scales. We compute the derivatives with respect to Δv numerically.

The projection onto the parameters, shown in Supplementary Fig. 19, demonstrates the advantage of performing a joint analysis using \({P}_{{{{{{{{\rm{pre}}}}}}}}}\), Ppost and Pcross. On large scales, \({P}_{{{{{{{{\rm{pre}}}}}}}}}\) and Ppost are both determined by the linear density field, making the power spectra highly correlated. As shown in panels c1 and c5, the contours derived from \({P}_{{{{{{{{\rm{pre}}}}}}}}}\) and Ppost have similar orientations and we do not gain by combining them. For k > 0.15 hMpc−1, the correlation between \({P}_{{{{{{{{\rm{pre}}}}}}}}}\) and Ppost decreases as the pre-reconstructed density field is dominated by the non-linear field while Ppost still retains the correlation with the linear density field. The contours shown as lines in panel c7, which are derived from power spectra in the k range of [0.2, 0.25] hMpc−1, are almost orthogonal to each other, so that the constraint from the combined spectra, illustrated by the shaded region, is significantly tightened. On smaller scales, the post-reconstructed density field is also dominated by the non-linear field, the orientations of the contours are again aligned, and the complementarity weakens. This shows that the level of nonlinearity in the power spectrum determines the degeneracies between parameters. Since \({P}_{{{{{{{{\rm{pre}}}}}}}}}\) and Ppost are affected by different levels of nonlinearities on a given scale, which gives rise to different degeneracies, a joint analysis using both \({P}_{{{{{{{{\rm{pre}}}}}}}}}\) and Ppost (and Pcross) can yield a better constraint by breaking the degeneracies.
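The effect of contour orientation on a combined constraint can be illustrated with two toy Fisher matrices for a pair of parameters. The sketch below rests on a simplifying assumption that does not hold in general for the spectra discussed here: the two probes are treated as statistically independent, so that their Fisher matrices simply add, whereas \({P}_{{{{{{{{\rm{pre}}}}}}}}}\) and Ppost are partially correlated. The ellipse widths and angles are arbitrary illustrative numbers and the function names are hypothetical.

```python
import numpy as np

def fisher_from_ellipse(sigma_long, sigma_short, angle_deg):
    """2x2 Fisher matrix of an error ellipse with given axes and orientation."""
    t = np.deg2rad(angle_deg)
    rot = np.array([[np.cos(t), -np.sin(t)],
                    [np.sin(t), np.cos(t)]])
    return rot @ np.diag([1.0 / sigma_long**2, 1.0 / sigma_short**2]) @ rot.T

def marginal_errors(fisher):
    """Marginalised 1-sigma errors from the inverse Fisher matrix."""
    return np.sqrt(np.diag(np.linalg.inv(fisher)))

# Two elongated ellipses of identical shape.
aligned_a = fisher_from_ellipse(0.10, 0.01, 45.0)
aligned_b = fisher_from_ellipse(0.10, 0.01, 45.0)    # same degeneracy direction
crossed_b = fisher_from_ellipse(0.10, 0.01, -45.0)   # orthogonal degeneracy direction

err_single = marginal_errors(aligned_a)
err_aligned = marginal_errors(aligned_a + aligned_b)  # independence assumed
err_crossed = marginal_errors(aligned_a + crossed_b)  # independence assumed
```

In this toy, combining two probes with the same orientation only reduces the errors by a factor of √2, whereas combining probes with orthogonal degeneracy directions shrinks the marginalised errors to roughly the width of the short axis.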