A time-resolved picture of our Milky Way’s early formation history

The formation of our Milky Way can be split up qualitatively into different phases that resulted in its structurally different stellar populations: the halo and the disk components1–3. Revealing a quantitative overall picture of our Galaxy’s assembly requires a large sample of stars with very precise ages. Here we report an analysis of such a sample using subgiant stars. We find that the stellar age–metallicity distribution p(τ, [Fe/H]) splits into two almost disjoint parts, separated at age τ ≃ 8 Gyr. The younger part reflects a late phase of dynamically quiescent Galactic disk formation with manifest evidence for stellar radial orbit migration4–6; the other part reflects the earlier phase, when the stellar halo7 and the old α-process-enhanced (thick) disk8,9 formed. Our results indicate that the formation of the Galaxy’s old (thick) disk started approximately 13 Gyr ago, only 0.8 Gyr after the Big Bang, and 2 Gyr earlier than the final assembly of the inner Galactic halo. Most of these stars formed around 11 Gyr ago, when the Gaia-Sausage-Enceladus satellite merged with our Galaxy10,11. Over the next 5–6 Gyr, the Galaxy experienced continuous chemical element enrichment, ultimately by a factor of 10, while the star-forming gas managed to stay well mixed.

The formation of our Milky Way can be split up qualitatively into different phases that resulted in its structurally different stellar populations: the halo and the disk components [1][2][3] . Revealing a quantitative overall picture of our Galaxy's assembly requires a large sample of stars with very precise ages. Here we report an analysis of such a sample using subgiant stars. We find that the stellar age-metallicity distribution p(τ, [Fe/H]) splits into two almost disjoint parts, separated at age τ ≃ 8 Gyr. The younger part reflects a late phase of dynamically quiescent Galactic disk formation with manifest evidence for stellar radial orbit migration [4][5][6] ; the other part reflects the earlier phase, when the stellar halo 7 and the old α-process-enhanced (thick) disk 8,9 formed. Our results indicate that the formation of the Galaxy's old (thick) disk started approximately 13 Gyr ago, only 0.8 Gyr after the Big Bang, and 2 Gyr earlier than the final assembly of the inner Galactic halo. Most of these stars formed around 11 Gyr ago, when the Gaia-Sausage-Enceladus satellite merged with our Galaxy 10,11 . Over the next 5-6 Gyr, the Galaxy experienced continuous chemical element enrichment, ultimately by a factor of 10, while the star-forming gas managed to stay well mixed.
To unravel the assembly history of our Galaxy we need to learn how many stars were born when, from what material and on what orbits. This requires precise age determinations for a large sample of stars that extend to the oldest possible ages (around 14 Gyr) 9,12 . Subgiant stars, which are stars sustained by hydrogen shell fusion, can be unique tracers for such purposes, as they exist in the brief stellar evolutionary phase that permits the most precise and direct age determination, because their luminosity is a direct measure of their age. Moreover, the chemical element compositions determined from the spectra of their photosphere surfaces accurately reflect their birth material composition billions of years ago. This makes subgiants the best practical tracers of Galactic archaeology, even compared to main-sequence turn-off stars, whose surface abundances may be altered by atomic diffusion effects 13 . However, because of the short lifetime of their evolutionary phase, subgiant stars are relatively rare, and large surveys are essential to build a large sample of these objects with good spectra, which have not been available in the past.
With the recent data release (eDR3) of the Gaia mission 14,15 and the recent data release (DR7) of the LAMOST spectroscopic survey 16,17 , we identify a set of approximately 250,000 subgiant stars based on their position in the effective temperatures (T eff )-absolute magnitude (M K ) diagram (Fig. 1a). The ages (τ) of these subgiant stars are estimated by fitting to the Yonsei-Yale (YY) stellar isochrones 18 with a Bayesian approach, which draws on the astrometric distances (parallaxes), apparent magnitudes (fluxes), spectroscopic chemical abundances ([Fe/H], [α/Fe] where α refers to α elements Mg, Si, Ca, Ti), T eff and M K . As summarized in Fig. 1b, the sample stars have a median relative age uncertainty of only 7.5% across the age range from 1.5 Gyr to the age of the Universe (13.8 Gyr; ref. 19 ). The lower age limit of our sample is inherent to our approach: younger and hence more luminous subgiants can be confused with a different stellar evolutionary phase, the horizontal branch phase for far older stars, which would cause serious sample contamination. This sample constitutes a 100-fold leap in sample size for stars with comparably precise and consistent age estimates 20,21 . In addition, it is a large sample that covers a large spatial volume across the Milky Way (Fig. 1c) and most of the pertinent range in age and in metallicity (1.5 Gyr < τ < 13.8 Gyr, and −2.5 < [Fe/H] < 0.4). The sample also has a straightforward spatial selection function that allows us to estimate the space density of the tracers. These ingredients enable an alternative view of the Milky Way's assembly history, especially the early formation history.

Our Galaxy's stellar age-metallicity distribution
The photospheric metallicity of any subgiant star of age τ reflects the element composition of the gas from which it formed at the epoch τ Gyr ago. The overall distribution of these stellar metallicities at different epochs, p(τ, [ which is illustrated as a yellow shaded area in Fig. 2b  interval (Extended Data Fig. 1 This feature is a second aspect of the distribution that has not, to our knowledge, been seen before 21 .

Formation and enrichment of the Milky Way's old disk
Tentative hints for some of these features in p(τ | [Fe/H]) have been seen in earlier work 24,25 (see the discussion in the Supplementary Information) but these studies lacked the sample size or precision for definitive inferences about the Galactic formation history. Figure 2 shows clearly that the old, high-α 'thick' disk of our Milky Way started to form approximately 13 Gyr ago, which is only 0.8 Gyr after the Big Bang 19 , and extended over 5-6 Gyr, and the interstellar stellar medium (ISM) forming the stars was continually enriched by more than 1 4,26 . The results also show that the formation of the Milky Way's old, α-enhanced disk overlapped in time with the formation of the halo stars: the earliest disk stars are 1-2 Gyr older than the major halo populations at [Fe/H] ≃ −1 (see the Z-shaped structure).

Article
In Fig. 3 we examine the p τ ( |[Fe/H]) early distribution more closely by separating stars with at least modest angular momentum, J ϕ > 500 kpc km s -1 , from those stars on nearly radial or even retrograde orbits, J ϕ < 500 kpc km s -1 . This further sample differentiation by angular momentum leads again to two nearly disjoint p(τ | [Fe/H]) distributions. The first (Fig. 3, upper panel) Note that Fig. 3, lower panel shows a distinct set of stars with J ϕ < 500 kpc km s -1 , for which the p(τ | [Fe/H]) locus indicates that they are the oldest and most metal-poor part of the old disk sequence (see also Extended Data Fig. 2). These stars indicate that some of the oldest members of the old disk sequence were present during an early merger event, by which they were 'splashed' to low-angular-momentum orbits 27,28 . This ancient merger event is presumably the merger with the Gaia-Enceladus satellite galaxy 11 (also known as Gaia Sausage 10 ; hereafter Gaia-Sausage-Enceladus), which has contributed most of the Milky Way's halo stars 7,29 . The fact that the splashed old disk stars with very little angular momentum are exclusively seen at τ ≳ 11 Gyr constitutes strong evidence that the major merger process between the old disk and the Gaia-Sausage-Enceladus satellite galaxy was largely completed 11 Gyr ago. This epoch is 1 Gyr earlier than previous estimates that were based on the lower age limit of the halo stars, 10 Gyr (refs. 11,21,30 ).  Figure 3 reveals a remarkable feature, namely that the star-formation rate of the old disk reached a prominent maximum at around 11.2 Gyr ago, apparently just when the merger with the Gaia-Sausage-Enceladus satellite galaxy was completed, and then continuously declined with time. The most obvious interpretation of this coincidence is that the perturbation from the Gaia-Sausage-Enceladus satellite galaxy greatly enhanced the star formation of the old disk. Note that this star-formation peak among the old disk stars ~11 Gyr ago is very consistent with earlier indications of such a peak based on abundances only 31 .
To put our results into the bigger picture of galaxy formation and evolution, the multiple assembly phases are seen to be universal among present-day star-forming galaxies. Using the IllustriesTNG simulation, Wang et al. 32 showed that galaxy mergers and interactions have played a crucial role in inducing gas inflow, resulting in multiple star formation episodes, intermitted by quiescent phases. Observationally, the best testbed for this theoretical picture would be here at home within our Galaxy. Our study has demonstrated the power of such tests for galactic assembly and enrichment history in the full cosmic timeline, from the very early epoch (τ ≃ 13 Gyr or redshift z > 10) to the current time.

Online content
Any methods, additional references, Nature Research reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41586-022-04496-5.

Fig. 3 | Probability of stellar distribution in the J ϕ versus [Fe/H] plane, p(τ, [Fe/H]), for stars formed in the early phase.
The stars formed in the early phase are divided into J ϕ > 500 kpc km s -1 (upper) and J ϕ < 500 kpc km s -1 (lower). The stellar distribution probability is normalized to the peak value so that the colour from blue to red represents a value from 0 to unity. Note that this is different from p(τ | [Fe/H]) in Fig. 2 Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

Stellar labels from spectroscopy
Building this sample of subgiant stars with precise ages, abundances and orbits requires a number of steps. The first step is to derive stellar atmospheric parameters from the LAMOST DR7 spectra, which we did using the data-driven Payne (DD-Payne) approach, verified in detail using analogous data from LAMOST DR5 (ref. 33 ). This leads to a catalogue of effective temperature T eff , surface gravity log g, microturbulent velocity v mic and elemental abundance for 16 elements (C, N, O, Na, Mg, Al, Si, Ca, Ti, Cr, Mn, Fe, Co, Ni, Cu, Ba) values for 7 million stars. We also derive an α-element to iron abundance ratio [α/Fe], which will serve in the age estimation to identify the right set of isochrones for each object. For a spectral signal-to-noise ratio (S/N) higher than 50, the typical measurement uncertainties are about 30 K in T eff and 0.05 dex in the abundances we use here: [Fe/H] and [α/Fe] (ref. 33 ).

Absolute magnitude and spectroscopic parallax
Determining accurate and precise absolute magnitudes is crucial for age determination of subgiant stars (Fig. 1a). The Gaia astrometry provides high-precision parallax for stars within approximately 2 kpc, whereas for more distant stars the Gaia parallaxes have uncertainties in excess of 10%. For these distant stars, spectroscopic estimates of absolute magnitude are needed to ensure precise age determination. We derive M K , the absolute magnitude in the Two Micron All Sky Survey (2MASS) K band, from the LAMOST spectra, using a data-driven method based on neural network modelling (see Supplementary Information for details). Extended Data Figure 3 illustrates that for LAMOST spectra with high signal-to-noise ratio (S/N > 80), our spectroscopic M K estimates are precise to better than 0.1 mag at [Fe/H] = 0 (and 0.15 mag at [Fe/H] = −1). Furthermore, a comparison between spectroscopic M K and geometric M K from Gaia parallaxes provides an efficient way of identifying unresolved binaries 33,34 (Extended Data Fig. 3). For the subsequent modelling, we combine these two approaches through a weighted mean algorithm Here M K geom refers to the geometric M K , i.e., M K derived using Gaia parallax, M K spec the spectroscopic M K estimates, and σ the uncertainty in the M K estimates. We are then in a position to select subgiant stars as lying between the two straight lines in the T eff -M K diagram. As isochrones depend on [Fe/H], this is done separately for each [Fe/H] bin, with the adopted slopes and intercepts for the boundary lines presented in Extended Data Table 1. As an example, the boundaries for stars with solar metallicity are shown in the Fig. 1a

Cleaning sample cuts
To have a subgiant star sample with high purity, we have applied cleaning criteria to discard stars with poor data quality or stars that are possible contaminations of the subgiant sample.
• We discard unresolved binaries that we identify through differences in their spectro-photometric parallax and their geometric parallax from Gaia, by requiring Here ϖ spec−photo is the spectro-photometric parallax deduced from the distance modulus using the spectroscopic M K and 2MASS apparent magnitudes 35 .
• We discard stars with spurious Gaia astrometry by requiring a Gaia re-normalized unit weight error (RUWE) larger than 1.2 or an astrometric fidelity less than 0.8 (ref. 36 ). • We discard stars that show significant flux variability according to the variation amplitude of the Gaia magnitudes between different epochs, where PHOT_G_N_OBS is the number of epochs, and PHOT_G_MEAN_ FLUX_OVER_ERROR is the mean flux over error ratio for Gaia G-band photometry. We calculate the ensemble median Δ ( ) G and dispersion σ(Δ G ) of Δ G as a function of G-band magnitude and define any one star as a variable if Most of the variables eliminated by this criterion are found to be pre-main-sequence stars.
• We discard stars that are less luminous than the subgiant branch of a 20 Gyr isochrone, which is the boundary of our isochrone grid. Such stars are mainly contaminations of either pre-main-sequence stars or main-sequence binary stars that survived elimination by the above criteria. • We discard all stars with M K brighter than 0.5 mag to avoid contamination from He-burning horizontal branch stars. This comes at a price: we eliminate essentially all stars younger than about 1.5 Gyr. • We require all stars in our sample to have LAMOST spectral S/N > 20 and to have good DD-Payne fits, by requiring 'qflag_χ 2 = good' 33 .
We further restrict our stars to have T eff < 6,800 K, where DD-Payne abundances are most robust. After these cleaning cuts, the remaining sample contains 247,104 stars ( Fig. 1), all of which are presumed to be subgiants.

Age estimates by isochrones
The ages of the subgiant sample stars are determined by matching the Gaia astrometric parallax ϖ, the LAMOST spectroscopic stellar parameters T eff , M K , [Fe/H] and [α/Fe], and the Gaia and 2MASS photometry in the G, BP, RP, J, H and K bands with the YY stellar isochrones 18,37 using a Bayesian approach (see Supplementary Information for details) . Note that in our Bayesian model we have chosen not to impose a prior that all stars should be younger than the current knowledge of the age of the Universe from the cosmic microwave background measurements of Planck (13.8 Gyr) 19 . This is for two main reasons. First, the upper limit of the stellar age is an independent examination of the age of the Universe, whereas imposing age priors on the inference from the cosmological model might induce bias into the results. Second, imposing an upper age limit may increase the complexity of the statistics.
To convert the Gaia parallax to absolute magnitudes, we also need to know the extinction. Therefore, we have determined the reddening and extinction for individual stars using intrinsic colours empirically inferred from their stellar parameters (see Supplementary Information for details).
We have also tested the age estimation using other public isochrones, such as the MIST 38,39 , and find that, in the case of the solar α-mixture, the age estimates based on YY and MIST show good consistency except for the fact that the MST isochrones predict ages older by 0.5 Gyr (Extended Data Fig. 4). However, the α-element enhancement, which is not available in the current public MIST isochrones, has a large impact on the age estimation, and ignoring the α-element enhancement will lead to an overestimate of stellar age by up to 2 Gyr for old stars (Extended Data Fig. 4). Ages from the YY isochrones seem to be reasonable as the ages of the oldest stars are comparable to the age of the Universe (Fig. 2).

Orbital actions
Using the radial velocity from the LAMOST data, proper motions from Gaia and a combination of spectro-photometric distance and geometric distance (see Supplementary Information for details), we compute the orbital actions (J R , J ϕ , J Z ) and the angles of our sample stars using galpy 40 , assuming the MWPotential2014 potential model. We assume that the Sun is located at R ⊙ = 8.178 kpc (ref. 41 ) and Z ⊙ = 10 pc above the disk mid-plane 42 . We assume the local standard of rest LSR = 220 km s -1 , and the solar motion with respect to the LSR to be (U ⊙ , V ⊙ , W ⊙ ) = (−7.01 km s -1 , 10.13 km s -1 , 4.95 km s -1 ) (ref. 43 ).

Accounting for selection effects
To verify that our findings are not caused by artefacts due to selection effects, we adopt two approaches to address this issue. First, we apply our target selection to the Gaia mock catalogue of Rybizki et al. 44 and investigate the age-[Fe/H] relation (Extended Data Fig. 5). Second, we directly correct for the volume selection function of our sample to account for the fact that, for a given line of sight, older subgiant stars probe to a smaller distance than the younger stars as the former are fainter. The age distribution of the thick disk stars after applying the selection function correction is illustrated in Fig. 3. Eventually, we concluded that the selection function has a negligible impact on our conclusions (see Supplementary Information for more details).
In addition, we have compared the stellar age-[Fe/H] relation from our sample with literature results for both stars 25 and globular clusters 45-47 that have robust age estimates (Extended Data Fig. 6). The comparisons are qualitatively consistent, albeit the literature samples are too small to draw a clear picture of the assembly and enrichment history of our Galaxy (see Supplementary Information for a detailed discussion).

Data availability
The Gaia eDR3 data is public available at https://www.cosmos.esa.int/ web/gaia/earlydr3 The LAMOST DR7 spectra data set is public available at http://dr7.lamost.org. The subgiant star catalogue generated and analysed in this study is provided as Supplementary Table 1, and it can also be reached through a temporary path https://keeper.mpdl. mpg.de/d/019ec71212934847bfed/. The YY isochrones adopted for age determination in this work is public available at http://www.astro.yale. edu/demarque/yyiso.html.

Code availability
The stellar orbit computation tool galpy adopted in this work is public available at http://github.com/jobovy/galpy. The DD-Payne code adopted for determining stellar labels, the neural network code for determining M K from the LAMOST spectra and the Bayesian code for stellar age estimation are currently not publicly accessible online, as they are a part of ongoing survey data analysis efforts that will be applied to the upcoming LAMOST survey spectrum set. However, the codes can be shared on request.