The metal-poor atmosphere of a Neptune/Sub-Neptune planet progenitor

Young transiting exoplanets offer a unique opportunity to characterize the atmospheres of fresh and evolving products of planet formation. We present the transmission spectrum of V1298 Tau b, a 23 Myr old warm Jovian-sized planet orbiting a pre-main-sequence star. We detect a primordial atmosphere with an exceptionally large atmospheric scale height and water vapour absorption at the 5$\sigma$ level of significance. We estimate upper limits on the mass and density (24$\pm$5 $M_{\oplus}$ and 0.12 g/cm$^{3}$, respectively). V1298 Tau b is one of the lowest-density planets discovered to date. We retrieve a low atmospheric metallicity (logZ=$-0.1^{+0.66}_{-0.72}$ solar), consistent with solar/sub-solar values. Our findings challenge the mass-metallicity relation expected from core-accretion theory. Our observations can be explained by in-situ formation via pebble accretion together with ongoing evolutionary mechanisms. We do not detect methane, which hints at an interior hotter than expected from the formation entropy of this planet alone. Our observations suggest that V1298 Tau b is likely to evolve into a Neptune/sub-Neptune type of planet.

Exoplanet population studies reveal the crucial impact of planet formation and early evolutionary mechanisms [e.g. see 1, 2] on their demographic characteristics. However, evolutionary processes such as atmospheric mass loss driven by host star XUV flux [3], interior cooling [e.g. see 4], and contraction [e.g. see 5] can significantly alter their thermal structure and composition within the first 100 Myr, thereby obscuring the imprints of planet formation. In this context, young transiting exoplanets represent a unique opportunity to probe the atmospheres of freshly formed planets and to test formation and early evolution theories [6][7][8][9]. However, studying these young planets is challenging, as most of them do not have well constrained masses because of large uncertainties in radial velocity (RV) measurements from their highly variable host stars [10]. Young stars are also known to have large spot coverage and frequent flaring activity [11,12], which can contaminate the measured transmission spectrum through the transit light source effect [13]. Most of the known young transiting planets (e.g. see [14,15]) lie above the radius valley [e.g. see 16] and are theoretically predicted to be Neptune or sub-Neptune/super-Earth progenitors [5,17].
The raw HST images were reduced using a custom pipeline [27] (see Methods, Data reduction, for details). We extract a broadband integrated 'white' light curve in the HST/WFC3 G141 bandpass (1.12-1.65 µm) and use a divide-white common-mode approach to derive systematics-corrected spectroscopic light curves [28]. The extracted white and detrended spectroscopic light curves are shown in Extended Data Figures 4 and 5, respectively. The detrended spectroscopic light curves are fitted with a batman planetary transit model, linear limb darkening and a linear stellar baseline (for details, see Methods, Light curve analysis, and Table 1). We estimate the effect of unocculted star spots on the transmission spectrum using the techniques outlined in [13] (see Methods, Accounting for stellar activity).

Results
The transmission spectrum of V1298 Tau b (see Figure 1) shows a high-amplitude absorption feature around the 1.4 µm water band (∼400 ppm), which is the largest among the known Neptune/super-Neptune mass planets, such as HAT-P-26b (250 ppm) [29] and GJ 3470b (150 ppm) [30]. The water absorption amplitude is also large compared to well studied hot Jupiters, such as HD 209458b (∼200 ppm [31]). The amplitude of the water feature is indicative of a large atmospheric scale height, revealing an extended H-rich atmosphere. Assuming a cloud-free, H/He-rich and isothermal atmosphere, we constrain the scale height of this planet (∼1000 km), from which we estimate the mass to be 24 ± 5 M⊕ using a known method [32] (see Methods, Mass estimate). This mass estimate becomes an upper limit if the atmosphere is partly cloudy or hazy. Radial velocity measurements of this system report a Jovian/sub-Jovian mass (220 ± 70 M⊕) for V1298 Tau b [20,24]. However, our observation rules out a 100 M⊕ (∼2σ lower limit from [24]) transmission spectrum model (Figure 1) at ∼5σ confidence. We compare the derived mass and radius of planet b to the population of exoplanets (Figure 2). V1298 Tau b, with a density upper limit of 0.12 g/cm³, is comparable to the lowest density planets known ('super-puffs') [e.g. see 33]; however, V1298 Tau b has a clear atmosphere compared to most super-puffs [e.g. see 34]. The estimated mass upper limit of V1298 Tau b is consistent with a Neptune/sub-Neptune mass planet with a substantial H/He envelope (∼40%, Figure 2).
The observed transmission spectrum (without stellar activity correction) is modelled using a 1D radiative transfer code, petitRADTRANS [35]. We fix the planet mass to the estimated upper limit (24 M⊕). We model the atmosphere using an analytic temperature-pressure profile [36] and a grey cloud deck. The dominant carbon-bearing species at 670 K is expected to be methane [37]. However, we do not detect methane absorption around 1.6 µm (Figure 1). An absence of methane has been reported for other warm planets [e.g. see 30], which can potentially be explained by vertical mixing [37]. We simulate the effect of vertical mixing using a 'quench' pressure in our models (see Methods, Atmospheric Models). The observed and modelled transmission spectra and the retrieved atmospheric properties are shown in Figure 1. The retrievals converge to a low atmospheric metallicity (solar/sub-solar) compared to theoretical expectations from core accretion [38] and known constraints for sub-Neptunes/super-Earths [e.g. see 39,40]. The observed spectrum can be explained equally well by lower planet masses combined with even lower metallicities (0.1-0.01 solar). Cloudy models are statistically favoured over cloud-free models (see Extended Data Figure 8 and Table 2). From the derived mass upper limit (this work) and the radius measurement from [18], we derive an upper limit of 6×solar on the atmospheric metallicity of V1298 Tau b by applying the formalism from [41]. This upper limit should be interpreted cautiously, as the models presented in [41] do not account for a high interior flux from the planet. Since V1298 Tau b is young with a potentially hot interior, the true bulk metallicity could be higher. However, the posteriors from our retrievals (Extended Data Figure 9) rule out 6×solar values at ∼3σ, implying a relatively unmixed interior-atmosphere structure for this planet.
We present V1298 Tau b in the context of the exoplanet population (Figure 2; lower panel). V1298 Tau b has a mass consistent with a Neptune/sub-Neptune, or even potentially a super-Earth, and a metallicity comparable to or lower than that of Jupiter. A high-metallicity atmosphere (100×solar) with the estimated mass upper limit of this planet can be ruled out at ∼5σ confidence (see the orange dashed model in Figure 1). Therefore, in spite of being a likely Neptune/sub-Neptune, or even a super-Earth progenitor, V1298 Tau b possesses an atmosphere that is 100-1000 times metal depleted compared to Neptune and Uranus.
In addition, we studied the origin of the absence of the expected methane. We ran chemical kinetics models which incorporate a self-consistent T-P profile and vertical mixing (see Methods, Atmospheric Models). These models with different internal temperatures demonstrate that it is possible to remove methane through deep quenching, although it requires a high interior temperature (see Fig. 13). At the highest intrinsic temperature we have tested ($T_{\rm int}$ = 400 K), the quenched molar fraction of methane is still $10^{-4.7}$, which is close to the observability limit ($10^{-5.5}$) of HST for methane [37]. Retrievals using free chemistry (see Methods, Atmospheric Models, and Supplementary Information Figure 1) put an upper limit of $10^{-6}$ on the methane Volume Mixing Ratio (VMR).

Discussion
The differences between the mass estimate from the atmospheric scale height and those from dynamical studies [20,24] could potentially originate from the treatment of the impact of stellar activity on RV signals. The robustness of mass estimates from RVs of this system has been questioned recently [10]. There is also uncertainty on the orbital period of planet e [21] in this system, which could significantly impact the RV mass constraints. Efforts to characterize the mass from transit timing variation measurements are ongoing (Livingston et al., in prep). The low envelope metallicity and relatively large H/He content that we measure for V1298 Tau b are in agreement with early evolution models [42], yet this planet must have been on the verge of runaway gas accretion. We emphasise that the origin and early evolution of Neptunes/sub-Neptunes has been an open question: it is unknown why these planets accreted only a small fraction of H/He and did not become gas giants [7,8]. Such systems likely formed in situ, either early with an enhanced atmospheric opacity due to dust grains [7], or with significant disk-envelope interaction to replenish the proto-atmosphere with high entropy gas [46]. Late formation in a depleting transitional disk, such that the core does not have enough time to accrete a large H/He envelope [6], can also produce Neptune/sub-Neptune mass planets.
Fig. 2 Upper panel: V1298 Tau b (red star) in the mass-radius diagram; the mass upper limit was calculated from the observed transmission spectrum using the formalism presented in [32]. V1298 Tau b is shown in comparison with other known young planets and low-density 'super-puffs'. Grey dots are known mature planets (obtained from the NASA Exoplanet Archive). The blue dotted lines are theoretical models from [42], and show that the measured mass and radius of V1298 Tau b are consistent with a significant H/He envelope (∼40% mass fraction assuming a 10 M⊕ core). V1298 Tau b is amongst the lowest density (0.12 g/cm³) planets discovered. Lower panel: V1298 Tau b (red star) shown in a mass-metallicity diagram with a sample of exoplanets compiled from [43]. The atmospheric metallicity is derived from the retrieval analysis (see Results section). The dashed blue and brown lines show the metallicities of Neptune and Jupiter, respectively. We note that the solar system metallicity estimates are from methane abundance measurements [44], whereas for exoplanets the metallicity estimates are derived from oxygen abundance measurements. V1298 Tau b has a mass consistent with Neptune/sub-Neptunes or potentially even super-Earths, but its metallicity is comparable to that of giant planets like Jupiter.
Fig. 3 The solid lines show a high stellar activity track (activity timescale 250 Myr) and the dash-dot lines show a low stellar activity track (activity timescale 100 Myr; for details see [45]). These models show that for moderate photoevaporation efficiency and high stellar activity, this planet is likely to lose mass and end up as a Neptune/sub-Neptune or even potentially a super-Earth, depending on its mass.
The standard core-accretion picture of planet formation [38] predicts a mass-metallicity relationship, which has been observed in the solar system [44] and also reported for exoplanets [47]. The relatively water-poor atmosphere of V1298 Tau b that we find in this work indicates that this planet must have spent most of its accretion phase within the water ice line, thereby failing to accrete volatile-rich pebbles [48]. The volatile content of the inner disk can be strongly affected by the growth of massive planets in the outer part of the disk [49]. In this scenario, a massive planet formed beyond the water ice line blocks the supply of volatile-rich pebbles to the inner part of the disk, thereby making the inner disk dry and the timescale for core assembly by pebble accretion longer. RV constraints on the mass of V1298 Tau e put it in a Jupiter/sub-Jupiter range [20,21,24] with a possible orbital period greater than 40 days. Therefore, pebble filtering could play an important role in this system by producing volatile-poor atmospheres for the inner planets.
Alternatively, V1298 Tau b could have accreted volatile-rich material that ended up locked in the interior of the planet. In addition, young planets could experience extreme mass loss driven by the intense XUV flux of their active host stars. Using updated mass constraints for this planet, we simulated the mass and radius evolution (see Figure 3 and Methods, Atmospheric evolution models). We estimate the Jeans escape parameter [50] to be 27. Our calculations suggest that V1298 Tau b is susceptible to photoevaporation, in contrast to the conclusions obtained in [26] based on RV mass estimates. V1298 Tau b may lose up to a few Earth masses within the first 1 Gyr of its life. The rocky pebble/planetesimal accretion theory of planet formation [51] predicts a gradually mixed interior structure, as observed for Jupiter [52]. We show two possible interior and evolution models for V1298 Tau b (Extended Data Figure 7): the core-envelope structure and the diluted core structure. The observed mass, radius and metal-poor envelope can all be explained by both models. However, in the diluted core scenario, the atmospheric metallicity is expected to evolve, both through removal of the upper layer of the atmosphere by mass loss and through convective mixing in the interior, which could ultimately reconcile V1298 Tau b with the mature exoplanet population [53].
Self-consistent atmospheric modelling for V1298 Tau b shows that we require an extremely high internal temperature (∼400 K) and strong vertical mixing to explain the non-detection of methane. We show that the internal temperatures from the early evolution models (Extended Data Figure 7) are consistent with theoretical expectations [37]. Internal temperatures as high as 300-400 K may require external heating mechanisms, such as tidal heating [54]. Alternatively, photolytic destruction of methane could also potentially produce a methane-poor atmosphere [55], which may be feasible given the youth and high activity levels of V1298 Tau. We test this hypothesis by running a self-consistent forward model using the published UV spectrum of V1298 Tau [56]. However, for V1298 Tau b photochemistry does not impact the methane abundance at pressures higher than $10^{-4}$ bar, even for an extreme case (1000×solar XUV flux, Supplementary Figure 4).
In conclusion, a strong detection of a water vapour absorption feature in the NIR spectrum of V1298 Tau b allows us to put a stringent upper limit on its mass (24 ± 5 M⊕). The observed spectrum does not exhibit a methane feature and is best interpreted with a solar/sub-solar metallicity atmosphere. Leveraging the absence of a spectral signature of methane, we provide constraints on the internal temperature of the planet. We find that the V1298 Tau system is likely to have formed either late, within the water ice line in a gas-poor, dry and depleting protoplanetary disk, or early in the inner region of the disk with an accretion rate likely moderated by disk gas replenishment or enhanced envelope opacity. V1298 Tau b is likely to undergo atmospheric mass loss and could end up as a Neptune, a low density sub-Neptune, or even potentially a super-Earth (see Figure 3).
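The Jeans escape parameter quoted above can be checked to order of magnitude with a short calculation. The sketch below is illustrative only: it assumes the standard definition λ = G M m / (k T R) evaluated for atomic hydrogen, and it takes the 24 M⊕ mass upper limit, the 0.84 R_J HST white light curve radius and the 670 K equilibrium temperature quoted elsewhere in this paper as inputs. The inputs actually used for the quoted value of 27 are not stated, so only approximate agreement should be expected.

```python
# Physical constants in SI units
G = 6.6743e-11          # gravitational constant [m^3 kg^-1 s^-2]
K_B = 1.380649e-23      # Boltzmann constant [J/K]
M_H = 1.6735575e-27     # mass of a hydrogen atom [kg]
M_EARTH = 5.9722e24     # Earth mass [kg]
R_JUP = 7.1492e7        # Jupiter equatorial radius [m]


def jeans_parameter(mass_kg, radius_m, temperature_k, particle_mass_kg=M_H):
    """Ratio of gravitational binding energy to thermal energy of a particle."""
    return G * mass_kg * particle_mass_kg / (K_B * temperature_k * radius_m)


# Assumed inputs: mass upper limit, HST white-light radius, equilibrium temperature
jeans_lambda = jeans_parameter(24 * M_EARTH, 0.84 * R_JUP, 670.0)
print(f"Jeans escape parameter ~ {jeans_lambda:.1f}")
```

With these assumed inputs the parameter comes out in the high twenties, the same regime as the value quoted in the text.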

Methods Observation
The observations were taken using the HST/WFC3 G141 grism in bi-directional spatial scanning mode, covering a range of 1.1-1.7 µm, with a scan rate of 0.23 arcsec/s. This resulted in 180 exposures over 10 HST orbits. The individual pixels reached a maximum flux level of 30,000 electrons, which is roughly 40% of the saturation level and well within the linear response regime of the detector. We used the 256 × 256 pixel subarray and the SPARS25, NSAMP=5 readout mode, which resulted in 88.4 s exposures.

Data reduction
We use a custom data reduction pipeline for our data analysis [27,57]. The WFC3 IR detector is read multiple times non-destructively (without flushing out the accumulated charge) during an exposure. First, sub-exposures are formed for each exposure by subtracting consecutive non-destructive reads, and each sub-exposure is reduced separately for improved background subtraction and cosmic ray rejection. We calculate a wavelength solution by matching the first exposure of the visit to a convolution of a PHOENIX stellar spectrum [58] for V1298 Tau ($T_{\rm eff}$ = 4920 K) with the response function of G141.
We apply a wavelength-dependent flat-field correction and mask bad pixels flagged by calwf3 with data quality DQ=4, 32 or 512. We then apply a local median filter to identify cosmic rays and clip pixels that deviate by more than five median deviations. On average, we find 0.53% of pixels affected by cosmic rays in each sub-exposure. To account for the drift of the spectrum along the dispersion direction, we use the first exposure of the visit as a template and shift the spectrum of each exposure along the dispersion direction to match the template. The maximum shift that we measure is 0.3 pixels. Finally, we apply an optimal extraction algorithm [59] to each sub-exposure to maximize the signal-to-noise ratio. We shift and shrink the spectra of each sub-exposure to match the wavelength grid of the first sub-exposure, by a maximum of 1.05 pixels and 0.65%.

Light curve analysis
WFC3 light curves are known to exhibit strong time-dependent ramp-like (charge-trapping) and visit-long systematics [28,31,60]. The first orbit of each visit is known to have stronger systematics than the rest of the visit. Following common practice [61], we exclude this orbit from the rest of the analysis. We modelled the white light curve instrumental systematics using a charge-trapping model, RECTE [62]. The out-of-transit baseline is a combination of instrumental visit-long slopes, well known for HST/WFC3 time series observations [61], and rotational variability from the active young host star. Visit-long slopes have been modelled using linear functions in time [61,63]; however, the significant non-linearity exhibited by the baseline highlights the effects of stellar variability. We test polynomial functions of first to fourth order, as well as a sinusoidal function, to model the baseline. A polynomial of third order provides the best fit (lowest BIC value) to the observations. Therefore, we model the stellar baseline using a third-order polynomial and the stellar disk using a linear limb darkening model. The best-fit polynomial function is shown in Extended Data Figure 4 and shows ∼0.3% variability over the entire visit. The planetary transit signal is modelled using batman [64], where we fix the orbital parameters to known literature values [19][20][21]. We ran an MCMC using emcee [65] to estimate model parameter uncertainties (Figure 10). We find the ninth exposure of the seventh orbit to be affected by a satellite crossing event and exclude this exposure [66].
We generate 7-pixel-bin spectroscopic light curves from the reduced 1D stellar spectra across 17 wavelength channels. We detrend the spectroscopic light curves using a common-mode approach, given the deviations from the standard HST instrument systematics (possibly due to stellar activity). The common-mode divide-white method has been used previously for WFC3 analysis [28]; it is agnostic to the exact mathematical form of the instrument systematics, assuming only that they are wavelength independent. We model the spectroscopic light curves using a batman model and a linear stellar baseline. We fit for the linear limb darkening coefficient. The observed white light curve, best-fit transit model and the derived systematics function are shown in Figure 4. The systematics-detrended spectroscopic light curves, along with the residuals, are shown in Figure 5. We also derive the transmission spectrum by fitting each spectroscopic light curve with RECTE and polynomial stellar baseline models; the derived spectrum agrees within 1σ with the common-mode spectrum. However, the quality of the fits in the common-mode approach is superior. The residual noise in all the spectroscopic channels is less than 1.3 times the expected photon noise, and the average precision on the extracted transit depths is 47 ppm. The fitted transit depths and linear limb darkening coefficients are shown in Table 1. The rms noise is relatively high [67]; however, this could be a combination of stellar variability, spot crossings and high measured x-shifts.
We note a possible bright spot occultation in the third orbit and also a potential flaring event affecting the latter half of the seventh orbit (Extended Data Figure 4). To estimate the effect of these exposures on the derived transmission spectrum, we fit the spectroscopic light curves with and without these exposures. We do not find any change in the derived transmission spectrum, and the average residuals decrease by 3 ppm when these exposures are excluded. We conclude that the removal of these exposures does not significantly affect the spectrum. We also test the effect of the large horizontal drift of the telescope. We incorporate a linear function of the x-shifts as a correction factor for the white light curve fits, following the approach of [68]. We find ∆BIC=3 when we include the horizontal drift in the fitting algorithm, and hence conclude that including the effect of horizontal drifts is not statistically significant.
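The baseline model selection described above, in which polynomial orders are compared by BIC, can be illustrated with a minimal sketch. It assumes Gaussian, uniform uncertainties, uses synthetic data rather than the HST light curve, and fits only the baseline in isolation; the paper's fit also includes the transit and ramp models. The function names `bic` and `best_polynomial_order` are hypothetical and not taken from the authors' pipeline.

```python
import numpy as np


def bic(residuals, sigma, n_params):
    """BIC for Gaussian errors, up to an additive constant: chi^2 + k ln(n)."""
    chi2 = np.sum((residuals / sigma) ** 2)
    return chi2 + n_params * np.log(residuals.size)


def best_polynomial_order(t, flux, sigma, orders=(1, 2, 3, 4)):
    """Fit a polynomial of each order; return (order with lowest BIC, all BICs)."""
    scores = {}
    for order in orders:
        coeffs = np.polyfit(t, flux, order)
        resid = flux - np.polyval(coeffs, t)
        scores[order] = bic(resid, sigma, order + 1)
    return min(scores, key=scores.get), scores


# Synthetic out-of-transit baseline with a cubic trend (hypothetical numbers)
rng = np.random.default_rng(0)
t = np.linspace(-1.0, 1.0, 150)
sigma = 1e-4
flux = 1.0 + 1e-3 * t - 5e-4 * t**2 + 1.5e-3 * t**3
flux = flux + rng.normal(0.0, sigma, t.size)

best_order, scores = best_polynomial_order(t, flux, sigma)
for order, score in scores.items():
    print(f"order {order}: BIC = {score:.1f}")
```

Because each extra coefficient costs ln(n) in BIC, a higher order is preferred only when it reduces χ² by more than that penalty.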
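The divide-white step described above can be sketched as follows. This is an illustration on synthetic data, not the authors' pipeline: it assumes the systematics are exactly wavelength independent, and it substitutes a simple box-shaped transit for the batman model. The helper names `divide_white` and `box_transit` are hypothetical.

```python
import numpy as np


def divide_white(white_flux, white_model, spec_flux):
    """Common-mode correction.

    The systematics function is the white light curve divided by its
    best-fit transit model; every spectroscopic channel is divided by it.
    spec_flux has shape (n_channels, n_times).
    """
    systematics = white_flux / white_model
    detrended = spec_flux / systematics[np.newaxis, :]
    return detrended, systematics


def box_transit(t, depth, t0=0.0, duration=0.2):
    """Toy box-shaped transit standing in for a physical transit model."""
    model = np.ones_like(t)
    model[np.abs(t - t0) < duration / 2] -= depth
    return model


# Synthetic visit: one systematics curve shared by all wavelength channels
t = np.linspace(-0.3, 0.3, 120)
common_systematics = 1.0 + 2e-3 * t - 1e-3 * np.exp(-(t + 0.3) / 0.05)
white_model = box_transit(t, depth=0.0050)
white_flux = white_model * common_systematics

channel_depths = np.array([0.0048, 0.0052, 0.0050])
spec_models = np.array([box_transit(t, d) for d in channel_depths])
spec_flux = spec_models * common_systematics[np.newaxis, :]

detrended, systematics = divide_white(white_flux, white_model, spec_flux)
```

In this idealised case the detrended channels recover the injected transit models exactly; with real data, any wavelength-dependent residual remains in the channels and is absorbed by the linear baseline.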

Accounting for stellar activity
V1298 Tau is a young pre-main-sequence star, known to exhibit 2% variability in Kepler and TESS light curves [19][20][21]. Variability in such young stars can be attributed to photospheric inhomogeneity (star spots and faculae) and fast stellar rotation. Unocculted star spots can contaminate the observed transmission spectrum [69]. We estimate the effect of stellar contamination on the transmission spectrum of V1298 Tau b following the prescription of [13]. We adopt a surface inhomogeneity model (20% spot coverage) for V1298 Tau from [70]. Photospheric temperature contrasts have been studied for T Tauri stars [71]; stars with photospheric temperatures similar to V1298 Tau can have spot temperature contrasts of up to 1000 K. We estimated an extreme-case contamination spectrum for V1298 Tau assuming 20% spot coverage and a 1000 K spot temperature contrast. The contamination-corrected spectrum is consistent within 1σ with the uncorrected spectrum. A comparison between the corrected and uncorrected spectra is shown in Extended Data Figure 11. We re-run retrievals on the contamination-corrected transmission spectrum. The retrievals are identical in setup to the uncorrected case (see Methods, Atmospheric Models). The posterior distributions of the parameters are shown in Extended Data Figure 9 together with the posteriors from the uncorrected spectrum. All the parameters agree within 1σ between the two cases. The retrieved atmospheric metallicity in the corrected case prefers more sub-solar values than in the uncorrected case, thereby confirming the robustness of the conclusions drawn in this work. The retrieved parameters are shown in Table 2.
The effect of stellar absorption has been seen in the limb darkening coefficients [e.g. see 28]. We set the limb darkening coefficients as free parameters while fitting the spectroscopic light curves, and the results are tabulated in Table 1. The limb darkening coefficients do not show any effect of stellar absorption. To further convince ourselves that the water absorption feature we see in the spectrum of V1298 Tau b around 1.4 µm is of planetary origin, we define a quantity B as the ratio of the flux observed in two wavelength bands,
$$B = \frac{\int_{\lambda_1}^{\lambda_2} F \, d\lambda}{\int_{\lambda_3}^{\lambda_4} F \, d\lambda} \qquad (1)$$
where F is the electron count per unit wavelength in the 1D extracted spectra of our reduced exposures, λ1 and λ2 are the lower and upper limits of the first wavelength band, and λ3 and λ4 are the lower and upper limits of the second wavelength band. We calculate B for all the exposures, first using the wavelengths 1.25-1.35 µm (left end of the water feature) and 1.45-1.55 µm (right end of the water feature) (upper panel in Figure 6), and subsequently using 1.35-1.45 µm (centre of the water feature) and 1.45-1.55 µm (lower panel in Figure 6). For the latter case, we find an excess absorption during the transit of the planet, which indicates that the water absorption is of planetary origin.
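The band ratio test for B can be sketched numerically as below. The example is a hedged illustration on a synthetic flat spectrum: it assumes a hypothetical 400 ppm excess absorption confined to 1.35-1.45 µm during transit and uses trapezoidal integration. The function names are illustrative and do not come from the authors' code.

```python
import numpy as np


def band_flux(wavelength, spectrum, lo, hi):
    """Trapezoidal integral of the spectrum between lo and hi (same units)."""
    mask = (wavelength >= lo) & (wavelength <= hi)
    w, f = wavelength[mask], spectrum[mask]
    return np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(w))


def band_ratio(wavelength, spectrum, band_a, band_b):
    """B = (flux integrated over band_a) / (flux integrated over band_b)."""
    return (band_flux(wavelength, spectrum, *band_a)
            / band_flux(wavelength, spectrum, *band_b))


# Synthetic 1D spectra in microns (hypothetical, flat stellar spectrum)
wavelength = np.linspace(1.10, 1.70, 601)
out_of_transit = np.ones_like(wavelength)

# In transit: assumed 400 ppm excess absorption inside the water band only
in_water_band = (wavelength >= 1.35) & (wavelength <= 1.45)
in_transit = out_of_transit.copy()
in_transit[in_water_band] *= 1.0 - 4e-4

centre, right = (1.35, 1.45), (1.45, 1.55)
b_out = band_ratio(wavelength, out_of_transit, centre, right)
b_in = band_ratio(wavelength, in_transit, centre, right)
print(f"relative change in B during transit: {(b_in / b_out - 1) * 1e6:.0f} ppm")
```

A dip in B during transit for the centre band, with no dip for the two flanking bands, is the signature that the excess absorption follows the planet rather than the star.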

Atmospheric Models
We use the publicly available 1D radiative transfer code petitRADTRANS to retrieve the atmospheric properties of V1298 Tau b from its observed transmission spectrum. The transmission spectrum does not show the methane absorption signature around 1.6 µm that would be expected for a warm planet like V1298 Tau b based on equilibrium chemistry. The lack of methane can be explained by disequilibrium processes, such as vertical mixing [37,72] dredging up methane-poor gas from the hot interior parts of the atmosphere. In our retrieval framework, we modelled this effect using a 'quenching pressure': the VMRs of C-, H-, O- and N-bearing molecules are calculated using petitRADTRANS; however, above the quench point, the molecular concentrations are held constant. We model the atmospheric thermal structure with a Guillot T-P profile [36], shown in Eqn 2,
$$T^4 = \frac{3T_{\rm int}^4}{4}\left(\frac{2}{3}+\tau\right) + \frac{3T_{\rm equ}^4}{4}\left[\frac{2}{3}+\frac{1}{\gamma\sqrt{3}}+\left(\frac{\gamma}{\sqrt{3}}-\frac{1}{\gamma\sqrt{3}}\right)e^{-\gamma\tau\sqrt{3}}\right] \qquad (2)$$
where $T_{\rm equ}$ and $T_{\rm int}$ are the equilibrium and internal temperatures of the planet, $\tau = P\kappa_{\rm IR}/g$ is the infrared optical depth at pressure P for surface gravity g, $\kappa_{\rm IR}$ is the average infrared atmospheric opacity and γ is the ratio between the optical and IR opacities. We constrain the models by fixing $\kappa_{\rm IR}$ to 0.01 cm$^2$ g$^{-1}$ and γ to 0.01, assuming the atmospheric opacity in the observed bandpass to be water dominated. We include H2O, CH4, CO2 and CO opacities in our retrieval framework, as these molecular species have absorption features in the NIR [73]. We do not include HCN and NH3 opacities in our retrievals, as we do not find evidence of these species in free retrievals (see Supplementary Figure 1). We assume a grey cloud deck opacity model to simulate cloud absorption.
We fix the mass of the planet to 24 M⊕, based on the mass upper limit estimated from the scale height. The free parameters in our models are the atmospheric metallicity, C/O ratio, $R_p$ (radius of the planet at the reference pressure, 1 bar), $T_{\rm equ}$, $T_{\rm int}$, $P_{\rm quench}$ and $P_{\rm cloud}$. We run an MCMC with 3,000 burn-in steps and 30,000 post-burn-in steps with 50 walkers. We put uniform priors on the fitting parameters. The posterior distribution of the fitted parameters is shown in Figure 9. We retrieve a sub-solar/solar metallicity. The retrieved equilibrium temperature is consistent with the expected equilibrium temperature of the planet. The retrieved parameters are summarized in Table 2.
We test the importance of the internal temperature by fixing the internal temperature to 0 K (i.e. fitting for an isothermal atmosphere). High internal temperature models are statistically favoured by ∆BIC=50. We also perform free retrievals using an isothermal atmosphere (see Supplementary Information Figure 1). This yields an upper limit on the methane Volume Mixing Ratio (VMR) in the atmosphere (∼$10^{-6}$) that is lower than the detection threshold for HST [37], thereby independently confirming the non-detection of methane. The free retrieval did not find evidence for other molecular species such as HCN and NH3, putting upper limits of $10^{-6}$ on their VMRs. We explore the effect of fixing the planet's mass in Extended Data Figure 8 and Methods, Mass estimate.
We construct self-consistent atmospheric models with varying internal temperatures to study the quenching of methane and CO in the deep atmosphere. We compute the T-P profile using petitCODE [74,75], assuming radiative-convective equilibrium. Irradiation onto the planet is computed assuming planet-wide energy redistribution, with a host star effective temperature and radius of 4970 K and 1.31 R⊙, a semi-major axis of 0.1688 AU, and planetary intrinsic temperatures of 100-400 K. Using our retrievals as guidance, a solar metallicity was adopted with a slightly sub-solar C/O of 0.3. We achieved this C/O by reducing the carbon abundance from its solar value. The resulting temperature profiles are shown in Supplementary Figure 12. Subsequently, we use a 1D chemical kinetics model [76] in combination with a photochemical network [77] to calculate self-consistent vertical quenching pressures for the main atmospheric species. We perform our calculations with a constant eddy diffusion coefficient ($K_{zz}$) of $10^{10}$ cm$^2$/s. This value, although high, is in line with the expected values for convective mixing in giant planets and brown dwarfs [e.g. 78-80]. We include photochemistry in our models; however, we find that it does not significantly affect the molecular abundances at pressures typically probed by transmission spectroscopy. We test the effect of a higher XUV flux from the host star by computing models for scaled solar spectra (10-1000 times; Supplementary Figures 3 and 4). The resulting chemical disequilibrium abundances for methane, CO and water are shown in Supplementary Figure 13. We find that the planet should have a high internal temperature (∼300-400 K) for the carbon chemistry to be CO dominated. This is consistent with the high internal temperature and deep quenching concluded from the retrieval analysis.
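To make the role of the thermal-structure parameters concrete, the sketch below evaluates an analytic Guillot-type temperature-pressure profile in the parameterisation with $T_{\rm equ}$, $T_{\rm int}$, $\kappa_{\rm IR}$ and γ. It is a standalone illustration rather than a call into petitRADTRANS. The surface gravity of 265 cm/s² is an assumption derived from the 24 M⊕ mass upper limit and the 0.84 R_J HST radius quoted in this paper, and the function name is hypothetical.

```python
import numpy as np


def guillot_temperature(pressure_bar, kappa_ir, gamma, gravity, t_int, t_equ):
    """Analytic Guillot-type T-P profile.

    pressure_bar : pressure in bar
    kappa_ir     : mean infrared opacity [cm^2/g]
    gamma        : ratio of optical to infrared opacity
    gravity      : surface gravity [cm/s^2]
    t_int, t_equ : internal and equilibrium temperatures [K]
    """
    # Infrared optical depth; 1 bar = 1e6 dyn/cm^2
    tau = np.asarray(pressure_bar) * 1e6 * kappa_ir / gravity
    sqrt3 = np.sqrt(3.0)
    internal = 0.75 * t_int**4 * (2.0 / 3.0 + tau)
    irradiation = 0.75 * t_equ**4 * (
        2.0 / 3.0
        + 1.0 / (gamma * sqrt3)
        + (gamma / sqrt3 - 1.0 / (gamma * sqrt3)) * np.exp(-gamma * tau * sqrt3)
    )
    return (internal + irradiation) ** 0.25


# Values fixed in the text: kappa_IR = 0.01 cm^2/g and gamma = 0.01
pressures = np.logspace(-6, 2, 50)           # bar
gravity_cgs = 265.0                          # assumed, see lead-in
cool = guillot_temperature(pressures, 0.01, 0.01, gravity_cgs, 100.0, 670.0)
hot = guillot_temperature(pressures, 0.01, 0.01, gravity_cgs, 400.0, 670.0)
print(f"T at 100 bar: {cool[-1]:.0f} K (T_int=100 K), {hot[-1]:.0f} K (T_int=400 K)")
```

In this parameterisation the internal temperature mainly changes the deep atmosphere, which is the region that sets the quenched methane abundance.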

Mass estimate
We estimate the mass of V1298 Tau b from the transmission spectrum using the approach described in [32].
We use the radius measurement from Kepler [18] and an equilibrium temperature of 670 K for the calculation. We estimate the scale height from the observed spectrum of V1298 Tau b. The height of an atmosphere can be estimated using Equation 1 of [81]. In Eqn 4, σ is the absorption cross section at a given wavelength and z is the measured radius of the planet at a given wavelength. We estimate that the 1.4 µm water absorption feature spans 2.7 scale heights, assuming a water-dominated atmospheric opacity and a cloud-free atmosphere. Given the young age and inflated size, we assume a primordial H/He-rich atmosphere and fix the mean molecular mass to 2.33. We find a large atmospheric scale height for V1298 Tau b, 1000 ± 200 km, and a mass estimate of 24 ± 5 M⊕. The reported radius of V1298 Tau b differs slightly between measurements from different epochs. The K2 [18] and TESS [21] values differ by ∼2σ, whereas [20] estimate a radius in between. We estimated the planet mass using all three radius measurements, the results of which are tabulated in Supplementary Information Table 1. We also used the HST white light curve radius (0.84 ± 0.003 R_J) to estimate the planet mass. All the estimates are consistent with each other within 1σ, and to be conservative we adopt the highest estimate, that from K2 (24 ± 5 M⊕).
This estimate can be interpreted as an upper limit, given the assumption of a cloud-free atmosphere. In the case of a cloudy atmosphere, the scale height measured from the spectrum would be underestimated, leading to an overestimation of the mass. Given the observed spectrum, a cloud-free case would therefore yield the maximum possible mass for this planet.
To estimate the impact on atmospheric parameters, we run retrievals on the observed transmission spectrum with the temperature fixed to 670 K, for different masses (24, 15, 10 and 5 M⊕) and for both cloud-free and cloudy cases (Extended Data Figure 8). We include the same molecules as in the 24 M⊕ case, as molecular opacities do not depend on the planet's gravity. For the 5 M⊕ case, our retrievals did not converge, as the model could not reproduce the water absorption signal. We can, however, fit the observations with the 24, 15 and 10 M⊕ models. We find that cloudy models are statistically favoured over cloud-free models. The 24 M⊕ (mass upper limit) model converges at solar atmospheric metallicity; for lower-mass models, our retrievals converge at even lower (0.1-0.01 solar) metallicities to fit the water absorption feature. We test the robustness of the estimated mass upper limit by running a retrieval with 40 M⊕. This model fails to reproduce the observed water feature and can be rejected at high confidence. To further test the robustness of the mass estimate, we run an atmospheric retrieval keeping the mass as a free parameter. The posterior distribution is shown in Supplementary Information Figure 2. The mass posterior peaks around 10 M⊕. The metallicity in this case yields an upper limit of the solar value at 2σ. Therefore, the conclusions of a mass less than 24 M⊕ and a solar/sub-solar atmospheric metallicity appear robust based on this test.
Thus, from the transmission spectrum we can estimate a robust mass upper limit, and conclude that V1298 Tau b is likely to be a Neptune, a low-density sub-Neptune or potentially a super-Earth progenitor [45].
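The link between the measured scale height and the mass can be illustrated with a short calculation. The sketch assumes the isothermal scale height relation H = kT/(μ m_u g) combined with g = GM/R², and takes T = 670 K, μ = 2.33 and H = 1000 ± 200 km from the text. For the radius it uses the 0.84 R_J HST white light curve value quoted above, since the K2 radius is not given numerically here, so the result is expected to agree with 24 ± 5 M⊕ only within the uncertainties. The function name is hypothetical.

```python
# Physical constants in SI units
G = 6.6743e-11          # gravitational constant [m^3 kg^-1 s^-2]
K_B = 1.380649e-23      # Boltzmann constant [J/K]
M_U = 1.66053907e-27    # atomic mass unit [kg]
M_EARTH = 5.9722e24     # Earth mass [kg]
R_JUP = 7.1492e7        # Jupiter equatorial radius [m]


def mass_from_scale_height(scale_height_m, radius_m, temperature_k, mu):
    """Invert H = k T / (mu m_u g) with g = G M / R^2 to obtain M in kg."""
    gravity = K_B * temperature_k / (mu * M_U * scale_height_m)
    return gravity * radius_m**2 / G


scale_height = 1.0e6         # 1000 km, from the text
scale_height_err = 2.0e5     # 200 km, from the text
mass_kg = mass_from_scale_height(scale_height, 0.84 * R_JUP, 670.0, 2.33)
mass_earth = mass_kg / M_EARTH

# M is proportional to 1/H, so the fractional errors are equal to first order
mass_err_earth = mass_earth * scale_height_err / scale_height
print(f"M ~ {mass_earth:.1f} +/- {mass_err_earth:.1f} Earth masses")
```

Because M scales as 1/H, clouds that mute the feature and reduce the apparent scale height push the inferred mass upward, which is why the cloud-free value is treated as an upper limit.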

Atmospheric evolution models
The atmospheric evolution models shown in Figure 3 have been simulated using the open source platypos code [45]. The code calculates the mass loss rate at a given point in time using the energy-limited mass loss formalism [e.g. see 82,83] and evolves the planet's physical properties (mass and radius) at every step of the calculation. The radius evolution is a combined effect of atmospheric contraction and mass loss, and the updated size of the planet is calculated from the scaling relation given in [42]. We adopt the stellar luminosity from [45]. These simulations have been performed considering the estimated mass upper limit (24 M⊕). For lower masses we can expect higher mass loss rates.
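The energy-limited formalism referred to above can be sketched in a few lines. The snippet below is not the platypos implementation; it is a standalone illustration of the commonly used expression Ṁ = ε π (β R_p)² R_p F_XUV / (K G M_p). The function name, the XUV flux value, and the choices β = 1 (XUV absorption radius equal to the planetary radius) and K = 1 (no Roche-lobe correction) are placeholder assumptions made here, not values from this work.

```python
import math

G = 6.67430e-11        # gravitational constant [m^3 kg^-1 s^-2]
R_JUP = 7.1492e7       # nominal equatorial Jupiter radius [m]
M_EARTH = 5.9722e24    # Earth mass [kg]


def energy_limited_mass_loss(f_xuv, radius_m, mass_kg, efficiency,
                             beta=1.0, k_tide=1.0):
    """Energy-limited mass loss rate in kg/s.

    f_xuv      : XUV flux at the planet [W/m^2]
    efficiency : heating efficiency (epsilon)
    beta       : ratio of XUV absorption radius to planetary radius (assumed 1)
    k_tide     : Roche-lobe correction factor (assumed 1, i.e. no correction)
    """
    absorbed_power = math.pi * (beta * radius_m) ** 2 * f_xuv
    specific_binding = k_tide * G * mass_kg / radius_m  # [J/kg]
    return efficiency * absorbed_power / specific_binding


F_XUV = 10.0            # hypothetical placeholder flux [W/m^2], NOT from the paper
RADIUS = 0.84 * R_JUP   # HST white light curve radius quoted in the text

for mass_earth in (24.0, 10.0):
    for eps in (0.1, 0.3, 0.5):   # efficiencies explored in Figure 3
        rate = energy_limited_mass_loss(F_XUV, RADIUS, mass_earth * M_EARTH, eps)
        print(f"M = {mass_earth:4.0f} M_earth, eps = {eps:.1f}: {rate:.3e} kg/s")
```

At fixed radius and flux the rate scales linearly with the efficiency and inversely with the planet mass, which is why lower assumed masses lead to higher mass loss rates.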

Comparison with Edwards (2022)
V1298 Tau b was included in a sample of 70 transiting exoplanets whose spectra are presented in [84]. The authors use a different pipeline (Iraclis [68]) for the data reduction, and also use a common-mode approach to derive the spectrum of this planet. The transmission spectrum obtained in this work is consistent within 1σ with the results of [84], except for a constant offset of ∼500 ppm. The constant offset is a result of [84] using the third orbit in their white light curve fits, which we choose to exclude because of a potential spot crossing in that orbit. We tested the effect of including the third orbit in the white light curve fits, and find a ∆BIC = 170 in favour of excluding it. The transmission spectra obtained by [84] and in this work are shown together for comparison in Figure 11.

Fig. 13 Chemical abundances (log molar fraction) of methane (black), CO (red), and water (blue) in the deep atmosphere, for T int = 100 K, 200 K, 300 K, and 400 K. Chemical abundances have been calculated using a self-consistent framework with petitCODE and a chemical kinetics model in combination with a photochemical network [76,77]. Arrows denote increasing T int. Molecular abundances are shown for a composition in chemical equilibrium (dashed) and for chemical kinetics, i.e. when vertical quenching is included (solid). We included photochemistry in our models; however, it did not impact the molecular abundances at observable pressures.

Fig. 15 Posterior distribution of the atmospheric retrieval on the uncorrected transmission spectrum of V1298 Tau b where mass was not fixed. The mass posterior distribution peaks around 10 M⊕, with a 2σ upper limit on the atmospheric metallicity at solar metallicity. The posterior distribution of the planet mass is in agreement with the upper limit quoted in this paper.

Fig. 1 Left panel: Observed HST/WFC3 transmission spectrum (without stellar activity correction) of V1298 Tau b (green squares) with 1σ error bars, from which an upper limit on the planet mass (24 M⊕) is determined. Atmospheric retrievals with the estimated mass upper limit show that the observations are consistent with solar/sub-solar atmospheric metallicity (solid blue line). The dash-dotted black line shows a transmission spectrum for a 100 M⊕, solar metallicity model, and the orange line represents a 24 M⊕ model with 100 times solar metallicity. Both these models fail to capture the amplitude of the water feature and can be ruled out at ∼5σ confidence. The red dotted line represents an isothermal model, which shows an absorption feature around 1.6 µm due to methane. An isothermal equilibrium chemistry model without high internal temperature and vertical mixing fails to explain the observed spectrum around 1.6 µm (see Results section). Retrievals with lower masses are explored in Methods, Mass estimate (see Extended Data Figure 8). The stellar activity corrected transmission spectrum (see Methods, Stellar Activity) for V1298 Tau b is consistent within 1σ with the observed uncorrected spectrum. Right panel: Retrieved T-P profile (24 M⊕ model) with the 1σ confidence interval (red shaded region). The dashed lines of different colours represent the equilibrium abundances for the chemical species included in our model (calculated for the red solid T-P profile). The magenta and blue dotted lines and the corresponding shaded regions show the location of the retrieved grey cloud deck and quenching pressure from our retrieval analysis (see Results section and Table 2).

Fig. 3 Mass (upper panel) and radius (lower panel) evolutionary tracks simulated for V1298 Tau b during the first gigayear using the energy-limited atmospheric evolution models implemented in the platypos code [45] (see Methods, Atmospheric evolution models). The radius evolution is a combined effect of atmospheric contraction and mass loss. Simulations for different values of the mass loss efficiency parameter (0.1, 0.3, 0.5) are shown with different colours. The solid lines show a high stellar activity track (activity timescale 250 Myr) and the dash-dotted lines show a low stellar activity track (activity timescale 100 Myr; for details see [45]). These models show that for moderate photoevaporation efficiency and high stellar activity, this planet is likely to lose mass and end up as a Neptune/sub-Neptune, or even potentially a super-Earth, depending on its mass.

Fig. 4 Upper panel: Observed white light curve (1.1−1.65 µm) of a primary transit of V1298 Tau b and best fit model (black solid lines). Second panel: Best fit planetary transit light curve model. Third panel: Systematics function model, Z(t), estimated by dividing the observed light curve in the upper panel by the best fit transit model shown in the second panel, following the prescription of [28]. The black dashed line shows the best fit baseline model. Lower panel: Residuals from the white light curve fits. The residuals in the third orbit indicate a possible bright spot crossing. The residuals at the end of the seventh orbit rise sharply, which could potentially be due to a flare event. Red and blue points denote forward and reverse scanned exposures respectively. Assuming wavelength independent instrumental and stellar systematics, Z(t) is used to de-trend the spectroscopically binned light curves. (For further details, see Methods, Light curve analysis.)

Fig. 7 Left: Evolution of radius (upper panel) and internal temperature (lower panel) for two possible formation-evolution tracks of V1298 Tau b: a core-envelope structure (black line) and a diluted core structure (red line). Solid and dotted lines represent simulations with sub-solar (0.1 solar) and solar envelope metallicity. Right: The metal distribution in the interior as a function of pressure for the two models at the planet's current age (23 Myr). Models are calculated for in-situ formation of planets with 35-45% H/He (in mass), starting from the Hill sphere radius. The evolution model is based on [53]. Both core-envelope and diluted core models can explain the current size, mass and low metallicity envelope of V1298 Tau b (see Discussion for more details).

Fig. 8 Comparison between the observed transmission spectrum of V1298 Tau b and retrievals using petitRADTRANS [85] with different planet masses. The red, green and blue lines represent the retrieved median models for 10, 15 and 24 M⊕ respectively, each with a grey cloud opacity. The corresponding dotted lines show cloud-free models at the same masses. The orange line represents a 5 M⊕ model; our retrievals failed to converge for this case. The black solid line shows a model with 40 M⊕. The observations can be fitted with the 24, 15 and 10 M⊕ models. For the lower mass cases our retrievals converged on extremely low atmospheric abundances (0.1-0.01 solar) to fit the water absorption feature. Our retrievals statistically favour cloudy models.

Fig. 9 The posterior distributions from retrievals on the uncorrected (red) and stellar activity corrected (blue) transmission spectra of V1298 Tau b, assuming a mass of 24 M⊕. We used a 1D atmosphere model with a Guillot T-P profile [36] and equilibrium chemistry with atmospheric quenching (see Methods, Atmospheric models for details). We retrieve an atmospheric metallicity consistent with sub-solar/solar for both cases. The retrieved parameters with their 1σ confidence intervals are shown in Table 2. The stellar activity corrected spectrum is shown in comparison with the uncorrected spectrum in Extended Data Figure 11. In the same figure, the contamination function [13] used for correcting the spectrum is also shown.

Table 1
Table showing the best fit transit depths, linear limb darkening coefficients and RMS residual compared to expected photon noise for V1298 Tau b.

Table 2
Retrieved atmospheric parameters of V1298 Tau b from its transmission spectrum. See Methods, Atmospheric models for details of the atmospheric models used.

Table 3
Table showing the mass estimated using the different radii reported in the literature for V1298 Tau b.