Abstract
Simulations predict that hot super-Earth sized exoplanets can have their envelopes stripped by photoevaporation, which would present itself as a lack of these exoplanets. However, this absence in the exoplanet population has escaped a firm detection. Here we demonstrate, using asteroseismology on a sample of exoplanets and exoplanet candidates observed during the Kepler mission, that while there is an abundance of super-Earth sized exoplanets with low incident fluxes, none are found with high incident fluxes. We do not find any exoplanets with radii between 2.2 and 3.8 Earth radii with incident flux above 650 times the incident flux on Earth. This gap in the population of exoplanets is explained by evaporation of volatile elements and thus supports the predictions. The confirmation of a hot-super-Earth desert caused by evaporation will add an important constraint on simulations of planetary systems, since they must be able to reproduce the dearth of close-in super-Earths.
Introduction
Models predict that the envelopes of exoplanets orbiting close to their host stars are stripped by photoevaporation, which should be evident as an absence of very hot super-Earth sized exoplanets. The simulations by ref. 1 show a deficit in the number of exoplanets with radii between 1.8 and 4 R_{⊕}, and that these exoplanets should become comparatively rare for fluxes exceeding 100 F_{⊕} due to photoevaporation. In addition, the simulations reveal a corresponding increase in the number of rocky planets with R<1.8 R_{⊕} caused by the presence of the stripped cores. The existence of a paucity in the radius distribution of close-in exoplanets caused by evaporation is also supported by other theoretical works^{2,3,4}.
Previous studies have detected a deficit of exoplanets in the radius-period (or semi-major axis) diagram^{4,5,6}. This so-called sub-Jovian pampas or sub-Jovian desert extends from 3 to 10 R_{⊕} for periods shorter than 2.5 days^{5,6}. However, the absence in the distribution of exoplanets caused by evaporation has escaped a secure confirmation^{2,3,4,5,6,7,8} primarily due to uncertain host star parameters. This can now be changed with asteroseismology. Asteroseismology studies the stellar pulsations, and it allows us to determine the properties of many exoplanet host stars to high accuracy^{9,10,11,12}, which in turn markedly improves the planetary properties.
NASA’s Kepler mission has provided high-quality data for thousands of potential exoplanets and their host stars^{7,13,14,15}. Here we exploit these data, using asteroseismology, to make a robust detection of the hot-super-Earth desert, a region in the radius-flux diagram completely void of exoplanets. We find that the hot-super-Earth desert is statistically significant and not caused by selection effects or false positives. The detection of the existence of a hot-super-Earth desert confirms that photoevaporation does play a role in shaping the exoplanet population that we see today. This imposes an important constraint on simulations of the formation and evolution of exoplanetary systems since this effect needs to be taken into account.
Results
The seismic sample of exoplanets
Using asteroseismology, we obtained accurate stellar mean densities and radii for 102 exoplanet host stars (both confirmed and candidate exoplanets). These are shown in an asteroseismic Hertzsprung–Russell Diagram in Fig. 1 (the methods used to determine the parameters are discussed in Methods, while Supplementary Table 1 contains the data). The asteroseismic mean densities and radii, combined with precise periods and transit depths as well as the stellar effective temperature, allowed us to calculate very precise planetary radii and incident fluxes for the subset of Kepler exoplanets that orbit the 102 host stars (typically more precise than 10%, see Fig. 5).
We determined the flux that the exoplanet receives from its host star, using the following expression for the time-averaged incident flux in units of the Earth value (assuming circular orbits):
F/F_{⊕} = (ρ_{*}/ρ_{⊙})^{−2/3} (P/1 yr)^{−4/3} (T_{eff}/T_{eff,⊙})^{4}    (1)
Here ρ_{*} is the stellar mean density obtained from asteroseismology, P is the orbital period, and T_{eff} is the effective temperature, with T_{eff,⊙}=5,778 K being the effective temperature of the Sun. To find the radius we used the planet–star radius ratio (R_{p}/R_{*}), which can be obtained from the transit depth (δF/F), and the stellar radius from grid-modelling:
R_{p} = (R_{p}/R_{*}) R_{*}    (2)
The periods and the planet–star radius ratios have been obtained from ref. 16 (62 exoplanets) or the NASA Exoplanet Archive’s cumulative KOI (Kepler Object of Interest) list (http://exoplanetarchive.ipac.caltech.edu/cgi-bin/TblView/nph-tblView?app=ExoTbls&config=cumulative, accessed on 1 July 2015) with preference given to the former. The uncertainties were estimated using propagation of (Gaussian) uncertainties, where the dominant contribution to the uncertainty on the incident flux stems from the temperature uncertainty.
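The flux and radius calculation can be sketched as follows. This is a minimal illustration, assuming the scaling F ∝ ρ_{*}^{−2/3} P^{−4/3} T_{eff}^{4} that follows from Kepler's third law and the Stefan–Boltzmann law for circular orbits, with the stellar density and radius in solar units. The function names, the solar-to-Earth radius conversion factor and the first-order error propagation are our assumptions for the example and not necessarily the exact implementation used in the analysis.

```python
import math

T_EFF_SUN = 5778.0          # K, solar effective temperature quoted in the text
R_SUN_IN_R_EARTH = 109.1    # approximate conversion factor (assumption)
DAYS_PER_YEAR = 365.25


def incident_flux(rho_star, period_days, teff):
    """Time-averaged incident flux in Earth units for a circular orbit.

    rho_star is the stellar mean density in solar units, period_days the
    orbital period in days and teff the effective temperature in K.
    """
    period_yr = period_days / DAYS_PER_YEAR
    return (rho_star ** (-2.0 / 3.0)
            * period_yr ** (-4.0 / 3.0)
            * (teff / T_EFF_SUN) ** 4)


def planet_radius(radius_ratio, r_star):
    """Planet radius in Earth radii from R_p/R_* and R_* in solar radii."""
    return radius_ratio * r_star * R_SUN_IN_R_EARTH


def flux_relative_uncertainty(rel_rho, rel_period, rel_teff):
    """First-order Gaussian propagation of relative uncertainties on the flux."""
    return math.sqrt((2.0 / 3.0 * rel_rho) ** 2
                     + (4.0 / 3.0 * rel_period) ** 2
                     + (4.0 * rel_teff) ** 2)
```

Because the temperature enters to the fourth power, a 2% uncertainty on T_{eff} alone already gives an 8% uncertainty on the flux, which illustrates why the temperature term dominates the error budget.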
The hot-super-Earth desert
In Fig. 2, we show the exoplanet radius as a function of the incident flux for 157 of the 162 exoplanets. Five exoplanets were removed from the subsample because their radius estimates had an uncertainty in excess of 20% (in order to not have our sample polluted by bad data points, see Methods for details). For illustration we also show in Fig. 2 all Kepler KOIs with apparent sizes below 30 R_{⊕} determined to better than 20%, that have a calculated flux and are not in our seismic sample (the non-seismic sample; the incident fluxes and radii for these KOIs have been taken from the NASA Exoplanet Archive).
Figure 2 clearly displays a complete absence of exoplanets with sizes between 2.2 and 3.8 R_{⊕} and an incident flux above 650 times the Earth value (shaded area in Fig. 2). We constrained the size of the empty region by bootstrapping with 1 million iterations of the exoplanets present in the seismic subsample and using the boundaries that left the region empty in 95.45% (2σ) of the iterations (see Methods for further information). This empty region in the radius-flux diagram is the hot-super-Earth desert, and its location agrees with the theoretical prediction^{1}. We note that some data points from the non-seismic sample will fall in the region of the hot-super-Earth desert if no cut is made to weed out uncertain data points (see for example Fig. 7 of ref. 7 and Methods).
As the boundaries and Fig. 2 suggest, we opted to model the hot-super-Earth desert as a simple box region. While such a simplistic model may not capture the full effects of evaporation on the planet population, we do believe it to encompass the main features. A more sophisticated model taking into account how the amount of evaporation scales with incident flux and planet mass could be a next step.
As an aside, it should be pointed out that in the seismic subsample shown in the radius-flux diagram in Fig. 2, KOI-4198.01 (which we call Zenta) appears somewhat isolated. If Zenta is a bona fide exoplanet, then it could potentially be a very interesting object, since it has the highest incident flux of the exoplanets in the seismic subsample, and it is below 1 R_{⊕} in size. We have inspected the light curve of Zenta to make sure the transits look like genuine exoplanet transits, and we have obtained a few spectra of the host star with the Nordic Optical Telescope (on La Palma). These spectra will be the subject of a subsequent analysis.
Significance
We employed different techniques to assess the significance of the hot-super-Earth desert. First, under the assumption that the period distribution, or equivalently the incident flux distribution, does not change with planet radius^{17} (the null hypothesis), we tested whether the hot-super-Earth desert could occur by chance. This was done by drawing exoplanets randomly from the planet population below 650 F_{⊕} and counting how many exoplanets fell in the radius range of the desert (see Methods for details). We find that only 8 of our 10 million simulations returned zero exoplanets in the desert. Thus, it is very unlikely to observe the desert if the incident flux is not a function of radius, which is in agreement with our observation of a gap in the distribution.
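The resampling test of the null hypothesis can be illustrated with a short sketch. We assume here that the radii of the planets below 650 F_{⊕} are resampled with replacement to populate the high-flux region, and that the number of draws per trial equals the number of planets observed above the flux limit; both choices, as well as the function and argument names, are assumptions made for the example.

```python
import numpy as np


def chance_of_empty_desert(cool_radii, n_hot, n_trials=10_000,
                           r_lo=2.2, r_hi=3.8, seed=0):
    """Fraction of random trials that leave the desert radius range empty.

    cool_radii: radii (Earth radii) of planets with flux below the limit.
    n_hot: number of planets to draw per trial (hypothetical choice: the
           number of planets observed above the flux limit).
    """
    rng = np.random.default_rng(seed)
    cool_radii = np.asarray(cool_radii, dtype=float)
    # Each row is one simulated high-flux population drawn under the null.
    draws = rng.choice(cool_radii, size=(n_trials, n_hot), replace=True)
    in_desert = (draws >= r_lo) & (draws <= r_hi)
    return float(np.mean(in_desert.sum(axis=1) == 0))
```

A small returned fraction means that an empty desert is rarely produced by chance when the flux distribution is independent of radius.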
Second, we used a Gaussian mixture model to represent the seismic subsample as it would look with no desert. Here the underlying assumption is that the radius-flux distribution can be described by a sum of log-normal distributions^{18}. From the model we created a histogram (Fig. 3), and we found that fewer than 0.4% of our simulations return the observed number of planets (zero) in the region of the hot-super-Earth desert (see also Methods). This shows that the gap in the radius-flux diagram is significant. It is worth noting that from the non-seismic sample alone, this inference cannot be made (it gives a p value of 8%). We do not believe that this could be due to selection effects between the seismic and the non-seismic sample, since any detected hot super-Earth planet would have been a high-priority target to the Kepler mission. From the Gaussian mixture model treatment of the seismic sample, we also found a slight, although not statistically significant, overdensity below the desert (see Fig. 3), similar to that expected if the rocky cores are left over from evaporation^{1}.
Selection effects and false positives
It should be emphasized that there are some selection biases in the sample. The limitations in the detection sensitivity of Kepler are the reason for the lack of small exoplanets with low incident flux. Also, for the asteroseismic subsample, the selected stars were on the Kepler short-cadence target list, which is the reason for the low number of large exoplanets with low incident flux (short-cadence slots were prioritized for multi-planet systems over single-planet systems, which favours small planets^{19}, and exoplanets showing many transits were discovered early in the mission and kept). While the completeness of the sample is hard to quantify^{14,20}, no known selection effects^{21} would produce the paucity that we observe and the sample is complete down to 2 R_{⊕} for short-period exoplanets^{8} (additionally, any missing small planets from below the gap would only make the desert more pronounced). We also attempted to account for detection biases in our sample by imposing a signal-to-noise ratio (SNR) criterion and assuming that no exoplanets meeting that criterion with a radius above 1.4 R_{⊕} would have been missed^{9}. We found this not to affect the presence of the hot-super-Earth desert (see Fig. 7 and Methods for further details).
Despite our basic vetting, the seismic subsample of exoplanets will contain some false positives (FPs). The overall FP-rate for the sample is found to be low^{22} (in particular for the multi-planet systems^{23}), but it does vary over the sample. For example, the FP-rate is lower for exoplanets with radii 2–4 R_{⊕} than for those with smaller or larger radii^{24}. However, clearly no FPs have filled the hot-super-Earth desert, and our simulations show we would not be significantly affected by the presence in the sample of the percentage of FPs suggested by ref. 24 (see Methods).
The sub-Jovian pampas
A trend, agreeing with our results, has been seen in the radius-period (or semi-major axis) diagram by previous studies^{4,5,6}. They detected a deficit of exoplanets in the radius-range 3–10 R_{⊕} with periods shorter than around 2.5 days (the sub-Jovian pampas or sub-Jovian desert)^{5,6}. While both our hot-super-Earth desert and the sub-Jovian pampas lie at high temperatures (be it high incident flux or short periods), the radius-range is somewhat different, since we find the hot-super-Earth desert to extend only up to 3.8 R_{⊕}. Therefore, we investigated the radius-range of the hot-super-Earth desert further to determine whether it could be an extension of the sub-Jovian pampas.
Of the exoplanets in the seismic subsample above our flux boundary of 650 F_{⊕}, four exoplanets are present above 10 R_{⊕}. These are all confirmed exoplanets (Kepler-1b, 2b, 7b and 14b), and thus agree with the upper limit set by the previous studies. In the radius-range between 4 and 10 R_{⊕}, another four exoplanets are present in our seismic sample. Of these, two are confirmed exoplanets (Kepler-4b and 56b) and a third (KOI-5.01) is a candidate in a multi-planet system (where the FP-rate is lower^{23}). Most important to the location of the upper boundary of the hot-super-Earth desert is Kepler-4b, which is located at R=4.2 R_{⊕} and F=1243 F_{⊕} (see Fig. 2), and thus effectively sets the upper boundary. Kepler-4b has a density of around 1.9 g cm^{−3} (similar to the density of Neptune), and is consequently volatile rich^{25}, which agrees with its location above the desert. Similarly, Kepler-56b also has a density estimate consistent with a volatile-rich composition^{26}, thus also agreeing with its location above the desert.
We have examined the transits of Kepler-4b for evidence that it could be evaporating, but we failed to find any asymmetry in the transits or any transit-to-transit depth variations, which could both indicate atmospheric loss^{27,28}. Still, it cannot be ruled out that evaporation of Kepler-4b could be ongoing at a level that we cannot detect, and possibly at a level that does not influence the radius evolution of the planet.
To assess the significance of an extended gap scenario, we tested two additional sets of boundaries, allowing for the presence of exoplanets from the seismic sample within the gap. The first scenario had the same flux boundary as the hot-super-Earth desert (650 F_{⊕}), but spanned the radius-range from 2.2 to 10 R_{⊕} in agreement with the upper limit stated for the sub-Jovian pampas. This meant that four seismic exoplanets were present in the tested region (Kepler-4b, Kepler-56b, KOI-5.01 and KOI-1314.01). However, since the sub-Jovian pampas was defined in orbital period rather than incident flux, we cannot replicate the exact limit in the radius-flux diagram found by for instance ref. 6. Thus, we also considered the possibility that we should move the boundary to higher incident flux. Therefore, we tested a region with the aforementioned limits in radius, but bounded by an incident flux of 1,000 F_{⊕} instead of 650 F_{⊕}, which only leaves Kepler-4b in the region of the gap (even though a flux limit this high does not seem to agree with the sharp cut-off in the seismic sample in the 2.2–3.8 R_{⊕} region). We find that both of the tested scenarios are less significant than the hot-super-Earth desert, with the 650 F_{⊕} radius-extended scenario being by far the least significant one.
It can be noted in connection to the sub-Jovian pampas that some exoplanets seem to occupy that gap^{6,29}, and that they do not all appear to be FPs^{29}. Reference 29 investigates three planet candidates located in the sub-Jovian pampas and finds that two of the three are likely true planets. While these two planets fall comfortably within the sub-Jovian pampas, one of them is too large to fall in the hot-super-Earth desert, and the other one has uncertainties large enough that it could as well be outside the hot-super-Earth desert (it sits <1σ from the upper radius-limit^{29}).
Discussion
For exoplanets in the radius range in question, radius is thought to be a good proxy for composition^{21,30}. This allows for the transition from a predominantly rocky to a volatile-rich makeup to be expressed in terms of radius, and this transition has been found to occur around 1.6–1.8 R_{⊕} by different studies^{21,30,31}. Thus, the majority of exoplanets in the 2.2–3.8 R_{⊕} range are expected to be volatile rich, though some of them could be water worlds^{21} (for comparison, the radius of Neptune is ∼3.8 R_{⊕}). This agrees with the theory that these exoplanets could be stripped of their envelopes when they are too close to their host star. Thus, we can infer from our hot-super-Earth desert that hot exoplanets below ∼2.2 R_{⊕} most likely have a predominantly rocky composition.
Dynamical interactions may in principle also be responsible for shaping the gap in the radius-flux diagram, for example, due to orbital decay or inward migration of planets at late evolutionary stages. However, it seems unlikely that orbital decay played a major part in clearing out the particular part of parameter space associated with the hot-super-Earth desert since the planets would either need to be more massive or on shorter orbits^{32}. Other migration channels such as a combination of planet–planet scattering, tidal circularisation and the Kozai mechanism could have played a role in shaping the location of the hot-super-Earth desert through migration of exoplanets that were initially part of a triple (or larger) system^{33,34}. These effects have not been considered in our work, but they could be responsible for later migration of some of the planets that sit above the hot-super-Earth desert (and inside the sub-Jovian pampas, such as Kepler-4b). In addition, the flux boundary is likely a function of the planet mass with heavier planets being able to better withstand the evaporation. Therefore, while we find that the hot-super-Earth desert is more significant than the other regions we tested, we are not in a position to unambiguously decide whether the hot-super-Earth desert is an extension of the sub-Jovian pampas or a separate feature in the radius-flux diagram.
We have established the existence of a hot-super-Earth desert in the radius-flux diagram. Its presence confirms that photoevaporation plays an important role in planetary evolution, with the mass-loss history depending on the incident stellar flux. This represents a mechanism not seen in our own solar system, by which some volatile-rich exoplanets are stripped of their atmospheres by their host stars. Consequently, our detection of a hot-super-Earth desert will add an important constraint for simulations of the evolution of planetary systems.
Methods
Preparation of the power spectra
Asteroseismology is the study of stellar oscillations. In the case of solar-like stars, the frequencies of the oscillations are almost regularly spaced in a Fourier transform of the time series (a power spectrum, see the inset in Fig. 4). The dominant regular structure yields the large frequency separation, which carries information about the stellar mean density^{35}.
We have searched all 275 exoplanet host stars with a Kepler magnitude brighter than 13.5 and with short-cadence Kepler data (sampled every 58.85 s) for an asteroseismic signal. A magnitude limit of 13.5 was chosen since we have essentially no chance of detecting oscillations in a solar-like star fainter than this^{36}. To be able to search for the large frequency separation (Δν) for each of the stars, we first made weighted power spectra. The power spectrum for each star was calculated in the following manner: (1) The time series for each quarter (data from Kepler are divided into quarters of ∼3 months duration due to the roll of the spacecraft) was cleaned for bad data points using sigma-clipping (with 4σ) of a high-pass filtered time series (high-pass filter was 7 min) to take out the effect of all slow variations. (2) Using a high-pass filter, the long-term variation of the noise per data point was estimated and taken as the scatter (σ). (3) Using 1/σ^{2} as the statistical weight per data point, we calculated the power spectrum following ref. 37. (4) For each quarter we calculated a separate power spectrum, and subsequently we combined the power spectra for all quarters into one single spectrum using a weighted mean. The weights were given by 1/(median(power))^{2}, where the median of the power between 2 and 4 mHz was used. This serves the purpose of down-weighting power spectra for quarters with higher noise levels with respect to the others. Also, when combining several power spectra this way, we change the statistics of the power spectrum from being described by a χ^{2} distribution to approaching a normal distribution (as stated by the central limit theorem)^{38}. An example of a part of a power spectrum can be seen in the inset in Fig. 4.
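Step (4), the combination of the quarterly power spectra, can be sketched as below. The example assumes that all quarterly spectra have already been computed on one common frequency grid (in mHz), which is a simplification on our part; the function and argument names are illustrative only.

```python
import numpy as np


def combine_power_spectra(freq_mhz, spectra, band=(2.0, 4.0)):
    """Weighted mean of per-quarter power spectra.

    freq_mhz: common frequency grid in mHz, shape (n_freq,).
    spectra:  power spectra, shape (n_quarters, n_freq).
    band:     frequency interval (mHz) used to estimate the noise level.
    """
    freq = np.asarray(freq_mhz, dtype=float)
    spectra = np.asarray(spectra, dtype=float)
    in_band = (freq >= band[0]) & (freq <= band[1])
    # Noise level per quarter: median power in the chosen band.
    noise = np.median(spectra[:, in_band], axis=1)
    # Quarters with higher noise receive lower weight.
    weights = 1.0 / noise ** 2
    return np.average(spectra, axis=0, weights=weights)
```

With this weighting, a quarter whose median noise power is three times higher than another contributes nine times less to the combined spectrum.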
Extraction of large frequency separations
A clear asteroseismic signature was found in 102 of the host stars using a matched filter response function (MFR)^{39} to search for the large frequency separation. The method takes advantage of the near-regular spacing of the high-order, low-degree p-modes in the power spectrum of solar-like stars. It does this by summing the smoothed power at specific frequencies, which have been calculated from the asymptotic relation^{40} in the version:
ν_{n,ℓ} ≃ Δν (n + ℓ/2 + ɛ) − ℓ(ℓ+1) D_{0}    (3)
Here n is the radial order of the mode (related to the number of nodes in the radial direction), ℓ is the degree of the mode (the number of surface nodes), ɛ is a parameter sensitive to the near-surface layers of the star, while D_{0} is sensitive to the conditions near the core.
When summing the power at frequencies given by different values of Δν (collapsing over different values of the other parameters in expression (3)), the result is the MFR giving the summed power as a function of Δν (see ref. 39 for details). An example for the host star KIC 9414417 can be seen in Fig. 4. The large frequency separation corresponding to the most prominent peak in the MFR is then the large frequency separation of the star. The uncertainty on the large frequency separation is determined as the full width at half maximum of the peak.
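The idea behind the matched filter can be illustrated with a simplified sketch. It assumes the standard asymptotic form ν_{n,ℓ} ≃ Δν (n + ℓ/2 + ɛ) − ℓ(ℓ+1) D_{0}, an already smoothed power spectrum, and that the other parameters are collapsed by keeping the maximum summed power for each trial Δν. The collapsing rule, the parametrization of D_{0} as a fraction of Δν and all names are our assumptions; the actual procedure is described in ref. 39.

```python
import numpy as np


def asymptotic_frequencies(dnu, eps, d0, orders, degrees=(0, 1, 2)):
    """Mode frequencies from the asymptotic relation for given n and degree."""
    return np.array([dnu * (n + l / 2.0 + eps) - l * (l + 1) * d0
                     for n in orders for l in degrees])


def matched_filter_response(freq, power, dnu_grid, eps_grid,
                            d0_frac_grid, orders):
    """Summed smoothed power as a function of the trial large separation.

    freq, power:  frequency grid and smoothed power spectrum.
    dnu_grid:     trial values of the large frequency separation.
    eps_grid:     trial values of the epsilon parameter.
    d0_frac_grid: trial values of D0 expressed as a fraction of dnu.
    orders:       radial orders n included in the filter.
    """
    response = np.zeros(len(dnu_grid))
    for i, dnu in enumerate(dnu_grid):
        best = 0.0
        for eps in eps_grid:
            for d0_frac in d0_frac_grid:
                nu = asymptotic_frequencies(dnu, eps, d0_frac * dnu, orders)
                # Power outside the observed range contributes nothing.
                total = np.interp(nu, freq, power, left=0.0, right=0.0).sum()
                best = max(best, total)
        response[i] = best
    return response
```

The trial Δν with the largest response is then taken as the estimate, and the width of that peak indicates how well it is constrained.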
Grid-modelling of the host stars
We used four pipelines to determine the stellar parameters for the 102 exoplanet host stars. These were Asteroseismology Made Easy (AME)^{41}, SEEK^{42}, BAyesian STellar Algorithm (BASTA)^{12} and the Yale-Birmingham (YB)^{43,44,45} pipeline. The YB pipeline derived the properties from five different grids of stellar models, which brings us to a total of eight different grids of stellar models. These pipelines have been used extensively for asteroseismology^{9,10,11}, and further description of the pipelines can be found in the literature.
As inputs to the grid-modelling we used for each star its large frequency separation (Δν) found from asteroseismology and two spectroscopic inputs: the effective temperature (T_{eff}) and the metallicity ([Fe/H]). The values that were used for the 102 host stars can be found in Supplementary Table 1.
We chose to use the mean density and radius returned by AME and then determined the uncertainty by adding in quadrature the uncertainty returned by AME and the scatter over the values returned by the other seven grids. Three stars were too massive for the AME grid, so for these we used the median parameters from the other seven pipelines and estimated the uncertainties by adding in quadrature the median formal uncertainty and the scatter over all seven grids. We note that the parameters returned by the various pipelines were consistent.
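The way the uncertainties are combined can be written compactly. The sketch below assumes that the "scatter" is the sample standard deviation over the grid results, which the text does not specify; the function names are illustrative.

```python
import numpy as np


def adopted_uncertainty(formal_sigma, other_grid_values):
    """Formal uncertainty and grid-to-grid scatter added in quadrature."""
    scatter = np.std(np.asarray(other_grid_values, dtype=float), ddof=1)
    return float(np.hypot(formal_sigma, scatter))


def fallback_estimate(grid_values, grid_sigmas):
    """Median value and uncertainty for stars outside the reference grid.

    The uncertainty combines the median formal uncertainty with the
    scatter over the grids, again in quadrature.
    """
    values = np.asarray(grid_values, dtype=float)
    value = float(np.median(values))
    sigma = float(np.hypot(np.median(grid_sigmas), np.std(values, ddof=1)))
    return value, sigma
```

Adding the scatter in quadrature means that disagreement between grids inflates the adopted uncertainty, even when the formal uncertainty from a single pipeline is small.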
Many of the host stars in our seismic sample are present in other large host star samples with published seismic results^{9,12}, and we have compared the densities and radii obtained for our sample with these other results. We find our parameters to be fully consistent (within 1σ) with the results from ref. 9 with the exception of Kepler-22. However, this is due to the fact that we are using a very different large frequency separation, since the signal originally found^{46} is no longer thought to be the correct one (H. Kjeldsen et al. (manuscript in preparation)). When comparing the densities and radii for the host stars that we have in common with ref. 12 (32 stars), we find that all densities and 29 of the 32 radii are consistent within 1σ with the remaining three radii differing by just above 1σ, leading us to conclude that our densities and radii are in agreement with those previously determined.
Vetting of the seismic subsample of exoplanets
To do some basic vetting of our seismic subsample, we chose to limit our sample to exoplanets that had an uncertainty in radius of <20%. A large uncertainty on radius was primarily due to large uncertainties on R_{p}/R_{*}, which can be caused by grazing transits where the planet only partly covers the star. This removed five exoplanets from the sample: KOI-371.01, KOIs 2612.01 and 2612.02, KOI-3194.01 (which in addition has an impact parameter (b (ref. 47)) larger than unity) and KOI-5086.01 (also b>1). A radius cut of 30% would remove three of these targets (it would leave KOIs 2612.01 and 2612.02 in the sample). It should be noted that none of these exoplanets were situated in the hot-super-Earth desert. Instead of limiting our sample by using the uncertainty in radius, we also tried using the impact parameter (with the criterion b<1), which would remove some of the grazing transits. This removed the two exoplanets mentioned above from the asteroseismic subsample, but we opted for the stricter 20% limit on the radius uncertainty.
We also tried to vet the subsample by using asterodensity profiling^{16,48,49,50}. Here the ratio of the stellar mean densities derived from the orbit and, in our case, from grid-modelling is considered, and a value very different from unity points to either very eccentric orbits or a blend scenario (these are the two largest effects). However, it was difficult to put meaningful constraints on the density ratio since a conservative value did not eliminate any candidates and a more aggressive value would risk throwing away high-eccentricity exoplanets. Thus, we did not pursue this further.
If the cut in radius uncertainty is made at a higher value than 20%, then exoplanets from the non-seismic sample will appear in the desert. We have examined the points that appear if the cut is instead made at 30 or 40%. Using the information from the NASA Exoplanet Archive (from 1 July 2015), when we make the cut at 20%, one exoplanet from the non-seismic sample is present in the top of the desert (with its 1σ error bars easily placing it outside the desert). If we increase this value to 30%, then two additional planets enter the hot-super-Earth desert, one very close to the lower flux boundary, and one which, since we downloaded the data, has been flagged as a FP.
If the cut is instead made at 40%, then a total of 13 exoplanets occupy the region of the hot-super-Earth desert including those discussed above. Of these 13 planets, two are FPs and one is the confirmed exoplanet Kepler-319b. However, on checking the radius and flux for Kepler-319b listed in the discovery paper^{51}, it is clear that this planet is in fact situated far from the desert (with R=1.63 R_{⊕} and F=261.6 F_{⊕} (ref. 51)), which brings us down to 10 exoplanets in the desert.
We have manually inspected these 10 remaining exoplanets situated in the hot-super-Earth desert. They all orbit stars of spectral type F or G, and we find that the reason for the very uncertain exoplanet parameters is very uncertain parameters for the host stars. We find that all of them have uncertainties consistent with a location outside the desert, and that two of them are likely FPs judged on inconsistency between the stellar density derived from the transit and that derived from the stellar mass and radius (these planets are on short orbits and are thus unlikely to have large eccentricities). It is noteworthy that excluding data points with high uncertainties does not exclude, for instance, a specific stellar spectral type; it simply limits the number of bad data points in the sample.
We have plotted histograms of the relative uncertainties on the radius and incident flux, which can be seen in Fig. 5. We note that there is a clear bimodal distribution in both histograms, and that the uncertainties for the seismic subsample are lower than the typical uncertainties in the non-seismic sample. This emphasizes our point that the properties of the seismic sample are determined to a high accuracy. The bimodal distributions in relative uncertainty in flux and radius show that the non-seismic sample is divided into a ‘low’ uncertainty and a ‘high’ uncertainty population, and the division between the two populations lies at ∼30% in radius. Thus, making a cut at 20% should ensure that we are only plotting the best data points from the non-seismic sample, and we have verified that we are not cutting away a population of planets around M-dwarfs (which would have high uncertainties in radius) by doing so.
Determining the boundaries of the hot-super-Earth desert
We constrained the size of the hot-super-Earth desert by doing a bootstrap with 1 million iterations of the exoplanets present in the seismic subsample and using the boundaries that make 95.45% (2σ) of the iterations return an empty desert. To be specific, we first randomly drew 157 exoplanets with replacement from the seismic subsample. Then we assigned each of these a radius and a flux randomly selected from Gaussians centred on the parameters for the drawn exoplanet with a standard deviation equal to the uncertainty. Subsequently, we determined how many of these exoplanets were situated in the hot-super-Earth desert. This was repeated 1 million times, after which we calculated the percentage of iterations without planets in the hot-super-Earth desert (which is the observed number). We used this information to change the boundaries of the desert, and we repeated the above procedure until we had obtained the 2σ limits. This procedure does not yield unique boundaries, although they are well constrained due to the small uncertainties on the exoplanets in the sample. However, to determine the exact extent of the hot-super-Earth desert is beyond the scope of this work, and it will in addition depend on whether or not one will allow any exoplanets in the desert.
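One evaluation of the bootstrap for a given set of trial boundaries can be sketched as follows. The box is taken to be open-ended towards high flux, and the function and argument names are illustrative assumptions; the iterative adjustment of the boundaries until 95.45% of the iterations are empty is left to the caller.

```python
import numpy as np


def fraction_empty(radius, radius_err, flux, flux_err,
                   r_lo, r_hi, f_lo, n_iter=1000, seed=0):
    """Fraction of bootstrap iterations with no planet inside the trial box."""
    rng = np.random.default_rng(seed)
    radius = np.asarray(radius, dtype=float)
    radius_err = np.asarray(radius_err, dtype=float)
    flux = np.asarray(flux, dtype=float)
    flux_err = np.asarray(flux_err, dtype=float)
    n = len(radius)
    n_empty = 0
    for _ in range(n_iter):
        # Resample the planets with replacement.
        idx = rng.integers(0, n, size=n)
        # Perturb each drawn planet within its Gaussian uncertainties.
        r = rng.normal(radius[idx], radius_err[idx])
        f = rng.normal(flux[idx], flux_err[idx])
        inside = (r > r_lo) & (r < r_hi) & (f > f_lo)
        if not inside.any():
            n_empty += 1
    return n_empty / n_iter
```

In such a scheme one would shrink or grow the box and re-evaluate the function until the returned fraction reaches the 2σ level.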
The Gaussian mixture model
We have used a Gaussian mixture model (GMM), which is a probabilistic model that is the sum of a finite number of Gaussian distributions (we used the Python Scikit-Learn Gaussian Mixture Model^{52}). We used the GMM to describe the planet population in log–log radius-flux space and then applied tests to the model to assess the probability that we had detected the hot-super-Earth desert. The distribution of planets in flux and radius is expected to form a correlated log-normal distribution as an outcome of a stochastic planet formation process that produced many correlated, fractional changes in planet sizes and orbits^{18}. Thus, it is justified to use the GMM, which fits a sum of bivariate Gaussians to the data.
The two different hypotheses that we tested using the GMM and the data are the null hypothesis and the irradiated hypothesis. The null hypothesis states that the radius-flux distribution is smooth, thus that there is no hot-super-Earth desert present in the data. The irradiated hypothesis states that there is a gap in the population density and that there is an overdensity at radii lower than the gap.
We leave the number of summed normal distributions as a parameter to be determined by the data in order to allow for different formation processes, selection effects and other biases. The number of Gaussian components is determined by selecting the model with the lowest Bayesian Information Criterion (BIC). We apply the fit to three different samples: the seismic subsample of exoplanets, the non-seismic subsample and the combined sample. For each sample we use the minimum BIC to determine the number of components used in the GMM. For the seismic subsample, the typical number of components selected by the BIC is one.
The fit applied by the GMM does not treat statistical uncertainties on the data points. To ensure our tests are robust we have used a Monte Carlo approach to draw each data point from its statistical uncertainties. We generate 1,000 draws from the uncertainties and for each draw we fit the GMM, and each time we determine the number of components by selecting the lowest BIC. From each of these 1,000 models, we draw 5,000 populations and record the number of planets that occupy the gap for each. This then provides the probability distribution of planets in the gap under the null hypothesis, since we fit our model to the data under this assumption.
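A single pass of this procedure (one fit followed by the drawing of simulated populations) can be sketched with the current scikit-learn interface. The class `GaussianMixture` is the present-day counterpart of the Gaussian mixture class available at the time of the analysis, so this is an illustration rather than the original code. The column order (log flux, log radius), the maximum number of components and the function names are assumptions. The outer Monte Carlo loop over the measurement uncertainties would wrap these calls.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def fit_gmm_min_bic(log_points, max_components=5, seed=0):
    """Fit GMMs with 1..max_components and keep the one with the lowest BIC.

    log_points: array of shape (n_planets, 2) with log10(flux), log10(radius).
    """
    best_bic, best_model = np.inf, None
    for k in range(1, max_components + 1):
        model = GaussianMixture(n_components=k, covariance_type="full",
                                random_state=seed).fit(log_points)
        bic = model.bic(log_points)
        if bic < best_bic:
            best_bic, best_model = bic, model
    return best_model


def counts_in_desert(model, n_planets, n_populations,
                     r_lo=2.2, r_hi=3.8, f_lo=650.0, seed=0):
    """Number of planets in the desert for each simulated population."""
    sample, _ = model.sample(n_planets * n_populations)
    # sample() returns points grouped by component, so shuffle the rows
    # before splitting them into independent populations.
    np.random.default_rng(seed).shuffle(sample)
    flux = 10.0 ** sample[:, 0].reshape(n_populations, n_planets)
    radius = 10.0 ** sample[:, 1].reshape(n_populations, n_planets)
    inside = (radius > r_lo) & (radius < r_hi) & (flux > f_lo)
    return inside.sum(axis=1)
```

The fraction of simulated populations with a count of zero then gives an estimate of the probability of observing an empty desert under the null hypothesis.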
Figure 6 shows examples of the real data together with simulated samples drawn from the fit. We artificially injected a hot-super-Earth desert (2.2≤R_{p}/R_{⊕}≤3.8 and F≥650 F_{⊕}) into the drawn samples by subtracting 2.7 R_{⊕} from the planetary radius if the planet fell within the desert. While somewhat crude, this introduces the gap and an overdensity below the gap.
In the seismic sample, no planets are observed in the hot-super-Earth desert. Figure 3 shows the probability distribution of planets expected in the hot-super-Earth desert under the null hypothesis together with the observed value (0±0.04). The uncertainty comes from the small chance that a system actually occupies the gap due to the uncertainties on the planetary radius and flux. Furthermore, we show the expected population distribution below the desert (0.4≤R_{p}/R_{⊕}≤2.2 and F≥650 F_{⊕}), also with the observed value (17±0.7). The probability of observing no planets in the hot-super-Earth desert in the seismic sample given the null hypothesis is p=0.4%, which is sufficiently small that we reject the null hypothesis. We observe a slight overdensity in the planet population below the desert, but this is very weak and not statistically significant.
We repeat the analysis with the non-seismic and the combined samples. For the non-seismic sample we find the probability of observing the data in the desert under the null hypothesis is p=8%, which supports the rejection of the null hypothesis but is not significant under the typical requirements of either p<5% or p<1%. For the combined data we find a small improvement with p=0.3%, which is clearly dominated by the seismic sample.
We checked our method using the simulated data with and without a gap. We found results that were consistent with those reported here for the real data. It should be noted, specifically, that in the simulated-gap seismic sample, we consistently found we could reject the null hypothesis of no gap, while we typically did not confirm an overdensity below the gap.
Debiasing the seismic subsample
In an attempt to account for detection biases in our seismic subsample, we debiased the sample following the approach described by ref. 9. First, we determined the minimum planetary radius that should be detectable for a given host star^{8}:
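The expression itself is not reproduced here. A form consistent with the symbols defined in the next paragraph can be sketched under two assumptions: that the transit depth equals (R_{p}/R_{⋆})^{2}, and that the noise averages down as the inverse square root of the total in-transit time measured in units of the 6 h σ_{CDPP} baseline. The stellar radius R_{⋆} is a symbol introduced for this sketch.

```latex
\mathrm{SNR} = \frac{(R_{p}/R_{\star})^{2}}{\sigma_{\mathrm{CDPP}}}
               \sqrt{\frac{n_{\mathrm{tr}}\, t_{\mathrm{dur}}}{6\,\mathrm{h}}}
\quad\Longrightarrow\quad
R_{\mathrm{min}} = R_{\star}\,
  \sqrt{\mathrm{SNR}_{\mathrm{lim}}\,\sigma_{\mathrm{CDPP}}}\,
  \left(\frac{6\,\mathrm{h}}{n_{\mathrm{tr}}\, t_{\mathrm{dur}}}\right)^{1/4}
```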
Here σ_{CDPP} is the 6 h Combined Differential Photometric Precision^{53}, SNR_{lim} the required signal-to-noise ratio (SNR), n_{tr} the number of observed transits and t_{dur} the duration of a transit. We chose a SNR threshold of 10 (ref. 8), and for each exoplanet in the sample we estimated R_{min} by using the median 6 h σ_{CDPP} over all observed quarters (obtained from the Mikulski Archive for Space Telescopes, MAST, https://archive.stsci.edu/kepler/, accessed on 8 July 2015), the transit durations from NASA’s Exoplanet Archive’s cumulative KOI list (http://exoplanetarchive.ipac.caltech.edu/cgi-bin/TblView/nph-tblView?app=ExoTbls&config=cumulative, accessed on 8 July 2015), and crudely estimating the number of observed transits by dividing the total lifetime of Kepler (around 1470 days) by the period of the exoplanet.
After having calculated R_{min} for all exoplanets in the seismic sample, we found for a range of exoplanet radii (R_{x}) the number of exoplanets fulfilling the inequality R_{min}<R_{x}<R_{p}. Finally, our debiased sample consists of the exoplanets that fulfil the inequality R_{min}<1.4 R_{⊕}<R_{p}, where 1.4 R_{⊕} was the value of R_{x} that returned the maximum number of exoplanets. The debiased sample can be seen in Fig. 7 along with the debiased non-seismic sample (for illustration) using the R_{x} determined from the seismic subsample. It can be seen that the desert is still evident (also, see below).
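The selection of R_{x} described above amounts to a one-dimensional search. The sketch below illustrates it on a toy list of (R_{min}, R_{p}) pairs; the data, the grid of trial radii and the function names are illustrative assumptions, not values from this work.

```python
def count_detectable(planets, r_x):
    """Number of planets fulfilling R_min < r_x < R_p."""
    return sum(1 for r_min, r_p in planets if r_min < r_x < r_p)

def best_threshold(planets, candidates):
    """Trial radius returning the most planets (first one in case of ties)."""
    return max(candidates, key=lambda r_x: count_detectable(planets, r_x))

def debias(planets, r_x):
    """Keep only the planets fulfilling R_min < r_x < R_p."""
    return [(r_min, r_p) for r_min, r_p in planets if r_min < r_x < r_p]

# Toy (R_min, R_p) pairs in Earth radii, not real data.
toy = [(0.8, 1.6), (1.0, 2.5), (1.2, 1.3), (1.5, 4.0), (0.5, 1.0), (1.3, 3.0)]
grid = [round(0.1 * i, 1) for i in range(5, 41)]   # trial R_x from 0.5 to 4.0

r_x = best_threshold(toy, grid)
sample = debias(toy, r_x)
```

Every planet kept in `sample` is larger than `r_x` and orbits a star around which a planet of size `r_x` would have been detectable, which is what makes the retained sample comparable across host stars.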
Further tests
As well as using the Gaussian mixture model to assess the hot-super-Earth desert, we also took another approach to verify the significance of the missing data points. We did this by first dividing the seismic subsample of exoplanets in two groups, with F/F_{⊕}>650 and F/F_{⊕}<650. We then used the exoplanet radius distribution for the F/F_{⊕}<650 sample to generate a sample with randomly selected exoplanet radii for a number of exoplanets corresponding to the number of exoplanets with F/F_{⊕}>650. Afterwards we determined how many of the selected exoplanets had a radius between 2.2 and 3.8 R_{⊕}. This simulation was repeated 10^{7} times, and we determined the number of times we found the same number of exoplanets as we observe in the hot-super-Earth desert (zero), in this radius range. We found this to happen in only 8 of the 10 million simulations.
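A minimal sketch of this resampling test is given below. It assumes that radii are drawn with replacement from the low-flux sample, and it uses a toy radius list and far fewer repetitions than the 10^{7} quoted above; the function names are hypothetical. With 8 empty outcomes in 10^{7} simulations, the corresponding fraction is 8×10^{-7}.

```python
import random

GAP = (2.2, 3.8)   # desert radius limits in Earth radii, from the text

def fraction_empty(low_flux_radii, n_hot, n_sims, rng, gap=GAP):
    """Fraction of simulations in which no drawn radius falls in the gap."""
    empty = 0
    for _ in range(n_sims):
        drawn = rng.choices(low_flux_radii, k=n_hot)   # with replacement
        if not any(gap[0] < r < gap[1] for r in drawn):
            empty += 1
    return empty / n_sims

def analytic_fraction_empty(low_flux_radii, n_hot, gap=GAP):
    """Closed-form equivalent: (1 - f)^n_hot, f = gap fraction at low flux."""
    in_gap = sum(1 for r in low_flux_radii if gap[0] < r < gap[1])
    f = in_gap / len(low_flux_radii)
    return (1.0 - f) ** n_hot

# Toy low-flux radii (Earth radii), not real data: 3 of 10 lie in the gap.
toy_low_flux = [0.9, 1.2, 1.5, 1.8, 2.0, 2.5, 3.0, 3.5, 4.5, 6.0]
rng = random.Random(7)

simulated = fraction_empty(toy_low_flux, n_hot=5, n_sims=20000, rng=rng)
expected = analytic_fraction_empty(toy_low_flux, n_hot=5)
```

Under the with-replacement assumption the simulated fraction converges to the closed-form value, which offers a quick consistency check on the simulation.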
This test was repeated to measure the effect of FPs on the detection. This was done by randomly removing points from the seismic subsample according to the percentages given by ref. 24 and then repeating the above analysis. Of course this approach does not take into consideration any non-uniformity of the FP-rate with flux (for instance there are more eclipsing binaries at higher incident flux^{23}), but on the other hand we do not compensate for the FP-rate for multi-planet systems being lower than for single-planet systems^{23}, or that we have many confirmed exoplanets in the sample (for which the FP-rate should be essentially zero). Thus, this approach should give a fairly conservative estimate of the effect of FPs on the hot-super-Earth desert. We find that 39 of our 10 million simulations return zero exoplanets in the radius region of the desert, meaning that the presence of FPs in our sample does not significantly affect the detection of the hot-super-Earth desert.
To assess the importance of potential systematic errors on the detection of the desert, we investigated the impact on the incident flux of a 100 K temperature offset and also of a non-zero eccentricity of e=0.5. We find that the effect of both of these changes is of the same magnitude, and using this test we determined that they have no impact on the detection of the hot-super-Earth desert.
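The size of these two effects can be estimated with a short calculation. The sketch below rests on assumptions not stated in the text: that the incident flux scales as T_{eff}^{4} at fixed stellar radius and orbit, that the relevant quantity for an eccentric orbit is the orbit-averaged flux, and that the host star has an illustrative effective temperature of 6,000 K.

```python
import math

def flux_ratio_from_temperature(t_eff, offset):
    """Flux change for a T_eff offset, using F proportional to T_eff^4."""
    return ((t_eff + offset) / t_eff) ** 4

def flux_ratio_from_eccentricity(e):
    """Orbit-averaged flux relative to a circular orbit of equal semi-major axis."""
    return 1.0 / math.sqrt(1.0 - e ** 2)

# Illustrative host star temperature (assumption), offset and e from the text.
ratio_t = flux_ratio_from_temperature(6000.0, 100.0)
ratio_e = flux_ratio_from_eccentricity(0.5)
```

Under these assumptions the temperature offset raises the flux by roughly 7% and the eccentricity by roughly 15%, both small compared with the factor-of-several range in flux spanned by the planets near the 650 F_{⊕} boundary.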
We also used the test on the debiased sample (seen in Fig. 7), and here 54 of the 10 million simulations returned the same number of exoplanets in the desert as we observe; thus the detection of the hot-super-Earth desert is not greatly changed by using the debiased sample instead. We performed a bootstrap on the debiased sample similarly to what was done for the full sample. Here we found that the boundaries of the hot-super-Earth desert that we determined from bootstrapping the full sample (2.2<R_{p}/R_{⊕}<3.8 and F>650 F_{⊕}) are stronger when considering the debiased sample. They change from being 2σ limits to being just above 3.5σ, meaning that >99.95% of the 1 million simulations left the hot-super-Earth desert empty (380 yielded one planet).
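The correspondence between the bootstrap counts and the quoted significance can be checked with a few lines. The sketch assumes a two-sided Gaussian convention for converting fractions to σ levels and that the 380 simulations with one planet are the only ones that did not leave the desert empty.

```python
import math

def two_sided_fraction(n_sigma):
    """Fraction of a Gaussian distribution within +/- n_sigma of the mean."""
    return math.erf(n_sigma / math.sqrt(2.0))

n_sims = 1_000_000
n_non_empty = 380                          # simulations yielding one planet
empty_fraction = 1.0 - n_non_empty / n_sims

threshold_3p5 = two_sided_fraction(3.5)    # about 0.99953
threshold_2 = two_sided_fraction(2.0)      # about 0.9545
```

The empty fraction of 99.962% exceeds the 3.5σ threshold of about 99.953%, in line with the statement that the limits are just above 3.5σ.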
Additional information
How to cite this article: Lundkvist, M.S. et al. Hot super-Earths stripped by their host stars. Nat. Commun. 7:11201 doi: 10.1038/ncomms11201 (2016).
References
1. Lopez, E. D. & Fortney, J. J. The role of core mass in controlling evaporation: the Kepler radius distribution and the Kepler-36 density dichotomy. Astrophys. J. 776, 2 (2013).
2. Owen, J. E. & Wu, Y. Kepler planets: a tale of evaporation. Astrophys. J. 775, 105 (2013).
3. Owen, J. E. & Wu, Y. Atmospheres of low-mass planets: the 'Boil-off'. Astrophys. J. 817, 107 (2016).
4. Kurokawa, H. & Nakamoto, T. Mass-loss evolution of close-in exoplanets: evaporation of hot jupiters and the effect on population. Astrophys. J. 783, 54 (2014).
5. Szabó, G. M. & Kiss, L. L. A short-period censor of sub-jupiter mass exoplanets with low density. Astrophys. J. 727, L44 (2011).
6. Beaugé, C. & Nesvorný, D. Emerging trends in a period-radius distribution of close-in planets. Astrophys. J. 763, 12 (2013).
7. Rowe, J. F. et al. Planetary candidates observed by Kepler. V. Planet sample from Q1-Q12 (36 months). Astrophys. J. Suppl. 217, 16 (2015).
8. Howard, A. W. et al. Planet occurrence within 0.25 AU of solar-type stars from Kepler. Astrophys. J. Suppl. 201, 15 (2012).
9. Huber, D. et al. Fundamental properties of Kepler planet-candidate host stars using asteroseismology. Astrophys. J. 767, 127 (2013).
10. Chaplin, W. J. et al. Asteroseismic fundamental properties of solar-type stars observed by the NASA Kepler mission. Astrophys. J. Suppl. 210, 1 (2014).
11. Campante, T. L. et al. An ancient extrasolar system with five sub-Earth-size planets. Astrophys. J. 799, 170 (2015).
12. Silva Aguirre, V. et al. Ages and fundamental properties of Kepler exoplanet host stars from asteroseismology. Mon. Not. R. Astron. Soc. 452, 2127–2148 (2015).
13. Borucki, W. J. et al. Characteristics of planetary candidates observed by Kepler. II. Analysis of the first four months of data. Astrophys. J. 736, 19 (2011).
14. Batalha, N. M. et al. Planetary candidates observed by Kepler. III. Analysis of the first 16 months of data. Astrophys. J. Suppl. 204, 24 (2013).
15. Mullally, F. et al. Planetary candidates observed by Kepler. VI. Planet sample from Q1-Q16 (47 months). Astrophys. J. Suppl. 217, 31 (2015).
16. Van Eylen, V. & Albrecht, S. Eccentricity from transit photometry: small planets in Kepler multi-planet systems have low eccentricities. Astrophys. J. 808, 126 (2015).
17. Morton, T. D. & Swift, J. The radius distribution of planets around cool stars. Astrophys. J. 791, 10 (2014).
18. Farr, W. M., Mandel, I., Aldridge, C. & Stroud, K. The occurrence of earth-like planets around other stars. Preprint at http://arxiv.org/abs/1412.4849 (2014).
19. Latham, D. W. et al. A first comparison of Kepler Planet candidates in single and multiple systems. Astrophys. J. 732, L24 (2011).
20. Petigura, E. A., Marcy, G. W. & Howard, A. W. A plateau in the planet population below twice the size of Earth. Astrophys. J. 770, 69 (2013).
21. Wolfgang, A. & Lopez, E. How rocky are they? The composition distribution of Kepler's sub-neptune planet candidates within 0.15 AU. Astrophys. J. 806, 183 (2015).
22. Désert, J.-M. et al. Low false positive rate of Kepler candidates estimated from a combination of Spitzer and follow-up observations. Astrophys. J. 804, 59 (2015).
23. Lissauer, J. J. et al. Validation of Kepler's multiple planet candidates. II. Refined statistical framework and descriptions of systems of special interest. Astrophys. J. 784, 44 (2014).
24. Fressin, F. et al. The false positive rate of Kepler and the occurrence of planets. Astrophys. J. 766, 81 (2013).
25. Borucki, W. J. et al. Kepler-4b: a hot neptune-like planet of a G0 star near main-sequence turnoff. Astrophys. J. 713, L126–L130 (2010).
26. Huber, D. et al. Stellar spin-orbit misalignment in a multiplanet system. Science 342, 331–334 (2013).
27. Rappaport, S. et al. Possible disintegrating short-period super-mercury orbiting KIC 12557548. Astrophys. J. 752, 1 (2012).
28. Rappaport, S. et al. KOI-2700b - a planet candidate with dusty effluents on a 22 hr orbit. Astrophys. J. 784, 40 (2014).
29. Colón, K. D., Morehead, R. C. & Ford, E. B. Vetting Kepler planet candidates in the sub-Jovian desert with multi-band photometry. Mon. Not. R. Astron. Soc. 452, 3001–3009 (2015).
30. Lopez, E. D. & Fortney, J. J. Understanding the mass-radius relation for sub-neptunes: radius as a proxy for composition. Astrophys. J. 792, 1 (2014).
31. Rogers, L. A. Most 1.6 Earth-radius planets are not rocky. Astrophys. J. 801, 41 (2015).
32. Essick, R. & Weinberg, N. N. Orbital decay of hot jupiters due to nonlinear tidal dissipation within solar-type hosts. Astrophys. J. 816, 18 (2016).
33. Fabrycky, D. & Tremaine, S. Shrinking binary and planetary orbits by Kozai cycles with tidal friction. Astrophys. J. 669, 1298–1315 (2007).
34. Nagasawa, M., Ida, S. & Bessho, T. Formation of hot planets by a combination of planet scattering, tidal circularization, and the Kozai mechanism. Astrophys. J. 678, 498–508 (2008).
35. Chaplin, W. J. & Miglio, A. Asteroseismology of solar-type and red-giant stars. Annu. Rev. Astron. Astrophys. 51, 353–392 (2013).
36. Chaplin, W. J. et al. Predicting the detectability of oscillations in solar-type stars observed by Kepler. Astrophys. J. 732, 54 (2011).
37. Frandsen, S. et al. CCD photometry of the δ Scuti star κ^{2} Bootis. Astron. Astrophys. 301, 123 (1995).
38. Appourchaux, T. On maximum likelihood estimation of averaged power spectra. Astron. Astrophys. 412, 903–904 (2003).
39. Gilliland, R. L. et al. Asteroseismology of the transiting exoplanet host HD 17156 with Hubble Space Telescope fine guidance sensor. Astrophys. J. 726, 2 (2011).
40. Tassoul, M. Asymptotic approximations for stellar nonradial pulsations. Astrophys. J. Suppl. 43, 469–490 (1980).
41. Lundkvist, M., Kjeldsen, H. & Silva Aguirre, V. AME - Asteroseismology made easy. Estimating stellar properties by using scaled models. Astron. Astrophys. 566, A82 (2014).
42. Quirion, P.-O., Christensen-Dalsgaard, J. & Arentoft, T. Automatic determination of stellar parameters via asteroseismology of stochastically oscillating stars: comparison with direct measurements. Astrophys. J. 725, 2176–2189 (2010).
43. Basu, S., Chaplin, W. J. & Elsworth, Y. Determination of stellar radii from asteroseismic data. Astrophys. J. 710, 1596–1609 (2010).
44. Gai, N., Basu, S., Chaplin, W. J. & Elsworth, Y. An in-depth study of grid-based asteroseismic analysis. Astrophys. J. 730, 63 (2011).
45. Basu, S., Verner, G. A., Chaplin, W. J. & Elsworth, Y. Effect of uncertainties in stellar model parameters on estimated masses and radii of single stars. Astrophys. J. 746, 76 (2012).
46. Borucki, W. J. et al. Kepler-22b: A 2.4 Earth-radius planet in the habitable zone of a sun-like star. Astrophys. J. 745, 120 (2012).
47. Seager, S. & Mallén-Ornelas, G. A unique solution of planet and star parameters from an extrasolar planet transit light curve. Astrophys. J. 585, 1038–1055 (2003).
48. Tingley, B., Bonomo, A. S. & Deeg, H. J. Using stellar densities to evaluate transiting exoplanetary candidates. Astrophys. J. 726, 112 (2011).
49. Kipping, D. M. Characterizing distant worlds with asterodensity profiling. Mon. Not. R. Astron. Soc. 440, 2164–2184 (2014).
50. Sliski, D. H. & Kipping, D. M. A high false positive rate for Kepler planetary candidates of giant stars using asterodensity profiling. Astrophys. J. 788, 148 (2014).
51. Rowe, J. F. et al. Validation of Kepler's multiple planet candidates. III. Light curve analysis and announcement of hundreds of new multi-planet systems. Astrophys. J. 784, 45 (2014).
52. Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
53. Christiansen, J. L. et al. The derivation, properties, and value of Kepler's combined differential photometric precision. Publ. Astron. Soc. Pac. 124, 1279–1287 (2012).
Acknowledgements
Funding for the Stellar Astrophysics Centre is provided by The Danish National Research Foundation (Grant agreement no.: DNRF106). The research is supported by the ASTERISK project (ASTERoseismic Investigations with SONG and Kepler) funded by the European Research Council (Grant agreement no.: 267864). This research has made use of NASA's Astrophysics Data System and the NASA Exoplanet Archive, which is operated by the California Institute of Technology, under contract with the National Aeronautics and Space Administration under the Exoplanet Exploration Program.
Author information
Contributions
M.S.L. led the work, did grid modelling, computed the exoplanet parameters, did the bootstrapping, made the debiased sample and wrote the manuscript. H.K. determined the large frequency separations and performed simulations to assess the significance of the hot-super-Earth desert. M.S.L. and H.K. also designed the project and inspected the exoplanets in the non-seismic sample that fell in the hot-super-Earth desert. S.A. provided feedback on the radius-flux diagram and helped with the structure of the manuscript. G.R.D. ran the Gaussian mixture model and helped with the manuscript. V.V.E. gave feedback on the radius-flux diagram and checked the light curve of KOI 4198.01. D.H. helped with the structure of the manuscript. S.B., C.K. and V.S.A. did grid modelling. C.V. did an independent analysis of some of the Kepler data. A.B.J. investigated the light curve of Kepler-4b. All authors participated in the interpretation of the results and commented on the manuscript.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Supplementary information
Supplementary Table 1 and Supplementary References.
Rights and permissions
This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/