Prediction of the strength and timing of sunspot cycle 25 reveal decadal-scale space environmental conditions

The Sun’s activity cycle governs the radiation, particle and magnetic flux in the heliosphere creating hazardous space weather. Decadal-scale variations define space climate and force the Earth’s atmosphere. However, predicting the solar cycle is challenging. Current understanding indicates a short window for prediction best achieved at previous cycle minima. Utilizing magnetic field evolution models for the Sun’s surface and interior we perform the first century-scale, data-driven simulations of solar activity and present a scheme for extending the prediction window to a decade. Our ensemble forecast indicates cycle 25 would be similar or slightly stronger than the current cycle and peak around 2024. Sunspot cycle 25 may thus reverse the substantial weakening trend in solar activity which has led to speculation of an imminent Maunder-like grand minimum and cooling global climate. Our simulations demonstrate fluctuation in the tilt angle distribution of sunspots is the dominant mechanism responsible for solar cycle variability.

U nderstanding magnetic field generation in the Sun and stars is an outstanding challenge in astrophysics. Theoretical advances in the context of the solar cycle provide a window to the magnetic Universe on the one hand, and on the other, benefits the quest for predicting space weather and climate. The 11-year cycle of sunspots spawn severe space weather characterized by solar flares, coronal mass ejections, geomagnetic storms, enhanced radiative, and energetic particle flux endangering satellites, global communication systems, air-traffic over polar routes, and electric power grids 1 . Protection of planetary technologies and space situational awareness is therefore enabled by solar activity predictions. Slow long-term changes in the Sun's radiative energy output-which is governed by its magnetic activity-is the primary (external) natural driver of planetary atmospheric dynamics, including climate. Assessment of the strength of future sunspot cycles and its expected radiative output provides critical inputs to climate assessment models 2, 3 .
Sunspots have been observed for over four centuries, constituting the longest running, continuous time series of any natural phenomena in the Universe. However, the fact that they are magnetic in nature and their time-variation a manifestation of an underlying magnetic cycle, has only been known since the beginning of 20th century when G.E. Hale and his collaborators discovered that sunspots are strongly magnetized 4 . They also discovered that sunspots typically appear in pairs with a leading (in the direction of solar rotation) and a following polarity of opposite magnetic signs 5 . For an individual sunspot cycle, the leading polarities of these bipolar magnetic regions (BMRs) have opposite signs in the two hemispheres. This relative polarity orientation flips from one sunspot cycle to another generating a 22-year magnetic cycle. The amplitude of the sunspot cycle itself -which determines its space weather consequences-is highly variable and difficult to predict.
The sunspot cycle is understood to originate via a magnetohydrodynamic (MHD) dynamo mechanism 6 involving complex interactions between plasma flows and magnetic fields in the solar convection zone (SCZ). Differential rotation in the solar interior stretches the large-scale poloidal component of the Sun's magnetic field in the ϕ-direction to produce the toroidal component 7 . Strong toroidal flux tubes rise through the SCZ due to magnetic buoyancy and appear as BMRs on the Sun's surface-the photosphere. These BMRs are tilted because of the action of the Coriolis force on rising magnetic flux tubes and there is a dispersion observed around the mean tilt which is thought to be due to the random buffeting of the rising flux tubes by turbulent convection. It is now believed that the dispersal and decay of these tilted BMRs-facilitated by surface flux transport processes -is the predominant mechanism for the regeneration of the Sun's poloidal component [8][9][10][11] . The latter in turn seeds the generation of the next cycle toroidal component, thereby, sustaining the solar magnetic cycle. An alternate mechanism termed as the mean-field α-effect driven by the action of helical turbulent convection on weaker toroidal fields is also thought to contribute to the poloidal field creation process 6 . Fluctuations in these poloidal field creation mechanism(s)-due to the turbulent nature of the SCZ-are likely candidates for governing solar activity variations.
Appropriately constrained computational solar dynamo models, driven by observations, are expected to serve as useful tools for forecasting the sunspot cycle. However, this has remained a challenging task and it has been argued that long-term solar cycle forecasts are not possible 12 . Indeed multiple forecasts were made for the current solar cycle 24 with little consensus 13 and two solar dynamo-based forecasts for cycle 24 differed significantly from each other [14][15][16] . In this backdrop, recent progress focussed on understanding the physics of solar cycle predictability has underscored the importance of plasma flux transport processes in governing the underlying dynamical memory leading to solar cycle predictability, reconciled the difference between diverging dynamo-based forecasts for the current cycle and indicated that the predictive window based on dynamo models alone, is, in fact, short [17][18][19] . These and other studies [20][21][22][23] indicate that prediction of the strength of the next sunspot cycle is indeed plausible, and is best achieved with accurate knowledge (i.e., observational input) of the solar polar (poloidal) field proxy at the preceding cycle minimum, i.e., only about 5 years in advance.
Can we extend this prediction window further? Here we demonstrate that this is viable. We devise a novel methodology, wherein, we first predict the strength of the Sun's polar field at cycle minimum (in advance) and then utilize this as input in a predictive dynamo model to forecast the strength and timing of the next sunspot cycle thereby extending the prediction window to close to a decade.
The transport and dissipation of photospheric magnetic fields that lead to solar polar field reversal and build-up-peaking at the solar minimum-is a complex process. The large-scale behavior of the surface magnetic field was first explained by Leighton 9 who suggested that the magnetic field associated with BMRs diffuses due to a random-walk-like movement of the supergranular convective cells. This diffusion results in flux cancelation along the equator between the leading polarities of BMRs belonging to different hemispheres. As an outcome, there is an imbalance of signed magnetic flux in each solar hemisphere. This excess flux from the following polarities eventually migrate towards the poles and cancels and reverses the old solar cycle polarity. This polar field is, in fact, the radial component of the Sun's poloidal field. Differential rotation and a large-scale flow of plasma from the Sun's equator to the poles known as meridional circulation play crucial roles in this process. This understanding has led to the development of solar surface flux transport (SFT) models which can reasonably simulate the surface dynamics of solar magnetic fields [24][25][26][27][28][29][30] . Earlier simulations with such models indicate a polar field strength at cycle 24 minimum which is weaker or comparable to the previous cycle minimum 31,32 .
We have developed a data-driven SFT model to capture solar surface magnetic field dynamics over the last century. For a description of this model see the Methods section. We extract the polar field information from this SFT model at every cycle minimum and utilize this as an input in a solar dynamo model to simulate the century-scale evolution of the sunspot-forming toroidal field component. The century-scale calibrated simulation, which is able to successfully reproduce solar activity over the past century is then used to predict the maximum (strength) of sunspot cycle 25 and its timing. Furthermore, we perform ensemble runs with the expected level of fluctuations in the governing parameters of the solar cycle to generate a predicted range for cycle 25.

Results
Data-driven century-scale surface flux transport simulations. Using the observed BMR emergence statistics we perform a continuous century-scale, data-driven simulation with our SFT model covering the period 1913-2016, i.e., from solar cycle 15 to the current cycle 24. The most important observed BMR statistics pertain to the flux, tilt angle, location and timing of the emergence of BMRs on the solar surface (details are available in the Methods section).
In Fig. 1, we plot the longitudinally averaged radial magnetic field B r ðR ; λ; ϕ; tÞ as a function of latitude and time to generate the solar butterfly diagram corresponding to our simulation. Surface flux transport dynamics leading to solar polar field reversal and build-up is clearly evident. The top panel of Fig. 2a depicts the time evolution of total unsigned flux [Φ k , calculated by using Eq. (8) in Methods section] associated with the observed sunspot emergence statistics on the solar surface, which is used as input to drive the SFT simulation.
Quantitative comparison between our simulated polar field and observations is achieved through estimates of the simulated polar flux. We calculate the simulated polar flux Φ N=S p by integrating the radial magnetic field around the polar cap (extending from ± 70°to ±90°) in both hemispheres [see Eq. (9) in Methods section]. During most of the period covered by the simulation, however, we do not have any data on how the magnetic field was spatially distributed-precluding a direct estimate of the observed polarcap flux. Observational magnetogram data is only available from 1975 onwards. Therefore, we rely on polar flux measurements obtained from MWO calibrated polar faculae data from 1906 onward 33 for comparing our simulations to observation. The MWO polar flux data ends in 2014.5. Therefore we utilize polar field data provided by the Wilcox Solar Observatory (WSO) for the period beyond 2014.5.
The century-scale simulation has to be initiated with a dipole moment. Given the unavailability of historical observation of dipole moment and the uncertainty in the polar faculae measurement, we vary the initial dipole moment strength by ±25% and conduct multiple (continuous) 100 years runs. We select the simulation for which the correlation between the observed and simulated polar flux is maximum as our calibrated, standard simulation. Figure 2b compares the time evolution of the simulated Φ N=S p and observed polar flux for this calibrated simulation. As expected, the choice of the (arbitrary) initial magnetic field affects the polar flux generated early in the simulation (resulting in a disparity with observations at the end of cycle 15). Otherwise, the simulated polar flux is in good agreement with observations (within error bars) for most subsequent cycles. Linear correlation analysis between the simulated and observed polar flux amplitude of the two solar hemispheres at cycle minima gives a Pearson's correlation coefficient of 0.88 at a 99.99% confidence level. We note that over the past century, the only significant anomaly between observations and our simulation-based reconstruction is for the minimum of cycle 18. Exclusion of the northern and southern hemispheric polar flux values at cycle 18 minimum results in a Pearson's correlation coefficient of 0.95 (with 99.99% confidence level). This analysis is based on simulations up to September 2016 until which period observed sunspot emergence statistics were utilized to drive our simulations.
To forward run the SFT model from the epoch when observed sunspot data inputs were stopped (i.e., September 2016, marked by the vertical dashed line in Fig. 2), synthetic input profiles are used to model the decaying phase of cycle 24 up to the end of 2019. We rely on various statistical properties of sunspots for modeling the synthetic profile to represent the plausible solar activity till the expected minimum (see Methods section for a detailed description). Once a synthetic profile is constructed, we simulate the last 3.25 years of cycle 24 using this input profile. The solid blue line in Fig. 2a represents a synthetic input profile that best fits the preceding phase of cycle 24. We use the calibrated simulation to forward run the model to predict the future evolution of the Sun's polar flux utilizing this synthetic input. We obtain a predicted polar flux value of 6.91 × 10 21 maxwells in the northern hemisphere and −8.73 × 10 21 maxwells in the southern hemisphere at the end of cycle 24.
A comparison of our predicted polar flux at cycle 24 minimum relative to previous cycles presented in Fig. 2b indicates that the south polar flux at the upcoming minimum of cycle 24 is likely to be stronger than the previous minimum while the north polar flux may not be significantly different from the previous minimum.
Ensemble forecast of the Sun's polar field. Towards generating an ensemble forecast to ascertain the range (uncertainty) around our standard run based polar field prediction, we first simulate the decaying phase of cycle 24 with thirty-three other realizations of the sunspot input profile (represented by the set of green curves in Fig. 2a). Among these synthetic profiles, 24 profiles are constructed based on varying the amount of total flux associated with the cycle by ±30% around the mean. Five additional  We note that deviations between the simulated and observed polar flux may arise due to assumptions that are necessary in order to perform such long-term simulations, especially where observational constraints are limited. We have assumed all sunspots appearing on the solar surface follow mean statistical properties and are ideal BMRs (which fall under β-type configuration of sunspots). However, other types of magnetic configurations are possible: α, γ, δ, or a combination of these. We also do not incorporate any scatter in the tilt angle distribution of sunspots in our model. Detailed studies of BMR tilt angle distribution 10,34,35 have established Joy's law, but also found a scatter of individual tilt angles about the mean. Studies of the effect of tilt angle scatter on the polar field and dipole moment evolution 36,37 have established that large sunspots with a large scatter in tilt angle leave a notable imprint on polar field amplitude. Furthermore, a large individual sunspot with an orientation that is opposite to what is expected in a particular hemisphere for a particular cycle (i.e., a non-Hale region) appearing at lower latitudes can cause a significant decrease in polar field strength 38 . Fluctuations in the meridional circulation may also impact polar field amplitude 39 .
To further examine the impact of these irregularities in the sunspot cycle on our prediction range we consider some probable scenarios in keeping with the philosophy of our ensemble forecast. To explore the effect of the occurrence of non-Hale regions in the last 3.25 years of cycle 24, we introduce 10-20 non-Hale BMRs, which constitute about 3-6% of the total flux associated with the input profile. This results in a 3-5% decrease in the final polar flux value calculated during cycle 24 minimum. Introducing ±30% fluctuations in the peak speed of the meridional circulation during the last 3.25 years of cycle 24 results in ±6.5% (on average) variation in the polar field generated at cycle minimum. We further explore the impact of introducing  Fig. 2b). We note, therefore, that the range of variation induced by incorporating tilt angle fluctuations exceed (and subsume) variations due to other processes. This extensive parameter dependence study indicates the robustness of our simulations, and the ensemble forecast provides a predicted range of solar polar field at the end of cycle 24 minimum.
Prediction of solar cycle 25 using a dynamo model. We obtain a strong correlation between the simulated polar flux (averaged over two hemispheres) at cycle minimum and the amplitude of the next cycle. We obtain a Pearson's linear correlation coefficient of 0.84 with a confidence level of 99.10%. This reiterates that the previous cycle polar flux is the best proxy for predicting the sunspot cycle. The connection between these two quantities can be explained by solar dynamo theory. The polar field originates from the poloidal component of the Sun's magnetic field whereas the sunspot cycle amplitude is governed by the strength of the toroidal component of the magnetic field. The latter, however, is generated by stretching of the poloidal component by differential rotation. Thus, the initial poloidal field seed of cycle (n − 1, say) (of which the polar field is a proxy) directly governs the strength of the toroidal field of cycle (n)-which in turn governs the strength of the associated sunspot cycle (n). Therefore, we utilize the simulated polar field from the SFT model and its corresponding poloidal field at cycle minima in a dynamo model of the solar interior to predict the amplitude of the next cycle toroidal field and hence the strength of the associated sunspot cycle.
From our data-driven SFT simulation, we first calculate the surface magnetic field (averaged over ϕ) during the minimum of each solar cycle starting from cycle 16 minimum (1934). This surface magnetic field map is assimilated (at the corresponding cycle minima) in simulations with a kinematic, axisymmetric dynamo model 40 (following calibration and processing as detailed in the Methods section). This century-scale dynamo simulation is forward run in a predictive mode to simulate solar cycle 25 with regular "correction" of the poloidal component at every cycle minimum. We note that our method of utilizing the SFT output as an input in a dynamo model only at cycle minima is distinct from a fully coupled SFT-dynamo simulation 41 and herein, the methodology is devised for predictive purposes utilizing the understanding that has been recently established [17][18][19] . In Fig. 3a, an output from our SFT simulation is presented which depicts the predicted surface distribution of the magnetic field during cycle 24 minimum. The left-hand panel in Fig. 3b represents the magnetic vector potential (corresponding to the poloidal field) in the dynamo model 20 days after the SFT derived vector potential is assimilated into the dynamo model. Using this vector potential as an input in the dynamo model, and running this forward in time, we generate the predicted shape, strength, and timing of sunspot cycle 25. Figure 3b represents the distribution of the toroidal magnetic field in the Sun's convection zone at the maximum of sunspot cycle 25.
This multi-cycle continuous solar dynamo simulation with assimilation of poloidal field maps from the data-driven surface flux transport simulation (at cycle minima) is used to simulate the sunspot cycle over century scale. The cycle strength is determined by the dynamo simulated toroidal flux eruption based on an inbuilt buoyancy algorithm which models sunspot eruptions 40,42,43 . This is compared to sunspot cycle observations in Fig. 4 (following an appropriate multi-cycle calibration as detailed in the Methods section). We recover a good correlation between the yearly averaged simulated and observed sunspot cycle amplitudes. We obtain a Pearson's linear correlation coefficient of 0.87 with a confidence level of 99.54%; exclusion of cycle 19 from the correlation analysis (whose deviation from observations is a result of the mismatch between simulated and observed polar field at cycle 18 minimum) generates an improved correlation of 0.98 with 99.99% confidence level. The significant agreement between dynamo simulations in the predictive mode and past sunspot cycle observations lays the foundations of our prediction of the toroidal component of solar cycle 25 based on our dynamo model.
Our prediction of cycle 25 is presented in Fig. 4. We accommodate the range of possibilities (uncertainty) from our ensemble forecast for the solar polar field at the minimum of cycle 24 in the following way. In addition to the standard predictive mode surface flux transport run, we select the runs corresponding to the strongest and weakest polar flux realizations (at cycle 24 minima) from our set of ensemble forecast runs to drive three distinct predictive mode dynamo runs. The magenta curve beyond 2020 (marked by a vertical black solid line) in Fig. 4 depicts the dynamo predicted evolution of the buoyantly emerging magnetic flux for the standard run, which generates our most likely prediction. The two black curves beyond 2020 depict the range of our prediction based on the extreme realizations from our ensemble forecast which is found to be (2.11-2.69) × 10 23 maxwells. Our prediction (Fig. 4) shows that sunspot cycle 25 would be similar or slightly stronger in strength relative to the current cycle 24 with the standard simulation-run (magenta curve) generating a peak amplitude of 2.29 × 10 23 maxwells. A calibration with the observed amplitude of the annually averaged sunspot number time series also yields a prediction for the strength of cycle 25 in terms of the sunspot number-which is used more often in statistical or empirical forecasts. Based on our simulations, the corresponding prediction for the yearly mean sunspot number at the maximum of solar cycle 25 is 118 with a predicted range of 109-139.
Although we did not make any specific attempts to do so, serendipitously, we find that the timing of the peak of simulated sunspot cycles matches quite well with the timing of the maxima of observed sunspot cycles to within half a year on average (Fig. 4). We believe this is due to the ability of our dynamo model to match the cycle amplitudes and an existing (observed) empirical relationship between the amplitude and the rise-time Taking advantage of this, and relying upon our dynamo forward run for solar cycle 25, we infer that the maximum of solar cycle 25 will occur around 2024(±1). The range of ±1 year also includes the uncertainty in the exact timing of cycle 24 minimum which may vary by 6 months.
In Table 1 we summarize the predicted properties of sunspot cycle 25, including its amplitude, timing, and range (uncertainty) derived from our ensemble forecast.

Discussion
In summary, we have utilized a solar surface flux transport model and a solar internal dynamo model for the first, continuous century-scale calibrated simulation of solar activity. We emphasize that no SFT or dynamo model parameters were tuned after the simulations were initialized, and the simulated variations are thus a direct result of data assimilation from the SFT model to the dynamo model, only. Except cycle 19-the strongest and most extreme cycle in the last century-our simulations reproduce past solar activity peaks (and their relative variations) quite well. We note that the "floor" in the dynamo simulated toroidal field activity during minimum phases does not reach the observed lows; however, this has no impact on the peak cycle strengths or their timing-which is our focus here.
We emphasize that the excellent agreement between surface flux transport enabled simulations of solar activity and the strength of sunspot cycles over the past century, lends strong, independent support to the emerging view that the Babcock-Leighton mechanism 8,9 is currently the dominant solar poloidal field creation mechanism in the Sun 10,11,23 . Our simulations indicate that no other mechanism of variability is necessary to explain the observed variability in the solar cycle over the last 100 years. However, we do find that it is important to maintain a (non-varying) source or seed of weak magnetic field at all times (see Methods section), which has been found necessary in independent studies 40,45 and indeed expected in the SCZ 6 . In this light, our simulations and its agreement with century-scale solar activity observations strongly suggests that while a source of weak magnetic fields in the solar interior (such as the mean-field dynamo α-effect) may be necessary, the variability of the amplitude of sunspot cycles over decadal to century-scale is primarily governed by the variability in the tilt-angle and flux distribution of bipolar sunspot pairs or BMRs and their surface evolution.
We demonstrate that the prediction window for solar cycles can be extended to a decade allowing for advanced space weather assessment and preparedness. We achieve this by assimilating data from a solar surface magnetic field evolution model to a solar internal dynamo model run in the predictive mode. We predict a weak, but nonetheless, not insignificant solar cycle 25. Our ensemble forecast involving about 140 plausible realizations of the solar surface polar field provides a prediction range that indicates that sunspot cycle 25 would be similar or somewhat stronger than the current cycle 24. Based on our simulations we additionally predict that the maximum of sunspot cycle 25 will occur around 2024(±1).
It is important to note here the substantial progress achieved over the last decade in our understanding of the predictability of sunspot cycles 17,19,23 -which was spurred by substantial disagreements and controversy surrounding predictions for sunspot 1913 1923 1933 1943 1953 1963 1973 1983 1993 13 . This emergent understanding has seeded diverse physics-based predictive approaches increasing in complexityassessments based on solar surface flux transport models 31,32 , semi-empirical forecasts combining surface flux transport models with solar cycle statistics 46 -whose results seem to be convergent with our more sophisticated century-scale data assimilation approach coupling a magnetic flux transport model on the surface to a magnetohydrodynamics dynamo model in the Sun's interior. Our ensemble prediction indicates the possibility of a somewhat stronger cycle than hitherto expected, which is likely to buck the significant multi-cycle weakening trend in solar activity. Our results certainly rule out a substantially weaker cycle 25 compared to cycle 24 and therefore, do not support mounting expectations of an imminent slide to a Maunder-like grand minimum in solar activity. This had given rise to associated speculations regarding a period of global cooling (in the Earth's climate); these findings negate such possibilities at least over the next decade or so.
We conclude that near-Earth and inter-planetary space environmental conditions and solar radiative forcing of climate over sunspot cycle 25 (i.e., the next decade) will likely be similar or marginally more extreme relative to what has been observed during the past decade over the current solar cycle.

Methods
The surface flux transport (SFT) model. The basic equation. We have developed a new model to study the evolution of the Sun's photospheric magnetic field (B) which is governed by the magnetic induction equation, where v represents the large-scale velocities (both meridional circulation and differential rotation) responsible for advection of B and η represents the magnetic diffusion. Since most of the surface magnetic field is confined in the radial direction 47 , we shall solve only for the radial component of the field. The radial component B r (θ, ϕ, t) of the induction equation when expressed in spherical polar coordinates is, Here θ is the co-latitude, ϕ is the longitude, R is the solar radius, ω(θ) is the differential rotation and v(θ) is the meridional circulation on the solar surface. The parameter η h is the effective diffusion coefficient and S(θ, ϕ, t) is the source term describing the emergence of new sunspots. Since we are studying the evolution of B r on the surface of a sphere, the code has been developed using spherical harmonics.
Input parameters. The differential rotation is a large-scale plasma flow along ϕ and it varies with latitude. Therefore, the latitudinal shear of differential rotation stretches the magnetic field along the ϕ direction. As the leading and trailing spots of a tilted BMR reside in different latitudes, the differential rotation increases the longitudinal distance between the two polarities of the same BMR. The differential rotation has been modeled using an empirical profile 48 .
Wherein, ω(θ) has units in degrees per day. This profile has been validated by recent helioseismic observations 49 . The supergranular cells in the SCZ effectively diffuse the magnetic field on the solar surface. Some models 26,50 have treated convective motion of supergranules as a discrete random-walk process, rather than using a fixed diffusion coefficient, while others 31,39 , have considered a purely advective flux transport model where the convective flows of supergranules are included as a part of the velocity profile. However, we have taken a constant value of the diffusion coefficient (η h as 250 km 2 s −1 ) which lies within the range of observed values 51 . Another large-scale flow, i.e., the meridional circulation on the solar surface carries magnetic field from lower latitudes to higher latitudes. Though the flow profiles and peak amplitude of meridional circulation vary from model to model 29 , they all have some fundamental similarities. The flow speed becomes zero at the equator and the poles, and the circulation attains its peak velocity (10-20 ms −1 ) near mid-latitude. To replicate this large-scale flow we have used a velocity profile prescribed by van Ballegooijen 25 , where λ is the latitude in degrees (λ = π/2 − θ) and λ 0 is the latitude beyond which the circulation speed becomes zero. In our model, we have taken λ 0 = 75°and v 0 = 15 ms −1 .
Simulations for multiple solar cycles using observed cycle amplitudes show that the polar field systematically drifts and eventually fails to reverse its sign. There are three prescribed ways to address this problem: (1) varying the meridional circulation amplitude from cycle to cycle according to cycle strength 52 ; (2) including an additional radial diffusivity which forces polar field to decay at a time scale of 5 years 53,54 and (3) introducing a modified Joy's law, in which the tilt angles of BMRs depend on both latitude and cycle strength 28 . The latter is physically motivated and supported by independent simulations of the buoyant rise of flux tubes and we have followed this third prescription in our SFT model.
Replication of flux emergence of sunspots. Modeling of flux emergence requires information of the position of sunspots on the solar surface and the area associated with the spots. Since we do not model the growth of sunspots in our simulation, we take data at the time of their maximum surface area rather than their time of appearance on the photosphere. We assume all sunspots that appear on the photosphere are BMRs (i.e., type β). Location, area and time information of sunspots are provided by the Royal Greenwich Observatory (RGO) and USAF/ NOAA database. Beyond year 1976, the source of information associated with the active regions changes from RGO to USAF/NOAA. To maintain the consistency in area measurement from two different data sources, we multiply a constant factor of 1.4 to any active region area belonging to USAF/NOAA database if its area is smaller than 206 micro-hemispheres which correspond to a pair of sunspots each with a radius of 10 Megameters-such cross-database calibration is often necessary due to different instrument specifications and diverse record-keeping practices 55,56 . Multiple consistency checks that we performed independently point out the necessity of this correction for consistency between the RGO and USAF/NOAA sunspot area databases.
We calculate the flux associated with a BMR using an empirical relationship 14 : Φ(A) = 7.0 × 10 19 A maxwells, where A is the area of the whole sunspot in units of micro-hemispheres. This flux is equally distributed among the two polarities of the BMR. Also, we can easily determine the value of radius (say, R spot ) for each of the leading and following polarities from the area information. We assume that the radial separation (say, d) between the centroids of leading and following spots is proportional to R spot . The tilt angle (α) of the BMR is assigned in the following manner, where λ is the latitudinal position of the centroid of the whole BMR. The quantity T n accounts for the variation of tilt angle with cycle strength 57 . The factor g is introduced to include the effect of localized inflows towards active regions, that is present on the photosphere apart from the large-scale inflows related to activity belts. These localized inflows effectively reduce the latitudinal separation between opposite polarities and allow less flux to reach the polar region.
Since the polar flux is proportional to tilt angle, we incorporate the impact of these localized inflows by reducing the tilt angle 28 . We choose g to be equal to 0.7. Once we know the location of the centroid of the whole BMR in co-latitude(θ c ) and longitude(ϕ c ), the positions of individual polarities of the BMR are decided as follows: θ l=f ¼ θ c ± d 2 sinα and ϕ l=f ¼ ϕ c ± d 2 cosα, where 'l' and 'f' denote the leading and the following polarities respectively. The initial radial magnetic field associated with the BMR is ΔB r ðR ; θ; ϕÞ ¼ B l r ðR ; θ; ϕÞ À B f r ðR ; θ; ϕÞ ð6Þ where B l=f r ðR ; θ; ϕÞ are the unsigned magnetic field distribution of the leading and following spots which have opposite polarities. Each spot is modeled as 25 where β l/f are the heliographic angle between (θ, ϕ) and the central coordinates of the leading and following polarities (θ l/f , ϕ l/f ) respectively. B max is the maximum value of magnetic field of each polarity, which is automatically decided by the flux contained in the spot. Initial field configuration. We use our SFT model to study the evolution of the large-scale photospheric magnetic field for multiple solar cycles, starting from solar cycle 15 around the year 1913. As we do not have any full-sun magnetic field data at the beginning of cycle 15, we use an axisymmetric dipolar configuration 25 as an initial field condition to initiate our simulations. We have tried to minimize the difference between the polar flux associated with this initial field and the polar flux at the beginning of cycle 15 acquired from polar faculae observations 33  Measured quantities. The quantity plotted in Fig. 2a is the total unsigned magnetic flux associated with the sunspots emerging during the period spanning from the year 1913 to 2020. It includes data from direct sunspot observations (depicted by the gray curve) and also the constructed decaying phase of cycle 24 (blue and green curves). If we assume n k is the total number of individual spots appearing on the solar surface in the k th month and area of those individual spots are A i (i = 1,2, …,n k ); the total unsigned flux associated with the emerging spots (Φ k , denoted by the gray curve in Fig. 2a) in the k th month would be Wherein, Φ i (A i ) = 7.0 × 10 19 A i maxwells. We calculate the polar flux (plotted in Fig. 2b) by integrating radial magnetic field around the polar cap region (extending from ±70°to ±90°) in both hemispheres and using the following equation where λ is latitude and ϕ is longitude.
Construction of the synthetic sunspot input profiles. We consider the magnetic flux to be a better proxy of solar activity than the sunspot numbers. Thus, our synthetic sunspot data profile is mainly based on the flux evolution observed in cycle 24 so far. The time evolution profile of sunspot number during a certain solar cycle can be determined by using a generalized function with the knowledge of its starting time and peak activity 58 . For constructing a synthetic input profile, the observed sunspot data of cycle 24 (spanning over 2008. 5-2016.75) is fitted with a mathematical function, an extension of which also models the remaining 3.25 years of the descending phase of cycle 24. We further introduce random fluctuations to this mean profile to produce a more realistic (observationally) input profile. The total number of sunspots associated with a typical synthetic profile is roughly 2800. While assigning area to the spots associated with a synthetic profile, we follow a similar statistical distribution of area obtained by analyzing the observed sunspot data of cycle 24. For the time-latitude allocation of the emerging BMRs on the solar surface, we use an empirical functional form to calculate the mean latitude and the spread of the activity belts 57 . The spots are randomly distributed over all possible longitude on the solar surface. The tilt angles of the BMRs are decided by Eq. (5). We get the flux associated with BMRs using the same relationship, Φ(A) = 7.0 × 10 19 A maxwells. We constructed a set of thirty-four different synthetic input profiles by modulating the total flux associated with the sunspots, or by varying the latitudinal spread (and interchanging their relative position) in the activity wings. Among these thirty-four profiles one closely follows the already observed (up to September 2016) sunspot distribution of cycle 24, and we regard this profile as a standard one. The modeling of twenty-four synthetic input profiles was done by varying the total sunspot-associated flux by ±30% about the standard input profile.
Once we have all particulars related to the sunspots of a certain synthetic profile, we consider only the last 3.25 years of the profile and add them to the existing observed sunspot data of cycle 24 to model likely emergence profile up to the end of 2019.
Introduction of randomness in tilt angle of active region. We generate 110 synthetic input profiles where we introduce randomness in the tilt angles of active regions in addition to the systematic tilt which is entirely determined by Joy's law. The scatter around the systematic mean tilt angle decreases with increasing active region area such that the standard deviation of the distribution of tilt angle randomness follows a linear logarithmic relation with active region area 36 . The tilt angle of each active region is determined by the relation, α ¼ gT n ffiffiffiffiffi jλj p þ ε, where the first part in the right-hand side is the same as Eq. (5) and the second part, ε, represents the randomness 36 . Every individual value of ε is selected randomly from a normal distribution with zero mean and a standard deviation that is decided by the area of the particular active region in consideration. We apply this method to every active region of the standard input profile and generate a set of 50 such realizations. We also consider 60 different input profiles where scatter in the tilt angle is introduced in the strongest and the weakest (according to total sunspotassociated flux) profiles to model the maximum uncertainty that can be present in the descending phase of cycle 24.
The solar dynamo model. Most of the existing solar dynamo models 17,42,43,59 identify Babcock-Leighton mechanism as the sole process for generation of the poloidal component (B P ) from the toroidal component (B T ) of the magnetic field. However, other studies 40,45 have established the mean field α-effect, present in the bulk of the SCZ, as an essential means for reproducing important observational features. In our case, we have used a solar dynamo model which includes both B-L mechanism and mean field α-effect for conversion of B P from B T . In the following section, we shortly describe the model that we have used. The same model has provided satisfactory results previously 40 . The axisymmetric dynamo equations solved in kinematic regime are, where, B(r,θ) (i.e. B ϕ ) and A(r, θ) are the toroidal and the poloidal (in the form of vector potential) components of the magnetic field respectively. Here Ω is the differential rotation, v p is the meridional flow and s = r sin(θ). This model presumes different diffusivity profiles for the toroidal and poloidal components of magnetic field: η t and η p , respectively. In Eq. (10) 'αB' is the source term for generating B p and α incorporates contributions from both the B-L mechanism and mean field α-effect. The details of every profile and parameter used in this model are elaborately described in an already published work 40 . We note that no intrinsic amplitude fluctuation (over time) of the 'αB' is present in the model; the variation in poloidal field source term is being introduced only through the inclusion of the SFT results at every cycle minimum. We incorporate results from SFT simulation into the dynamo model only during cycle minima, an approach similar to the earlier effort of predicting cycle 24 using observed surface magnetic field 16  which is related to A SFT ðR ; θ; t min Þ by where we calculate B SFT r ðR ; θ; t min Þ by averaging surface magnetic field over the ϕdirection during solar minima (t = t min ). We obtain A SFT on the surface for two hemispheres by using following relations, Now the dynamo generated A Dyn at solar minima is calibrated with the SFT generated A SFT . We achieve these via two steps. In the first step we calibrate the amplitude of these two quantities at the solar surface by a factor (c), which once determined at the minimum of cycle 16, remains constant throughout our simulation. The A Dyn ðR ; θ; t min Þ sinθ has different latitudinal distribution on the solar surface compared to A SFT ðR ; θ; t min Þ sinθ. In the second step of calibration, A Dyn ðR ; θ; t min Þ sinθ obtained from the dynamo simulation at every minimum is multiplied by a function γ(θ) to make the product equal to A SFT ðR ; θ; t min Þ sinθ on the solar surface. In Fig. 5a, the solid orange and the dashed blue lines depict the scaled A SFT ðR ; θÞ sinθ and A Dyn ðR ; θÞ sinθ, respectively, at the minimum of cycle 16. Owing to the sinθ term in the denominator, numerical issues arise while calculating the function γ(θ) near the polar regions. To avoid this, we set γ(θ) to be equal to one near the poles (see Fig. 5b). We assume that the poloidal field forcing ("correction") on the dynamo due to the B-L mechanism is restricted to a region extending from 0.8 R to R . During every solar minimum, we stop the dynamo simulation and multiply A Dyn (r, θ, t min ) sinθ with the appropriate cγ(θ) for every grid point above 0.8 R and resume the simulation. At every subsequent minimum until cycle 24 minimum, this data assimilation from the SFT to the dynamo model is repeated; this generates our data-driven prediction for cycle 25. In spirit, this assimilation is akin to enforcing the data-driven SFT simulated surface map in the SCZ (the dynamo domain). Any transient discontinuities resulting from this "driving" is observed to disappear within a month, and the input is fully assimilated in the dynamo simulation.
The dynamo simulation provides a proxy for the toroidal magnetic field at the base of the SCZ which upon satisfying the magnetic buoyancy 40,42,43 will appear as sunspots on the solar surface. We utilize the total erupted field, B Dyn (t), as a proxy for total erupted sunspot flux (this is possible because each eruption has the same extent in radial and latitudinal grids). Therefore, we can compare B Dyn (t) with the unsigned flux associated with the observed sunspots (as depicted in Fig. 4). The dynamo simulated B Dyn (t) is calibrated with the observed unsigned magnetic flux through the utilization of a constant factor which remains the same throughout the simulation. This constant factor is determined through a multi-cycle (cycles 17-24) calibration of the annually averaged peak cycle strength (at each maximum) derived from the dynamo simulation, say B max (n) and the corresponding peak of the observed annually averaged, unsigned flux Φ obs max ðnÞ Â Ã . We scale all B max (n) values corresponding to each maximum with the same constant factor, and vary the latter until the B max (n) versus Φ obs max ðnÞ is characterized by a line with unit slope and zero intercept. We select this particular constant as the scaling factor which operates upon the whole B Dyn (t) time-series. The result is depicted in Fig. 4. A similar multi-cycle calibration technique is implemented to generate the amplitude prediction and range of the ensemble forecast in terms of the yearly mean sunspot number for cycle 25 (as reported in Table 1). This is achieved by calibrating the observed annually averaged peak sunspot numbers for cycles 17 to 24 maxima with the simulated peaks of the corresponding cycles.
At no point in our century-scale simulations is any individual scaling done to the amplitude of any single cycle, or any model driving parameters fine-tuned. This maintains the sanctity of these long-term data-driven simulations.
Code availability. This work utilizes two disparate numerical codes for simulating magnetic field evolution on the solar surface and within the Sun's convection zone, respectively. Informed requests from established scientists for numerical simulations pertaining to this study may be entertained by the Center of Excellence in Space Sciences India. Such requests may be made through email to the corresponding author.