Development of a local empirical model of ionospheric total electron content (TEC) and its application for studying solar-ionospheric effects

Regular and irregular variations in total electron content (TEC) are one of the most significant observables in ionospheric studies. During the solar cycle 24, the variability of ionosphere is studied using global positioning system derived TEC at a mid-latitude station, Tehran (35.70N, 51.33E). Based on solar radio flux and seasonal and local time-dependent features of TEC values, a semi-empirical model is developed to represent its monthly/hourly mean values. Observed values of TEC and the results of our semi-empirical model then are compared with estimated values of a standard plasmasphere–ionosphere model. The outcome of this model is an expected mean TEC value considering the monthly/hourly regular effects of solar origin. Thus, it is possible to use it for monitoring irregular effects induced by solar events. As a result, the connection of TEC variations with solar activities are studied for the case of coronal mass ejections accompanying extreme solar flares. TEC response to solar flares of class X is well reproduced by this model. Our resulting values show that the most powerful flares (i.e. class X) induce a variation of more than 20 percent in daily TEC extent.

In the Earth's ionosphere, the variability of space weather is easily reflected in TEC. As the total number of electrons is measured along a vertical column of one square meter cross-section (1 TEC Unit (TECU) = 1 × 10 16 electrons m −2 ) from the height of a GPS satellite ( ∼ 20, 000 km ) to the receiver, thus TEC characterizes variations in both ionosphere and plasmasphere 1 .
Until now, various techniques have been used to empirically measure TEC. Some examples include: ionosondes 2,3 , incoherent backscatter radars 2,4,5 , Faraday Rotation (FR) in beacon satellite signals 2,6-8 , altimeter satellite systems, and Global Navigation Satellite Systems (GNSS) 1,9 . The Global Positioning System (GPS) satellites, provide an effective and low-cost method to measure TEC values 10,11 as a function of time for a specific location on the Earth. GPS signal, propagating through the ionosphere, is advanced in phase and delayed in time. As a result, values of carrier phase and pseudo-range combined L-band frequencies (L1 carrier:1575. 42 MHz and L2 carrier: 12227.60 MHz) are used to evaluate TEC [12][13][14][15] .
TEC values are subject to both temporal/spatial and regular/irregular variations 16,17 . Spatial variations describe those related to the location on the Earth (i.e. equatorial anomalies etc), whilst temporal variations are related to time (i.e. universal or local). Where, regular variations include periodic changes in TEC values, irregular ones show the temporal effects of phenomena such as solar events and geomagnetic storms.
Investigating TEC variations reveals the main physical processes which are responsible for the ionospheric behaviour. Generally speaking, the changes in TEC values are mainly connected with: the condition of the Earth's magnetic field, the Earth's rotation (which induces diurnal effects), the Earth's position around the Sun (which induces observed seasonal effects) and the solar activity levels. Whilst diurnal and seasonal effects are considered as regular effects, the solar activity levels and its effect on the Earth's magnetic field may produce both regular and

Justification and outline of our semi-empirical model
Due to the Earth's magnetic field, three latitudinal regions are recognized in the ionosphere: low-latitude or equatorial, mid-latitude and high-latitude regions. Usually, the low-latitude region contains the highest values of TEC whilst the mid-latitude region is considered as the least variable region of the ionosphere and it contains the most predictable variations of TEC 9 . It is shown that even in deep solar minimum a strong correlation with the solar indexes still exist 13 .
When developing a semi-empirical model, it is essential to remove the disturbed periods of geomagnetic storms 44,45 . The disturbance degree is directly related to the strength of the disturbing phenomena.
The strength of a geomagnetic storm is usually measured using geomagnetic indexes. Amongst them Kp, Dst (disturbance-storm time) ( ∼ 20 − 30 • latitudes), SYM-H and ASY-H ( ∼ 40 − 50 • latitudes) indexes [46][47][48] provide good information about the storm condition. Generally, after the storm sudden commencement, three phases are recognized during a storm 49 : The initial (with an increase in Dst by 20 to 50 nT; for tens of minutes); main (with a Dst decreasing to less than −50 nT; 2-8 hours) and recovery phases (Dst changes from its minimum value to its quiet time value; 8 hours and above) 44 .
SYM and ASY (both -D and -H) indexes are acquired from observations of magnetic fields at low and midlatitudes (WDC, Kyoto) and describe the development of a magnetic storm. Compression of the dayside magnetosphere in the initial phase of a storm induces positive Dst values (as well as positive SYM-H values) whilst magnetic reconnection and ring current formation induce strongly negative values during the main phase 50 . SYM-H index is considered to be an analogues of Dst in many studies [51][52][53] . On the other hand, it is shown that for Dst variations greater than 400 nT, these two values may differ 54 . In more detailed studies, it is recommended that SYM-H index can be used as a high-resolution Dst index 55,56 , of course with different scales for the definition of moderate storms 56 .
We have considered moderate, intense and super-storms (moderate: − 50 nT > Dst min > − 100 nT ; intense: − 100 nT > Dst min > − 250 nT ; super-storm: Dst min < − 250 nT; During quiet times: − 20 nT < Dst < + 20 nT) to be the most probable reason of disturbed time periods and consequently designed a suitable filter to detect and remove them.

Database
The GPS-TEC data 57,58  Production of ionization is mainly controlled by the solar EUV radiation. Due to unavailability of a suitable database of solar EUV radiation, solar radio flux (i.e. F10.7) is considered as a substitute index of solar activity which is reflected in our model. The daily F10.7 data were collected from OMNIW EB. Other parameters concerning IMF, solar wind and plasma parameters and activity indexes were acquired from OMNIW EB: High Resol ution OMNI.
Processing the above data, the hourly/monthly mean values of the solar index (F10.7) and the ionospheric parameter TEC were prepared which allow us to study the variability of TEC with solar events. Solar events in the above time interval were selected from Watanabe et al (2012) 59 XRT flare catal ogue and compared with the information from SOHO LASCO CME catal ogue. To study the solar radio bursts RADIO SOLAR TELESCOPE NETWORK 1 Sec Solar Radio Data (SRD) files from RSTN were used.
SYM and ASY (both -D and -H) indexes are acquired from observations of magnetic fields at low and midlatitudes (WDC, Kyoto).

SYM-H versus Dst
It is not easy to find a unique description for the storm's degree based on SYM-H values, specially for the common definition of a moderate storm based on Dst values. In a closer look Dst and SYM-H do not behave like each other as it is previously mentioned 55 . In some cases a moderate storm condition starts with a SYM-H index lower or higher than its Dst value (it can fluctuate around 20 nT). Thus, first we decided to re-scale SYM-H based on Dst for solar cycle 24.
A first and a simple choice was to use the same limits calculated for the period of 1985 through 2009 56 or 1981 through 2002 55 . For instance 56 : gives a SYM-H index of − 43 nT for starting moderate storms (i.e. Dst of ∼ −50 nT). A sample of "storm time intervals" consisting of the time intervals of 1000 storms (of the moderate and above degrees, picked by their Dst values) was used as the test sample for solar cycle 24. As our first step, the entire period of 2009-2018 in the search of a proper lower limit of SYM-H (i.e. for marking moderate storms and above) was studied.
For instance, a comparison between Dst and SYM-H indexes is shown in Fig. 1, where it is seen that in the case of moderate storms, considering − 43 nT as the lower limit of SYM-H leads to the detection of more storms, so the starting value is important. Using − 46 nT gives better result for our data set in this period.
Considering the behavior of power spectrums of Dst and SYM-H (Fig. 2), some differences are seen (compared with the period of 1981-2002 55 ). Our power spectrums seems more noisy (Fig. 2)        Compared with the method used by Wansliss and Showalter 55 , we see three spectrums. One Active and two Quiets (QI and QII). QI and QII cross the Active spectra at − 46nT and − 28nT respectively.
Through correlation studies for the time period of 1960-2001, the behavior of geomagnetic indices are shown to be correlated the best with Interplanetary Magnetic Fields (IMF) 60 embedded within the solar wind. Solar magnetic field originates in convention layer and extends into the corona and the solar wind. Fast solar wind originates from coronal holes whilst slow solar wind originates at the edge of coronal holes 61 . Solar wind carries the strongest fields at solar maximum which are due to interplanetary coronal mass ejections and at this period the Earth experiences a broad range of solar wind velocities 62,63 . Around solar minimum, the coronal holes are located at the poles. When the magnetic quadrupole moment dominates over the dipole moment, a number of coronal holes appear at mid-latitudes, this is a typical behaviour in a solar cycle during solar maximum. At solar cycle 24 a deeper decrease of dipole component occurred in solar minimum 63 . During the deep solar minimum between cycles 23 and 24, the evolution of coronal holes and its connection to solar wind speed is discussed in details 64 and a secondary peak in solar wind speed distribution is seen for 2007-2008. During solar cycle 24, solar wind speed is shown to have the highest correlation with geomagnetic indices, Ap and Dst, with zero time delays 65 . Jackson et al. 66 , used Current Sheet-Source Surface (CSSS) model 67 to determine Geocentric Solar Magnetospheric (GSM) B z field. They found that the daily variations of B z are also correlated with geomagnetic Kp and Dst index variations over 11-year period of National Solar Observatory Global Oscillation Network Group (data. GONG). Thus it seems that the existence of 2 quiet spectrum in Figs. 3 and 4 is not accidental and may reflect the different situation of the solar cycle 24.
More numerical studies help to decide about the filters for executing disturbed time intervals. A linear fit to the scatter plot of SYM-H versus Dst (Fig. 5) gives the linear relationship: Inserting the limit of − 50 nT for Dst gives the limit of SYM-H ∼ − 45.81 nT, comparable with the method applied by Katus et. al. 56 (i.e. formula 1).

Applied filter
Instead of considering Dst variations 68 , our storm finding procedure is based on the variation of SYM-H (formula 3) whilst ASY-H is used to find the storm onset times. Different steps of our filtering algorithm are listed as below: • Generally, SYM-H values below − 46 nT were considered as the start of a possible storm.
• A second filter is considered to detect storm onset times more precisely using ASY-H and SYM-H values.
Observing a sharp positive peak in ASY-H values shows the sub-storm onsets 69 prior to / or after occurrence of a moderate geomagnetic storm 70,71 . So in this method the onset times of the storms were highlighted observing the behavior of ASY and SYM (especially H) indexes. • For the storm time periods if there is only one record with the value of "at or below" − 46 nT in the selected period, the time period was not removed. • The recovery time also is considered (forwarded in time from the selected starting point) using ASY-H, up to the deepest local minimum after the starting point (if it does not result in less than 2 hours). • The specified time periods were recorded and then excluded from raw data of TEC before calculating the desired mean values.
We examined our procedure for a test sample of storm time intervals during 1/1/2017-1/1/2018. The coincidence was 91.2 % using only steps 1 and 2, 97 % when adding step 3. The above method is applied upon the whole time interval of cycle 24, backward in time. It is obvious that in comparison with the method applied by Badeke et al. 68 , we remove less time periods (they have removed 36 hours for each considered storm).
As an example of how this procedure works, the time interval of a geomagnetic storm (22 April 2017) (SGAS Number 112 Issued at 0254z on 22 Apr 2017) is shown in Fig. 6.
To study this storm a time period of 5 days is drawn (2 days before and 2 days after the reported date). The data was acquired from WDC, Kyoto.
Applying this method, in the first step 58 points were recognized with a SYM-H value below − 46 nT. The first recognized occurrence of a SYM-H value below − 46 nT is on DOY 113, 23 April 2017, 00:08:00. Prior to this time, considering the variation of ASY-H, the start of a sub-storm is recognized on 22 April 2017, Thus, for geomagnetic storm of 22 April 2017 (above example), a sum of 08:27:00 hours was excluded from TEC raw data.

The model
The  Considering Fig. 7, the moderate linear behavior of solar cycle 24 is seen in descending phase, whilst a good correlation is obvious in ascending phase (see Fig. 8).
Thus, the whole data period has been classified in the ascending (2009-2013) and descending (2014-2018) phases of the solar cycle and the monthly mean values of TEC are calculated.
The coefficients of Eq. 4 were calculated for every hour of a day in a given month by linear regression. For each part (i.e. descending and ascending phases), a 12*24 (12 months of the year × 24 hours of the day) matrix for M and B is generated. For example in Fig. 8, the linear regression and correlation coefficients (R) between monthly mean values of TEC and F10.7 during January at different local times were shown for ascending and descending phases.
Contour plots of TEC, normalized at F10.7 = 100 SFU, for various months at different local times are shown in Fig. 9, for ascending and descending phases. Note that lower values of TEC are expected in comparison with similar plots for equatorial stations.
The seasonal variation displays a semiannual variation with higher values around equinoxes and lower values around solstices. For instance, a comparison is made in Fig. 10 for ascending and descending phases which shows an asymmetric double peak in seasonal variation. In Fig. 11 daily mean TECU in ascending and descending phases were compared. An important feature of this semiannual variation is the local time dependence of the asymmetrical peak amplitudes. Seasonal anomaly is explained in terms of changes in solar zenith angle and thermospheric composition, especially the ratio [O] [N 2 ] 73,74 . Considering seasonal and diurnal normalized values of TEC, a sinusoidal behavior is observed during ascending and descending phases. This is the reason Fourier analysis is used in the following.
Fourier analysis was used to investigate the relation of regression coefficient matrixes, M and B with month and hour numbers. To achieve positive values for the elements of B matrix, restricted linear regression method is used. Simply, we work with an image for which the discrete values of m and t are spatial coordinates. In the following 2D discrete Fourier analysis, is used.
If M(m, t) represents the values of matrix elements resulted by Eq. 4, then a 2D discrete Fourier transform of M(m, t) is shown by: for which M and T are equal to the dimension of our matrixes, 12 and 24 respectively. The inverse 2D discrete Fourier transform which reproduces the original matrix now is: Applying the same method for B and considering suitable filters to remove noises from high frequency signals, our linearized model ( 4) is extended as below:  Using proper low and high-pass filters it is possible to decompose the original matrix to the components of desired frequency bands. This method is suitable to study the main frequencies in our linearized model. Here we have applied low-pass filters to remove high frequency noises and the main frequencies were used to reconstruct TEC values. Thus the model is re-written using new values of M(m, t) and B(m, t) (respectively, G 1 (m, t) and G 2 (m, t))).
Applying this model, it is possible to compare resulting values with the observed values. Thus, for solar cycle 24 we have marked a set of time intervals for which, a positive residual exist.
In our method the accuracy of the time interval was one hour, but it is possible to increase the accuracy to one minute by considering proper mean values of experimental GPS-TEC (i.e. Fig. 9).   Figure 13 shows diurnal variation of monthly mean values of TEC for different months of the ascending phase. It is clear that TEC gradually increases with sunrise, reaches a peak at around 11:00-14:00 LT and later declines to reach the minimum value after midnight.
The final results of the model are presented in Figs. 14 and 15 . Figure 14 shows the normalized mean values of TEC for ascending and descending phases, whilst Fig. 15 demonstrates the observed mean values for Tehran station in comparison with the model mean values.
Now it is possible to study the effects of CME and solar flares on TEC values. As an example, the impact of X-class solar flares is investigated in the following section.

The impact of X-class solar flares of cycle 24
We apply our method to the full data-set without removing the disturbed time intervals (as in Sect. 6), resulted values are represented by TEC(E), in Table 2. Thus TEC(E), will possibly contain both regular and irregular effects 1. We also presented values estimated by our full criteria (Sect. 6), represented by TEC(MM) in Table 2. Finally, the resulted values are compared against TEC values of a standard plasmasphere-ionosphere model (Inter natio nal Refer ence Ionos phere , IRI), represented by TEC(I) in Table 2.
Our results for four flare events, are shown in Fig. 16a-d. In comparison to IRI model, our model results in smoother curves for TEC.
For completeness of the research, we studied the time intervals of class X solar flares in solar cycle 24 59   www.nature.com/scientificreports/ solar radio bursts from (RSTN) also are considered in which 1 second records of eight discrete solar radio flux measurement were presented. A set of 34 flares of class X were considered for the present work (Table 1). For Tehran station, no TEC values existed for 4 flares out of 34 (Table 1) 59 XRT flare catal ogue) is seen at 16:06:00 UT with a CME accompanying (SOHO/ LASCO Halo CME Catal ogue). Though it was one of the most powerful events in solar cycle 24, its effect on TEC values was not considerable for Tehran station, just a second peak reached to our calculated values. Figure 16b 59 XRT flare catal ogue) is seen at 12:02:00 UT with a CME accompanying (SOHO/ LASCO Halo CME Catal ogue). A second peak in TEC, is seen for ∼ 14 to 15 pm in Fig. 16b, and experimental TEC values are well above our calculated values. Solar event of 10-Sep-2017, is well modeled 75 . Though the CME eruption was catalogued as Halo, with a Central Position Angle (CPA) of 360 degrees (SOHO/ LASCO CME Catal ogue), it does not produced an Earth directed Interplanetary CME (ICME). During this event, three CMEs propagated and merged into a complex ICME with main direction towards Mars 75,76 . So despite of an X8.3 flare, solar events were not geoeffective and excited no Forbush Decrease (FD) or powerful Geomagnetic Storms (GMSs). In comparison, as we see in Fig. 16b, solar flare of 6-Sep-2017, was accompanying with a Halo CME affected with two prior CMEs from the same active region (SOHO/ LASCO CME Catal ogue). The interaction of these ICMEs together and with high speed stream   59 XRT flare catal ogue), no halo CME event is recorded by SOHO/ LASCO Halo CME Catal ogue. The flare occurrence is recorded on 00:28 UT and TEC values at Tehran station is disturbed dramatically around noon. Due to (SOHO/ LASCO CME Catal ogue) 6 CMEs are recorded from ∼ 1 am to 21 pm. A partial halo CME occurred at 01:25 am with a CPA of 216 degrees. This event is not followed more, but possibility of a G1 minor geomagnetic storm was reported (Space Weath er Predi ction Center).
For the time interval of 10 am to 22 pm no data exist in all 8 frequencies, but there are few bursts which seems to be responsible for TEC variations of Fig. 16d.
Due to this study, it is seen that a combination of solar events are responsible for TEC variations. But the effect of Solar flares and bursts with radio emissions higher than daily F10.7 values, were detected more clearly at the mid-latitude station of Tehran (the situation might be quite different, e.g., at high latitude stations).
Our resulted values shows that the flares with the most power (i.e. class X) have induced a variation of more than 20 percent in TEC. In some cases for the flares with accompanying CMEs, the variation maybe extended

The results and conclusions
• The probability density of Dst and SYM-H (Figs. 3 and 4) for the time interval of 1981-2018 were formulated as the sum of 3 gaussian distributions: • Solar flares of class X in solar cycle 24 were studied for completeness. Figure 16a,b and Tables 1 and 2.
• In the absence of any halo CME, Earth directed ICMEs and streamers from the Sun, solar radio bursts of 25-Dec-2014 (Fig. 17) are probable source of TEC variations (Fig. 16d). • Through this method the effect of Solar flares and bursts with radio emissions higher than daily F10.7 values, were better detected. • Our resulted values shows that the flares with the most power (i.e. class X) have induced a variation of more than 20 percent in TEC.