Threshold-based evolutionary magnitude estimation for an earthquake early warning system in the Sichuan–Yunnan region, China

The Sichuan–Yunnan region is one of the most seismically vulnerable areas in China. Accordingly, an earthquake early warning (EEW) system for the region is essential to reduce future earthquake hazards. This research analyses the utility of two early warning parameters (τc and Pd) for magnitude estimation using 273 events that occurred in the Sichuan–Yunnan region during 2007–2015. We find that τc can more reliably predict high-magnitude events during a short P-wave time window (PTW) but produces greater uncertainty in the low-magnitude range, whereas Pd is highly correlated with the event magnitude depending on the selection of an appropriate PTW. Here, we propose a threshold-based evolutionary magnitude estimation method based on a specific combination of τc and Pd that both offers more robust advance magnitude estimates for large earthquakes and ensures the estimation accuracy for low-magnitude events. The advantages of the proposed approach are validated using data from 2016–2017 and the Ms 8.0 Wenchuan earthquake in an offline simulation. The proposed concept provides a useful basis for the future implementation of an EEW system in the Sichuan–Yunnan region.

www.nature.com/scientificreports/ The Sichuan-Yunnan region, located at the eastern edge of the Qinghai-Tibet Plateau, coincides with the middle to southern section of the north-south seismic zone that extends across the central part of the Chinese mainland. The high level of seismicity in this region is associated with its complex tectonic setting. The Sichuan-Yunnan region has experienced numerous destructive earthquakes that have caused extensive casualties and property losses; remarkable examples include the Ms 6.4 Ning' er earthquake 15 on 3 June 2007, the Ms 7.0 Lushan earthquake 16,17 on 20 April 2013, and the Ms 6.5 Ludian earthquake 18 on 3 August 2014. Of particular note is the Ms 8.0 Wenchuan earthquake, which occurred on 12 May 2008 and resulted in more than 69,000 deaths and financial losses reaching hundreds of billions of RMB 19 . As the effects of potential earthquake hazards will continue to increase with the rising population and economy in this area, scientific research on diminishing future earthquake hazards is essential for the Sichuan-Yunnan region. In this context, to strengthen seismic prevention and disaster mitigation measures therein, the Sichuan-Yunnan region has been designated the key monitoring area of the Seismic Intensity Rapid Reporting and Earthquake Early Warning (SIRR & EEW) project, and developing an EEW system for this region is of great concern.
Rapidly estimating the magnitude of an earthquake event is the most critical obstacle to the widespread development of EEW systems and is typically achieved based on empirical scaling relationships between the magnitude and early warning parameters from initial P-waves. Research has confirmed that the average period (τ c ), the most commonly used parameter for estimating the magnitude with regard to the P-wave frequency, is correlated with the earthquake magnitude 1,20 . In addition, the peak amplitude of the vertical displacement (P d ) in the first few seconds of the P-wave has been shown to be related to the event magnitude and is used mainly as information about the amplitude content of early P-waves to predict the earthquake magnitude 21,22 . Models based on τ c and P d have been proposed for the EEW systems in China to estimate the earthquake magnitude using only sequence data for large events that occurred in the Sichuan-Yunnan region 23,24 . Fortunately, with the operation of the China Strong Motion Network Center (CSMNC), large quantities of high-quality digital strong motion data have been recorded, constituting a reliable database from which the relationships between the event magnitude and the EEW parameters τ c and P d can be established for the Sichuan-Yunnan region.
Previous studies have shown that τ c typically exhibits large scatter in the low-magnitude range 25,26 , and the application of P d to predicting the magnitude of large events is limited by the length of the P-wave time window (PTW) 27 . For the reliable and early assessment of destructive earthquakes, Wu and Kanamori 20 originally proposed the integrated use of τ c and P d to establish a threshold-based EEW system, which was subsequently applied in various areas 26,28 . However, low signal-to-noise ratio (SNR) records from low-magnitude events may produce an unexpectedly large τ c value 29 , in which case the direct combination of these two parameters with such an unexpected error may overestimate a small event, resulting in a false alarm.
In this paper, we explored a similar magnitude estimation approach with a special combination of τ c and P d . We proposed first judging whether τ c should be combined with P d to improve the magnitude estimation stability, and then we further explored the evolutionary magnitude estimation of the proposed approach. Here, we selected 1596 vertical components from 273 events that occurred within the Sichuan-Yunnan region during 2007-2015 as the fitting dataset for linear regression; furthermore, data from 13 events that occurred in the same region during 2016-2017 are employed as the validation dataset to evaluate the reliability of the proposed approach (Supplementary Table S1). The event magnitudes are in the range of 4.0-8.0, and all magnitudes (typically surface wave magnitudes) are denoted M. Since the high-frequency direct body waves radiating from a crustal earthquake rupture dominate in amplitude within the near-source range 21 and the records of near-source stations represent the earliest information for rapid assessment and EEW, only records from stations with an epicentral distance of less than 60 km are considered. To ensure good station coverage for each event, at least three strong motion records for each earthquake must be available. The locations of the epicenters and CSMNC strong motion stations of the fitting dataset are shown in Figs. 1, and 2 illustrates the distribution of the strong motion records within the fitting dataset as a function of both their epicentral distance and their magnitude.

Results
Magnitude relationships with τ c and P d . We calculated τ c from the vertical components of the strong motion records in the fitting dataset to investigate the relationship between τ c and M. As illustrated in Fig. 3a, τ c exhibits a linear correlation with the event magnitude and can effectively represent the scale of earthquakes. However, we found high τ c values in the low-magnitude range for some records, resulting in large dispersion for events with M < 5.5. These abnormal τ c values are usually affected by low-frequency drift of the integrated displacement with low-SNR or low-amplitude data, which may introduce large errors in τ c and lead to false alarms. Following Zollo et al. 16 , we found that P v (the peak amplitude of the velocity for a PTW of 3 s) reliably represents low-SNR data and is closely related to the displacement drift during the computation of τ c . Accordingly, we set a threshold of P v = 0.05 cm/s to identify low-SNR waveforms and found that P v < 0.05 cm/s typically corresponds to low-magnitude events with abnormally high τ c values (Fig. 3a). Hence, to improve the stability of τ c , when P v < 0.05 cm/s, we applied a stronger high-pass filter with a cutoff frequency of 0.15 Hz (increased from 0.075 Hz) to the integrated displacement. Only one earthquake record with a magnitude larger than 6 in our dataset features P v < 0.05 cm/s, as highlighted by the arrow in Fig. 3a, indicating that our threshold (P v = 0.05 cm/s) is small enough that the stronger filter (0.15 Hz high-pass filter) will rarely be applied to large events, thereby avoiding the loss of considerable low-frequency components.
After applying this criterion to restrict low-SNR data, the best-fit relationship between the magnitude and mean τ c value of each event for PTW = 3 s was established, and the result is illustrated in Fig. 3b.
The magnitude is estimated as follows. Next, we calibrated the P d attenuation relationship and established a correlation between the distance-corrected P d 10km and magnitude for every PTW within 2-10 s (see Methods section). The calibrated coefficients are summarized in Supplementary Table S2. As an example, for PTW = 3 s, the following best-fit regression and magnitude estimation equations were obtained.
Our results show that P d 10km achieved a deviation of 0.463 magnitude units, which is smaller than that of 0.694 magnitude units achieved by τ c for PTW = 3 s. However, the low P d 10km values obtained for the M = 8 event deviate considerably from the expected relationship in the first 2-3 s (Fig. 4); this is consistent with previous studies, which have shown that P d underestimates the magnitudes of large events for short PTWs 27,30 . In our study, P d 10km for the M = 8 event increased with increasing PTW, and the magnitude saturation phenomenon disappeared gradually when the PTW reached 8 s. Over the PTW range considered herein, the magnitude deviations of all the events changed from 0.469 magnitude units (PTW = 2 s) to 0.357 magnitude units (PTW = 10 s) (Supplementary Table S2).   www.nature.com/scientificreports/ difference within each bin. Moreover, the standard deviation of the magnitude estimates was also given to show the discreteness of the magnitude estimates. For a short PTW of 3 s, the P d 10km -magnitude relationship established above achieve a correlation coefficient of 0.815, which is larger than the coefficient of 0.683 from the τc-magnitude relationship, indicating that P d 10km shows a higher magnitude correlation for PTW = 3 s compared to τ c over the magnitude range from 4 to 8. As Fig. 5 indicates, for small-moderate events (M < 6.5), P d 10km predicts the magnitude with a standard deviation of approximately 0.40 magnitude units for each bin, whereas most standard deviations of τ c are larger than 0.6 magnitude units. To clearly depict the performance of each parameter, we also provide subplots of the estimated error distributions for small-moderate events (M < 6.5) in Fig. 5, showing that 76.4% of all events have an estimated error of less than 0.5 magnitude units using P d 10km compared with 54.3% using τ c . Furthermore, the estimated errors of P d 10km for these events are all within 1.5 magnitude units, whereas 12 events exceed this value for τ c . In contrast, for the large-magnitude range, τ c shows a smaller underestimation error (0.6 magnitude units) when estimating the largest event (the Ms 8.0 Wenchuan earthquake) compared to P d 10km (1.7 magnitude units). We further studied the changes in the τ c and P d 10km magnitude estimates with an extended PTW. For P d 10km , with PTW increasing from 2 to 10 s, the percentage of events with an estimated error within 0.5 magnitude units gradually increases from 74.9 to 84.2%. Although the correlation coefficient increases only from 0.812 to 0.877, the underestimation error for the largest events at PTW = 2 s (1.7 magnitude units) shows obvious improvement and eventually reaches an underestimation error of 0.2 magnitude units once the PTW reaches 10 s. We also explored the effect of increasing the PTW on τ c . As shown in the bottom row of Fig. 5, the τ c -estimated magnitude error in the small-moderate magnitude range gradually decreases as the PTW is extended, but the www.nature.com/scientificreports/ discreteness of τ c is still greater than that of P d 10km under the same PTW. Moreover, we found that τ c exhibits excellent performance in predicting the largest M = 8 events with an underestimation error of 0.2 magnitude units when PTW = 10 s, but unlike the consistent, stable improvement of P d 10km in predicting large magnitudes with increasing PTW, the underestimation error by using τ c increases from 0.6 to 1.0 magnitude units when PTW extends from 3 to 6 s.
Thus, the magnitudes of large events can be predicted by τ c with an acceptable error using only 2 s of data after the P-wave arrival, and P d 10km is robust in predicting small-moderate event magnitudes. In addition, τ c results in larger dispersion than P d 10km in predicting small-moderate earthquake magnitudes. This is the result after applying the criterion restricting low-SNR data; otherwise, the dispersion would be much larger. Ultimately, with increasing PTW, the saturation of P d 10km in predicting large event magnitudes can be obviously improved, while the improvement of τ c is not consistent, and more data are needed for validation.
Threshold settings for τ c and P d 10km . Based on the above characteristics of τ c and P d 10km over different magnitude and PTW ranges, we proposed a threshold-based evolutionary magnitude estimation method that suggests the use of P d 10km to predict events within a wide magnitude range and jointly using τ c to reduce the underestimation error only for large events.
The thresholds of τ c and P d 10km were determined to discriminate large events (M > 6.5). To avoid excluding large earthquakes, we selected − 1σ standard deviation, and we set the τ c and P d 10km thresholds corresponding to M = 6.5 to be 1.018 s and 0.387 cm, respectively, for PTW = 3 s (Supplementary Fig. S1). The threshold corresponding to M = 6.5 for every PTW considered for P d 10km is presented in Supplementary Table S3. According to a comparison between the real-time calculated parameters and the established thresholds, four potential situations are encountered: (1) τ c > threshold and P d 10km > threshold; (2) τ c > threshold and P d 10km < threshold; (3) τ c < threshold and P d 10km > threshold; and (4) τ c < threshold and P d 10km < threshold. We defined a four-entry decision table corresponding to the above situations: (1) the event is most likely to be a large earthquake; (2) the event may be a large earthquake with a small P d 10km caused by a short PTW or a small-moderate earthquake with an unexpectedly large τ c ; (3) the event may be a large earthquake with an unexpectedly small τ c or a smallmoderate earthquake with a large P d 10km ; and (4) the event is most likely to be a small-moderate earthquake.
Evolutionary magnitude estimation. As more information is needed for situations (2) and (3) Supplementary Table S4. The weights of P d 10km and τ c for PTW = i s and the magnitude estimate of a single station based on these weights was obtained as follows.
where σ iτ c(iPd 10km ) , W iτc(iPd 10km ) and M iτc(Pd 10km ) are the underestimation error of M = 8 events, the weights and the magnitude estimates of τ c (P d 10km ) for PTW = i s, respectively. The final magnitude was computed based on the results of multiple triggered stations in a seismic network. Since numerous distant stations with short PTWs will tend to lower the estimates of the average magnitude, we obtained average values based on the length of the PTW for each station as follows.
where M i and PTW i are the predicted magnitude and PTW, respectively, for the i-th triggered station.
For clarity, we have summarized the steps of our evolutionary magnitude estimation process in a flow chart (Fig. 6).

Tests and analyses of robustness.
To further investigate the reliability of the established relationships and our threshold-based approach, we applied our method to the validation dataset to evaluate the magnitude estimates, and we demonstrated the evolutionary magnitude estimation algorithm for the largest earthquake considered herein: the Ms 8.0 Wenchuan earthquake. Figure 7 illustrates our magnitude estimation results for the 13 earthquakes that occurred during 2016-2017. We computed both P d 10km and τ c only when data were available for PTW = 2 s. After comparisons with the corresponding thresholds (dashed lines), magnitude estimation was undertaken by following the threshold-based magnitude estimation steps presented in Fig. 6. We compared the average predicted magnitude with the catalog (5) W iτ c (iPd 10 km ) = 1/σ iτ c (iPd 10 km ) 1/σ iτ c + 1/σ iPd 10 km ; M = W iτ c * M iτ c + W iPd 10 km * M iPd 10 km www.nature.com/scientificreports/ magnitude and found that P d can provide robust magnitude estimates with a low standard deviation for all small events. In addition, the inclusion of τ c allows the magnitude for large events to be more precisely estimated than using P d alone: our method produced estimates of 6.72 and 6.33 for the M = 7 Jiuzhaigou earthquake with PTW = 2 s for the estimation with and without τ c , respectively. These results indicate that our method is stable and that it performed well for all 13 earthquakes considered with an average magnitude estimation error of 0.22 magnitude units for both PTW = 2 s and PTW = 3 s. Figure 8 illustrates the evolutionary magnitude estimation results for the Wenchuan earthquake. Nearly 6.2 s after the origin time (OT), the first triggered station (051WCW) received 2 s of data after the P-wave arrival. Accordingly, because the τ c value did not reach the corresponding threshold, only P d 10km was used, and the proposed method yielded a magnitude estimate of 6.7. Approximately 8.5 s after the OT, both P d 10km and τ c exceeded their respective thresholds at the second triggered station (051PXZ), yielding a magnitude estimate of 8.13 at this station when PTW = 2 s. After averaging the estimates from the first two stations, a magnitude estimate of 7.41 was obtained just 8.5 s after the OT, reflecting a smaller underestimation error of 0.59 compared to the result (1.47) of using P d 10km only (dashed blue line). The third station (051PXZ) exhibited small displacement amplitudes at 2 and 3 s; these displacements were insufficient to reach P d 10km threshold. Accordingly, the estimates www.nature.com/scientificreports/ using P d 10km only for this station lowered the average magnitude for the three stations 10.3-12.3 s after the OT. We included the average estimates without considering the PTW length for all three stations (black circles) in Fig. 8 for a comparison with our averaged values based on the length of the PTW (red line). By averaging the estimates for the stations based on the PTW, the magnitude estimation error was reduced from 1 to 0.7 magnitude units when the third station had a short PTW (2 s). With increasing PTW, the final estimates converged to 7.74 at 12.3 s after the OT.

Discussion and conclusions
We investigated two earthquake magnitude estimation parameters (τ c and P d ) based on CSMNC data during 2007-2015 for the Sichuan-Yunnan region. In particular, we established a τ c -magnitude relationship with a fixed PTW of 3 s and a P d 10km -magnitude relationship with increasing PTW (2-10 s). Based on the derived www.nature.com/scientificreports/ relationships, we proposed a threshold-based evolutionary approach for estimating the event magnitude within seconds of the P-wave arrival.
Our study highlights the shortcomings of existing methods, for which the applications of P d and τ c are limited by the length of the PTW and uncertainty, respectively. Although P d produces a consistent, small error for small-moderate earthquakes, the complex moment rate function and incomplete rupture typical of large events preclude a simple prediction of the final magnitude using P d in the first few seconds of an event. The PTW must be extended sufficiently to capture the information released by large earthquakes. Conversely, τ c exhibits better performance in predicting high event magnitudes for short PTWs. However, we observed large scatter of τ c in the low-magnitude range, which is consistent with the results of Zollo et al. 16 . Since τ c is more sensitive than P d to the low-frequency drift caused by integration, we retained the 0.075 Hz high-pass filter for P d and applied a criterion that allows a filter with a higher cutoff frequency to be selected for τ c based on the P v threshold (0.05 cm/s); this selective criterion reduces such errors considerably.
Our proposed threshold-based methodology is based primarily on P d results and setting a threshold to ensure that τ c participates in the magnitude estimation process only for large earthquakes. Previous studies have proposed using τ c and P d together to improve reliability, but the occurrence of abnormally high τ c values for small events will introduce errors when combining these two parameters. Our approach applies an advance judgment (compared with a predefined threshold) and implements reasonable strategies based on the levels of the observed parameters. Specifically, situation (1) indicates that an event is most likely a large event, so we jointly use the estimates of τ c and P d 10km . Situation (2) indicates that an event may be a large earthquake with a small P d 10km or a small-moderate earthquake with an unexpectedly large τ c . To avoid the potentially serious consequences of failing to trigger an alarm for a large earthquake, we suggest extending the PTW and using only P d 10km for the magnitude estimation in this scenario. Although a small P d 10km will underestimate a potential large event, this underestimation can be improved by extending PTW, which helps mitigate the magnitude overestimation for potential small-moderate earthquakes caused by combining unexpectedly large τ c values. Situation (3) indicates that an event may be a large earthquake with an unexpectedly small τ c or a small-moderate earthquake with a large P d 10km . Considering the small error of P d 10km found in the small-moderate magnitude range and the underestimation of potential large earthquakes caused by jointly using small τ c values, we also suggest extending the PTW to avoid excluding large earthquakes and using only P d 10km for the magnitude estimation in this case. Situation (4) indicates that an event is most likely a small-moderate event, so we use the estimates of only P d 10km and cease extending PTW beyond 3 s. Accordingly, this approach integrates the timeliness of estimating large events based on τ c and the robustness of general magnitude estimation based on P d ; moreover, the proposed method reduces the impacts of the uncertainty in τ c and the magnitude saturation of P d . We believe this approach provides the most accurate estimate in the shortest time.
Offline applications to 13 events that occurred during 2016-2017 and the Ms 8.0 Wenchuan earthquake demonstrated the robustness and feasibility of our proposed methodology. According to the simulated results, the average magnitude estimation error of 0.2 magnitude units can be achieved using a PTW of 2 s for earthquakes with M ≤ 7. For the largest event considered herein, namely, the Ms 8.0 Wenchuan earthquake, a relatively stable magnitude estimate of 7.74 was obtained 12.3 s after the OT. Our approach offers clear advantages over the prediction approach using P d 10km alone. Moreover, we show that the pull-down effects of distant stations with short PTWs can be mitigated by averaging the magnitude estimates based on the lengths of the records available at those stations, thereby reducing the weights of stations with short PTWs. Since the underestimation caused by the third station (051PXZ) was so obvious, for future real-time application, we suggest retaining the previous magnitude estimation results without introducing low-amplitude stations if the event is judged most likely to be a large earthquake based on the near-source stations. It is important to note that the distances used in this study were based on the post-location. Accordingly, the real performance will be affected by the error in locating earthquakes in real-time.
In addition to robustly declaring the event magnitude, it is equally important for an EEW system to send an alert for the largest lead time (i.e., the time interval between the alarm being raised and the arrival of destructive S waves at the target site). With regard to the timeliness, the proposed method needs only to obtain the initial information after P-wave arrivals, following which the method can automatically choose a specific strategy based on pre-established thresholds and magnitude relationships without complex computations; hence, this magnitude estimation method requires little execution time. Typically, the lead time is controlled by the difference in the P-S-wave traveltimes. Assuming a mean station spacing of 20 km within a network ensures that 3 stations are triggered within 5 s of the OT 10 , and our approach will provide a robust magnitude estimate for M < 6.5 earthquakes within 8 s after the OT. Therefore, assuming Vs = 3.5 km/s, the lead time corresponding to a specific hypocentral distance (R) between the epicenter and target will be given by (R/3.5-8) s. A lead time of 6 s is available for a hypocentral distance of 50 km, and 20 s is available at 100 km. Specifically, for the Ms 8.0 Wenchuan earthquake, a relatively stable magnitude estimate of 7.74 can be obtained 12.3 s after the OT by our approach. Note that the region that was severely damaged by the Wenchuan earthquake (Beichuan) was almost 90 km away from the epicenter 31 ; our method would provide the people in Beichuan a lead time of 13.4 s. Indeed, expanding the PTW of near-source stations cannot avoid the inclusion of S-waves; however, stable magnitude estimates were provided by our approach without severe overestimation caused by the contamination of S waves.
The proposed method was conceived to be applied to a regional EEW system and to provide robust source parameters (magnitude) with continuous information from a dense network. Our approach begins to provide a magnitude estimate after the first station is triggered, similar to the single-station, threshold-based onsite approach, which quickly provides approximate warnings on whether the following ground motion will be damaged. Similarly, our approach uses a four-entry decision table composed of τ c and P d 10km thresholds to increase the reliability of the preliminary determination on whether the event is large, and based on the judgment for selectively applying period and amplitude parameters to predict the magnitude at the early stage of earthquake Scientific Reports | (2020) 10:21055 | https://doi.org/10.1038/s41598-020-78046-2 www.nature.com/scientificreports/ occurrence. When continuous waveform data from multiple stations are streamed to the network processing center, our approach combines the information from all available stations and chooses whether to extend the PTW to update the source parameters to provide the most robust warning, similar to a regional (network-based) warning. We believe this method can be widely used in other geographical regions to improve the robustness and timeliness of magnitude estimation. Accurate magnitude estimations closely related to the determination of a potential damage area will help those areas that will not receive alarms due to an underestimated magnitude; this will be conducive to reducing earthquake hazards and providing guiding to the emergency response a few seconds in advance.

Methods
An automatic P-wave picker described by Allen 32 was used to detect P-waves from vertical acceleration records, and the waveforms were checked manually to correct the P-wave arrivals as necessary. The acceleration signals were integrated once (twice) with respect to the velocity (displacement). A four-pole 0.075 Hz high-pass filter was used to remove low-frequency drift in the velocity and displacement. The peak amplitude P d is the maximum absolute value of the vertical displacement in the first several seconds after the P-wave arrival. We applied an extended P-wave time window (2-10 s) to obtain the coefficients for the P d attenuation relationship through least-squares multi-regression analysis: where P d , M and R are the peak displacement (cm), magnitude and hypocentral distance (km), respectively. To retrieve the magnitude dependence of P d , we normalized P d to a reference distance of 10 km to obtain P d 10km and calibrated the linear regression relationship between the event-averaged P d 10km and magnitude. The parameter τ c is defined as follows.
where u(t) and v(t) are the vertical displacement and velocity, respectively. Here, τ 0 is the time window over which τ c is computed, starting at the first P-wave arrival and typically equaling 3 s. We averaged the τ c values for all records for each event and used the least-squares method to fit an event-averaged τ c value; this reduced scatter induced by site conditions and radiation patterns.