Complex System Analysis of Korean Peninsula Earthquake Data

Earthquakes are natural disasters that cause damage across wide regions, and they form a complex system with no clear causal relationship to specific observable factors. This research analyzes earthquake activity on the Korean Peninsula with respect to spatial and temporal factors. Using logarithmic regression analysis, we show that the relationship between the location of earthquakes and their frequency at these locations follows a power law distribution. In addition, we show that since 1998 the average earthquake magnitude has decreased from 3.0143 to 2.5433 while the frequency has risen 3.98-fold. Finally, the spatial analysis reveals that earthquake activity is significantly concentrated in a few particular areas and that earthquake occurrence points have shifted southeast. This research shows the change in earthquake dynamics and the concentration of earthquake activity in particular regions over time. This finding implies the need for further research on spatially informed earthquake policies that account for the change in earthquake dynamics.

Korean Peninsula, four significant deformation episodes occurred in the fault, and the dextral slip faulting in the late Paleogene had a significant effect on the deformation. Due to the deformation of the Yangsan fault, it was temporarily reactivated 42.
This research takes a spatiotemporal perspective. We identify the relationship between the spatial location and the frequency of earthquakes, examine how earthquake activity is distributed in space, and analyze how it changes over time. Spatial data were created from the longitude and latitude of earthquakes that occurred on the Korean Peninsula. While previous studies divided the space into equal areas based on the rock mass, this study creates rectangular grid-sections based on the extreme latitude and longitude points of the earthquakes. We then analyze the relationship between the grid-sections and the frequency of earthquakes within them. In addition, we test whether the Gutenberg-Richter law holds between the magnitude and frequency of earthquakes in each period, and we identify the change in earthquake activity by period to assess the overall change in earthquake activity on the Korean Peninsula.

Results
Analysis of earthquake activities in the Korean Peninsula considering spatial factors. First, to verify that the earthquakes are intensively concentrated in a few areas, the Korean Peninsula was equally divided using a 40 × 40 uniform grid forming 1,600 spatial reference points. Then, the frequency of earthquakes per grid-section and the distribution of each grid-section was verified using logarithmic regression.
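The gridding step can be sketched as follows. This is a minimal Python illustration rather than the authors' R code; the function names, the clamping of points on the upper boundary, and the sample coordinates are all assumptions made for the example, and the bounding box is assumed to have nonzero extent.

```python
from collections import Counter


def grid_cell(lat, lon, lat_min, lat_max, lon_min, lon_max, n=40):
    """Return the (row, col) of the n x n grid cell containing (lat, lon)."""
    # Fractional position of the point inside the bounding box, in [0, 1].
    fy = (lat - lat_min) / (lat_max - lat_min)
    fx = (lon - lon_min) / (lon_max - lon_min)
    # Points lying exactly on the upper edge are clamped into the last cell.
    row = min(int(fy * n), n - 1)
    col = min(int(fx * n), n - 1)
    return row, col


def count_per_cell(events, n=40):
    """Count events per grid cell; `events` is a list of (lat, lon) pairs.

    The bounding box is taken from the extreme latitude and longitude
    of the events themselves, as described in the text.
    """
    lats = [lat for lat, _ in events]
    lons = [lon for _, lon in events]
    lat_min, lat_max = min(lats), max(lats)
    lon_min, lon_max = min(lons), max(lons)
    return Counter(
        grid_cell(lat, lon, lat_min, lat_max, lon_min, lon_max, n)
        for lat, lon in events
    )


# Made-up epicenters for illustration only (not the KMA catalogue).
sample_events = [
    (32.35, 122.8),
    (41.60, 131.1),
    (35.84, 129.21),
    (35.85, 129.22),
    (36.11, 129.37),
]
counts = count_per_cell(sample_events)
occupied_cells = len(counts)  # number of grid-sections with at least one event
```

Cells that never appear in `counts` are the grid-sections without earthquakes, so the occupied-cell count falls out of the same tally.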
More than two-thirds of the Korean Peninsula consists of granite and metamorphic rock. The granite mainly consists of Jurassic and Cretaceous granite, Daebo granite, and Bulguksa granite. Metamorphic rocks are mainly composed of gneiss and include shale, sandstone, and limestone. The primary sedimentary rocks are shale, sandstone, conglomerate, and limestone, and they are mainly distributed in the Gyeongsang Basin, including Gyeongsang-do Province. In the Cretaceous basins, including the Gyeongsang Basin, sediments, volcanic rocks, and tuff appear 43.
Although 1,733 earthquakes greater than M_L 2 occurred on the Korean Peninsula, Fig. 1(a) shows that only 459 of the 1,600 grid-sections experienced more than one earthquake, indicating that the earthquakes were not spatially uniform. Most of the earthquakes observed on the Korean Peninsula occurred in either Gyeongsang-do Province in the southeast or Hwanghae-do Province in the northwest. Gyeongju and Pohang, where the two strongest earthquakes occurred, show particularly frequent earthquake activity. A total of 535 earthquakes were identified in Gyeongsang-do Province, giving it the highest frequency of earthquakes, 31% of the total for the Peninsula. The coefficient of determination (R²) of the logarithmic regression between the grid-sections and the frequency of earthquakes was 0.7674, showing a high correlation between space and frequency. This suggests that the earthquake frequency of each grid-section and the number of grid-sections follow a power law distribution. In other words, the statistics show that even when a large number of earthquakes occur, they are not spatially uniform but tend to concentrate in specific regions.
Table 1. The fit of the logarithmic regression for magnitude and frequency over the entire period was significant, with a p-value below 0.0001.
In other words, the Gutenberg-Richter law held for all earthquakes on the Korean Peninsula. Next, to check whether the Gutenberg-Richter law held over time, we moved a time window across the entire period in three-year steps so that successive windows overlapped. For windows before 1998, the coefficient of determination of the power law distribution between the magnitude and frequency of earthquakes was less than 0.5. In some of these windows the p-value was higher than 0.05, and the Gutenberg-Richter law did not hold. Since 1998, however, the power law distributions have consistently shown coefficients of determination above 0.5. The coefficients of determination for 1978 to 1997 and for 1998 to 2018 were 0.6247 and 0.9025, respectively, indicating a significant difference between the periods. This large difference arises because earthquakes before 1998 occurred with relatively low frequency. The fit of the logarithmic regression model over the entire period was also significant, with a coefficient of determination of 0.9029, which shows a clear power law distribution between magnitude and frequency.
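The windowing procedure can be illustrated with a short Python sketch. The text states only the three-year step, so the window width (`width`), the magnitude bin width (`bin_width`), and the sample events below are hypothetical choices made for this example, not values from the study.

```python
from collections import Counter


def sliding_windows(start, end, width, step=3):
    """Yield inclusive (first_year, last_year) windows advancing by `step` years.

    `width` is an assumed window length in years; the source does not state it.
    """
    first = start
    while first + width - 1 <= end:
        yield first, first + width - 1
        first += step


def magnitude_frequency(events, first_year, last_year, bin_width=0.5):
    """Count events per magnitude bin inside one window.

    `events` is a list of (year, magnitude) pairs. Each bin is labelled by
    its lower edge; `bin_width` is an assumed value.
    """
    counts = Counter()
    for year, mag in events:
        if first_year <= year <= last_year:
            lower_edge = round((mag // bin_width) * bin_width, 2)
            counts[lower_edge] += 1
    return dict(sorted(counts.items()))


# Made-up (year, magnitude) pairs for illustration only.
sample_events = [(1980, 2.3), (1982, 3.1), (1982, 2.4), (1990, 4.0)]

windows = list(sliding_windows(1978, 2018, width=6))
first_window_counts = magnitude_frequency(sample_events, *windows[0])
```

Each window's magnitude-frequency table would then be passed to the logarithmic regression to obtain a coefficient of determination and p-value for that window.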

Analysis of earthquake activity in the Korean
Because the coefficient of determination of the power law differed significantly before and after 1998, we visualized and further analyzed the frequency and magnitude of earthquakes for these two periods.
Even though the observation period was divided in half, earthquakes recorded since 1998 accounted for about 80% of all earthquake activity. This is depicted in Fig. 3. From 1998 to 2018, seismic activity was concentrated in a small area in the southern part of the Korean Peninsula, mainly near Gyeongsang-do Province, confirming that the mean center of earthquake locations has moved southeast. Table 2 shows the results of the t-test used to assess the change in seismic activity on the Korean Peninsula between the two periods. Welch's t-test, which does not assume equal variances, was performed because the equal-variance assumption did not hold for the two periods. The t value was high at 13.782, indicating a statistically significant difference in earthquake magnitude between the periods before and after 1998.

Discussion
In this study, the earthquake phenomena of the Korean Peninsula were analyzed in consideration of spatiotemporal factors. To take spatial factors into account, the Korean Peninsula was divided into 1,600 equal sections using a grid. Logarithmic regression then verified that the relation between the grid-sections where earthquakes occurred and the frequency (number) of earthquakes in those grid-sections follows a power law distribution. In another analysis, the earthquake observation period was divided to consider temporal factors and to determine whether the Gutenberg-Richter law held in each period. As a result, earthquakes occurring after 1998 showed coefficients of determination of the power law distribution 1.7372 times higher than before, and the t-test confirmed that the magnitude of earthquakes occurring after 1998 was smaller than that of earlier earthquakes. In other words, according to the power law distribution between the grid-sections where earthquakes occurred and the frequency of earthquakes in those grid-sections, earthquakes are concentrated in specific regions. Hence, it was confirmed that many earthquakes occurred in Gyeongsang-do Province. In addition, since 1998, the power law distribution between the magnitude and frequency of earthquakes has had a high coefficient of determination, and the average number of earthquakes has increased significantly. This increase comprised a large number of small-scale earthquakes, and a few large-scale earthquakes also occurred. In particular, five of the seven earthquakes of M_L 5 or higher, and 80% of all earthquakes, have occurred since 1998. Moreover, due to the concentration of earthquakes in specific regions, five earthquakes of M_L 5 or higher occurred in Gyeongsang-do Province alone.
As a result, large-scale earthquakes can occur at any time in specific regions in which earthquakes are concentrated.
The earthquake spatial data of this study were generated in rectangular grid-sections based on latitude and longitude. This is different from previous studies that have formed earthquake networks. This study presents ideas from the spatial point of view of the earthquakes. We argue that follow-up studies need to consider spatial factors due to the concentration of earthquakes in specific regions. Moreover, we revealed that the pattern of earthquakes differs significantly over time windows, which may be useful for earthquake-related social policymaking and follow-up studies.

Materials and Methods
This study used earthquake magnitude and location data for the Korean Peninsula from 1978 to June 2018, obtained from the Weather Data Open Portal of the Korea Meteorological Administration. There are a total of 264 earthquake measuring stations in the Republic of Korea, comprising 95 broadband seismometers, 27 short-period seismometers, and 142 accelerometers. The collected data include the date, magnitude, epicenter, and latitude and longitude of each earthquake. Of the 4,107 records, 31 with missing information on location or magnitude were removed. In addition, because the earthquake data from 1978 to 1998 were collected by analog observation of earthquakes with a magnitude greater than M_L 2, only events of M_L 2 or higher were used after 1999, when digital observation was implemented. The Korea Meteorological Administration also suggests M_L 2 as the statistical lower limit for earthquakes on the Korean Peninsula. Therefore, we excluded micro-earthquakes of less than M_L 2 from the dataset. After this filtering, a total of 1,733 records were available and used for this study. Table 3 lists the descriptive statistics of earthquakes that occurred on the Korean Peninsula, including the standard deviation (S.D.) for each value. As only earthquakes of M_L 2 or higher were used, the minimum magnitude was M_L 2, the average M_L 2.64, and the maximum M_L 5.8. The latitude of the earthquake region ranged from N32.35° to N41.60° and the longitude ranged from E122.8° to E131.1°, covering the entire Korean Peninsula.
To consider spatial factors, the entire area of the Korean Peninsula was divided into a uniform grid with 40 rows and 40 columns. This is depicted in Fig. 4.
First, this study analyzed the frequency of earthquake occurrence in each grid-section followed by mapping the frequency of earthquake occurrences in various grid sections. Moreover, this study considered the relationship between the magnitude and frequency of earthquakes by period.
The relationship between the frequency of earthquakes in a grid-section and the number of grid-sections with that frequency was examined by logarithmic regression (Eq. 1). The dependent variable was the number of grid-sections in which earthquakes occurred, and the independent variable was the frequency (total number) of earthquakes in a grid-section. Sections where no earthquake occurred were removed, so only sections with at least one earthquake were used. Second, the analysis of magnitude and frequency, which tests whether the Gutenberg-Richter law holds in each period, was conducted in the same way as the spatial analysis, except that the independent variable was the magnitude of the earthquakes and the dependent variable was the frequency of earthquakes in each time window. Finally, a t-test was performed to detect any overall change in earthquake magnitude over time. The stats package in R version 3.5.1 was used for all analyses.
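The regression step might look like the sketch below, assuming the log-log form log(y) = α + β log(x) that the description of Eq. (1) implies. It is written in plain Python rather than with the R stats package the study used, the helper names are hypothetical, and the data are synthetic points placed exactly on a power law. The base of the logarithm affects only the intercept, not β or R².

```python
import math
from collections import Counter


def sections_per_frequency(cell_counts):
    """Tabulate how many grid-sections share each earthquake count.

    Returns (x, y): x is the sorted distinct counts (independent variable),
    y the number of sections with that count (dependent variable).
    Sections with zero earthquakes are dropped, as in the text.
    """
    tally = Counter(c for c in cell_counts if c > 0)
    xs = sorted(tally)
    return xs, [tally[x] for x in xs]


def log_log_fit(x, y):
    """Ordinary least squares for log10(y) = alpha + beta * log10(x).

    Returns (alpha, beta, r_squared). All x and y must be positive.
    """
    lx = [math.log10(v) for v in x]
    ly = [math.log10(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((a - mean_x) ** 2 for a in lx)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    syy = sum((b - mean_y) ** 2 for b in ly)
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    # For simple linear regression, R^2 equals the squared correlation.
    r_squared = (sxy * sxy) / (sxx * syy)
    return alpha, beta, r_squared


# Synthetic data lying exactly on y = 100 * x**-2 (illustration only).
freq = [1, 2, 4, 5, 10]
n_sections = [100 / f ** 2 for f in freq]
alpha, beta, r2 = log_log_fit(freq, n_sections)
```

On real data the points scatter around the line, and a coefficient of determination such as the reported 0.7674 measures how much of the variance in log(y) the fitted line explains.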
The log-transformed regression and t-test equations used in the two analyses are as follows:

$$\log(y_i) = \alpha + \beta \log(x_i) + \varepsilon_i, \quad i = 1, \dots, n \tag{1}$$

$$t = \frac{\bar{X}_A - \bar{X}_B}{\sqrt{\dfrac{S_A^2}{n_A} + \dfrac{S_B^2}{n_B}}} \tag{2}$$

In Eq. (1), y is the dependent variable and x is the independent variable. α is the regression coefficient that estimates the y intercept, ε is the error term, and n is the number of data points. β is the coefficient that reflects the influence of log(x) and measures the rate of change of y in response to a small relative change in x. In Eq. (2), t is the difference between the means of the two groups divided by the spread of the data calculated from the two groups. A and B denote the two groups, S is the standard deviation, X is the average, and n_A and n_B are the group sizes.
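As an illustration of the t-test, the sketch below computes Welch's statistic together with the Welch-Satterthwaite degrees of freedom, assuming Eq. (2) takes the usual Welch form with group A as the earlier period. It uses only the Python standard library rather than the R routine used in the study, and the magnitudes are made-up values, not catalogue data.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    `a` and `b` are samples that may have unequal variances and sizes.
    """
    na, nb = len(a), len(b)
    # Squared standard error contributed by each group: S^2 / n.
    va = variance(a) / na
    vb = variance(b) / nb
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Made-up magnitudes for two periods (illustration only).
before_1998 = [3.2, 2.9, 3.4, 3.0, 2.8]
since_1998 = [2.4, 2.6, 2.5, 2.7, 2.3, 2.6]
t_value, dof = welch_t(before_1998, since_1998)
```

A positive t here means the first group has the larger mean, which matches the direction of the reported result (a higher average magnitude before 1998).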