Machine learning and earthquake forecasting—next steps

A new generation of earthquake catalogs developed through supervised machine learning illuminates earthquake activity with unprecedented detail. Application of unsupervised machine learning to analyze the more complete expression of seismicity in these catalogs may be the fastest route to improving earthquake forecasting.

The working hypothesis is that the behavior of the small earthquakes newly uncovered in AI-based catalogs will inform earthquake forecasting for events of all magnitudes. The observed scale invariance of earthquake behavior suggests this is a reasonable expectation.
Empirical seismological relationships have played a key role in the development of earthquake forecasting. These include Omori's law3, which describes the temporal decay of aftershock rate; the Gutenberg-Richter magnitude-frequency distribution, with the b-value describing the relative numbers of small vs. large earthquakes4; and the Epidemic Type Aftershock Sequence (ETAS) model5, in which earthquakes are treated as a self-exciting process governed by Omori's law for their frequency of occurrence and Gutenberg-Richter statistics for their magnitude. These empirical laws continue to prove their utility. Just in the past few years, the time dependence of the b-value has been used to try to anticipate the likelihood of large earthquakes during an ongoing earthquake sequence6, and the ETAS model has been improved to better anticipate future large events7. So there appears to be value in applying these longstanding relationships to improved earthquake catalogs, but in our opinion much more needs to be done.
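For orientation, these relationships are commonly written in the following standard forms; the notation is the conventional one, with the symbols denoting fitted parameters rather than values taken from the cited studies.

```latex
% Modified Omori (Omori--Utsu) law: aftershock rate n(t) at time t after a mainshock
n(t) = \frac{K}{(t + c)^{p}}

% Gutenberg--Richter magnitude-frequency distribution:
% N(\geq M) is the number of earthquakes with magnitude at least M
\log_{10} N(\geq M) = a - b\,M

% ETAS conditional intensity: a background rate \mu plus Omori-type triggering from
% every prior event i, with productivity growing exponentially with magnitude M_i
% (M_c is the reference, or cutoff, magnitude of the catalog)
\lambda(t) = \mu + \sum_{i \,:\, t_i < t} \frac{K\, e^{\alpha\,(M_i - M_{c})}}{(t - t_i + c)^{p}}
```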
The relationships cited above date from 127, 77, and 33 years ago. The oldest of them, Omori's law, was developed from felt reports, without the benefit of instrumental measurements. We suggest that a fresh approach using more powerful techniques is warranted. Earthquake catalogs are complex, high-dimensional objects, and as Fig. 1 makes clear, that is even more true of the deeper catalogs being developed through machine learning. Their high dimensionality makes them challenging for seismologists to explore, and the conventional approaches noted above seem unlikely to take full advantage of the wealth of new information in this new generation of deeper catalogs. We suggest that, having first enabled the development of these catalogs, the statistical-learning techniques of data science are now poised to play an important role in uncovering new relationships within them. The obvious next step is to apply the techniques of machine learning in discovery mode8 to discern new relationships encoded in the seismicity.
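As a minimal sketch of what such discovery-mode analysis might look like in practice, the example below clusters events from a deep catalog without any prior labels. The file name, column names, feature choices, and algorithm settings are hypothetical placeholders, not taken from the catalogs or studies cited above.

```python
# Minimal sketch of unsupervised, discovery-mode exploration of a deep earthquake catalog.
# The file name, column names, and feature choices below are hypothetical placeholders.
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import DBSCAN

# Hypothetical catalog: one row per detected event.
catalog = pd.read_csv("catalog.csv")  # assumed columns: time, latitude, longitude, depth_km, magnitude
catalog["time"] = pd.to_datetime(catalog["time"])
catalog = catalog.sort_values("time")

# Simple per-event features; richer descriptors (local b-values, nearest-neighbor
# distances, waveform embeddings, ...) could be substituted here.
features = pd.DataFrame({
    "latitude": catalog["latitude"],
    "longitude": catalog["longitude"],
    "depth_km": catalog["depth_km"],
    "magnitude": catalog["magnitude"],
    # log of inter-event time in seconds, as a crude clustering-in-time descriptor
    "log_dt": np.log10(catalog["time"].diff().dt.total_seconds().clip(lower=1.0)),
}).dropna()

# Standardize, reduce dimensionality, and cluster without prior labels.
X = StandardScaler().fit_transform(features)
X_low = PCA(n_components=3).fit_transform(X)
labels = DBSCAN(eps=0.5, min_samples=20).fit_predict(X_low)

# Groups with label >= 0 are candidate event families (swarms, aftershock sequences,
# repeaters) to be inspected by a seismologist; -1 marks unclustered background events.
print(pd.Series(labels).value_counts())
```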
There are tantalizing indications that such an approach may lead to new insights. In double-direct-shear experiments, background signals that were thought to be uninformative random noise have instead been shown to encode information on the state of friction and the eventual time of failure of faults in a laboratory setting9. Well-controlled laboratory analogs to faults lack the geologic complexity of the Earth, yet weak natural background vibrations of a similar sort, which again were thought to be random noise, have been shown to embody information that can be used to predict the onset time of slow slip events in the Cascadia subduction zone10. Finally, unsupervised deep learning, in which algorithms are used to discern patterns in data without the benefit of prior labels, has been applied to seismic waveform data to uncover precursory signals preceding the large and damaging 2017 landslide and tsunami in Greenland11.
These examples are compelling but come with the caveat that they are not representative of the typical fast-rupture-velocity earthquakes on tectonic faults that are of societal concern. For such earthquakes, however, there are also indications from state-of-the-art forecasting approaches that next-generation earthquake catalogs may contain information that will lead to progress. Physics-based forecasting models, which account for the changes in Coulomb failure stress due to antecedent earthquakes that favor the occurrence of subsequent earthquakes (the standard form is given at the end of this section), have shown increasing skill, such that they are competitive with, and beginning to outperform, statistical models. Coulomb failure models benefit particularly from deeper catalogs because those catalogs include many more small-magnitude earthquakes. These small earthquakes add predictive power through their secondary triggering effects, tracking the evolution of the fine-scale stress field that ultimately controls earthquake nucleation in foreshock and aftershock sequences. They can also be used to define the emerging active structures that comprise fault networks and, in doing so, clarify the relevant components of stress that would act to trigger earthquakes12. Secondary triggering and background stress heterogeneity were shown to improve stress-triggering models13, but were most effective when they incorporated near-real-time aftershock data from the sequence as it unfolded14. We note that there is no reason why more complete earthquake catalogs, developed with pre-trained neural network models, cannot be created in real time as an earthquake sequence unfolds. Finally, despite the disappointing history of the search for precursors, due diligence requires that seismologists consider the pursuit of signals that might be precursory.

We conclude that it is now possible to image the activity on active fault systems with unprecedented spatial resolution. This will enable experimentation with familiar hypotheses and the formulation of new ones. It seems certain that the underlying processes that drive earthquake occurrence are encoded in this next generation of earthquake catalogs, but we may not find them unless we put new effort into searching for them. Unsupervised learning methods15 are particularly well-suited tools for that effort.
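For reference, the Coulomb failure stress change invoked by the physics-based models discussed above is conventionally written as follows; this is the standard textbook definition, not a formulation specific to refs. 12-14.

```latex
% Coulomb failure stress change resolved on a receiver fault
\Delta\mathrm{CFS} = \Delta\tau + \mu'\,\Delta\sigma_n
% \Delta\tau      : shear stress change in the slip direction of the receiver fault
% \Delta\sigma_n  : normal stress change on the fault plane (positive for unclamping)
% \mu'            : effective coefficient of friction (pore-pressure effects folded in)
% \Delta\mathrm{CFS} > 0 brings the receiver fault closer to failure; < 0 inhibits it
```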