Can Subtle Shifts Predict the Big One? A New Approach to Earthquake Forecasting

Author: Denis Avetisyan


A new study explores whether deep learning, combined with analysis of the Gutenberg-Richter b-value, can offer a marginal improvement in predicting earthquake occurrences.

Model rescaling is evaluated using the Brier Skill Score (BSS) as a function of a multiplicative scaling factor and the mean number of earthquakes <span class="katex-eq" data-katex-display="false">\overline{n_{\mathrm{eq}}}</span>. Empirical BSS-based rescaling, obtained by fitting along the ridge of positive skill, and logit-based prior correction consistently outperform uncorrected methods; the latter is further refined by an offset calibrated to optimize BSS. This is evidenced by comparative performance across both training and independent test epochs, and substantiated by the distribution of <span class="katex-eq" data-katex-display="false">\overline{n_{\mathrm{eq}}}</span> for all space-time samples and for those culminating in earthquakes with <span class="katex-eq" data-katex-display="false">M_{W} \geq 5</span>.

Deep learning analysis of spatiotemporal b-value variations shows limited but measurable improvement over baseline earthquake forecasting models.

Despite persistent challenges in reliably predicting earthquakes, subtle variations in seismic activity may hold predictive potential. This study, ‘Probabilistic and Alarm-Based Evaluation of a b-Value-Driven Deep Learning Earthquake Forecast’, rigorously assesses a deep learning model utilizing spatiotemporal changes in the Gutenberg-Richter b-value to forecast earthquake occurrence. Results demonstrate marginal, yet consistent, improvements in forecasting skill, evaluated via Brier Skill Scores and alarm-based metrics, beyond a simple spatial base rate, suggesting that b-value variations contain a limited but detectable signal. Can continued refinement of these data-driven approaches, combined with physics-based modeling, ultimately lead to a significant advancement in probabilistic earthquake forecasting?


The Persistent Challenge of Seismic Prediction

The pursuit of reliable earthquake forecasting represents a persistent and formidable challenge within geophysics. While statistical methods, such as analyses of earthquake frequency and magnitude, have long served as foundational tools, their predictive power remains limited by the inherent complexity of Earth’s crust. These traditional approaches often struggle to differentiate between routine seismic activity and the subtle precursors that might indicate an impending large-scale event. The difficulty lies not in simply registering earthquakes – seismographs excel at this – but in identifying meaningful patterns within the seemingly random distribution of seismic events and accurately assessing the probability of a future rupture, a task complicated by the vast number of variables influencing fault behavior and the limited historical record of large earthquakes in many regions.

The Gutenberg-Richter Relation, a cornerstone of seismology, establishes a logarithmic relationship between the magnitude and total number of earthquakes, offering a foundational understanding of seismic activity. However, this statistical regularity, while useful for characterizing overall earthquake frequency, proves insufficient when attempting to predict specific events in both space and time. Seismicity isn’t merely a random process governed by probability; it’s influenced by a complex interplay of factors including stress accumulation along fault lines, the precise geometry of those faults, pore fluid pressure, and even subtle precursory signals. These dynamic elements create patterns that deviate significantly from the simple statistical predictions offered by the Gutenberg-Richter Relation, necessitating more sophisticated models capable of capturing the nuanced spatiotemporal evolution of earthquake sequences and addressing the inherent limitations of relying solely on historical earthquake counts.
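As a concrete illustration, the Gutenberg-Richter Relation takes the form <span class="katex-eq" data-katex-display="false">\log_{10} N(\geq M) = a - bM</span>, and the b-value can be estimated from a catalog with Aki's (1965) maximum-likelihood estimator. The sketch below is illustrative only (not the study's pipeline) and assumes continuous magnitudes above a completeness threshold, drawn from a synthetic exponential catalog:

```python
import math
import random

def aki_b_value(mags, m_c):
    """Aki (1965) maximum-likelihood b-value estimate from magnitudes
    at or above the completeness magnitude m_c (continuous magnitudes)."""
    above = [m for m in mags if m >= m_c]
    mean_m = sum(above) / len(above)
    return math.log10(math.e) / (mean_m - m_c)

# Synthetic catalog with a true b-value of 1.0: under Gutenberg-Richter,
# magnitude excesses above m_c are exponentially distributed with rate b * ln(10).
random.seed(0)
m_c = 2.0
rate = 1.0 * math.log(10)
mags = [m_c + random.expovariate(rate) for _ in range(50_000)]
print(round(aki_b_value(mags, m_c), 2))   # close to the true value 1.0
```

Real catalogs report binned magnitudes, in which case the estimator is usually adjusted with a half-bin correction to m_c; the continuous form above is the simplest variant.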

The 2011 Tohoku Earthquake Sequence served as a stark reminder of the inadequacies of earthquake prediction models heavily reliant on past seismic activity. Prior to the event, the region wasn’t considered exceptionally high-risk based on historical data alone, yet it experienced a magnitude 9.0 earthquake and subsequent devastating tsunami. This sequence demonstrated that infrequent, large-magnitude events can defy predictions based on patterns observed in more common, smaller earthquakes. Consequently, research shifted towards incorporating real-time data – including subtle ground deformations, changes in groundwater levels, and electromagnetic signals – to develop more comprehensive forecasting tools. The challenge now lies in integrating these diverse datasets with historical records and advanced computational modeling to identify precursory signals and ultimately improve the accuracy and reliability of earthquake predictions.

Earthquake triggering probabilities, derived from an ETAS model and visualized with a time-magnitude plot, highlight events above an inversion threshold (colored by logits transformation) and those below it (shown in black with low opacity).

Deep Learning: Unveiling Non-Linear Seismic Relationships

Deep learning methods are applied to seismic data analysis to identify non-linear relationships and intricate patterns often missed by conventional statistical approaches. Traditional seismic analysis relies heavily on techniques like spectral analysis and time-series modeling, which assume data linearity and Gaussian distributions. However, seismicity is a complex phenomenon exhibiting inherent non-linearity and influenced by numerous interacting factors. Deep learning, particularly through the use of artificial neural networks, can model these complexities without explicit assumptions about the underlying data distribution, enabling the discovery of subtle indicators of seismic activity and improved characterization of subsurface structures. This capability extends beyond simple event detection to include tasks such as automated phase picking, earthquake location refinement, and the identification of previously unrecognized fault systems.

The system’s architecture leverages Convolutional Neural Networks (CNNs) due to their proven efficacy in spatial data analysis, specifically identifying patterns within seismic imagery and maps. To incorporate temporal information, the CNNs are integrated with a Waveform Analysis module which processes raw seismic waveform data. This module extracts features such as arrival times, amplitudes, and frequency content, converting them into quantifiable parameters. These parameters are then concatenated with the spatial data, providing the CNN with a comprehensive dataset that captures both where seismic events occur and the characteristics of the corresponding seismic waves. This combined approach enables the model to identify subtle correlations between spatial patterns and waveform features, improving the accuracy of seismic interpretation.

The Hybrid Convolutional Architecture employed in our seismic analysis model combines 2D convolutional layers for spatial feature extraction with 1D convolutional layers processing temporal sequences of waveform data. This design allows the model to simultaneously analyze the spatial distribution of seismic events and the temporal evolution of individual waveforms. Specifically, 2D convolutions identify patterns in event locations – clustering, alignment with geological features – while 1D convolutions analyze the time-series data from each event, capturing characteristics like arrival times, amplitudes, and frequency content. The outputs of these parallel convolutional branches are then fused, enabling the model to learn relationships between spatial patterns and temporal waveform characteristics within seismicity fields, improving event detection and characterization.
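As an illustrative sketch of this parallel-branch idea (not the authors' implementation), the toy NumPy example below runs a 2D convolution over a hypothetical spatial b-value map and a 1D convolution over a hypothetical b-value time series, then fuses the two pooled branch features into a single probability-like output. All shapes, kernels, and names are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d_valid(img, kern):
    """Naive 'valid' 2D cross-correlation (stand-in for a 2D conv layer)."""
    kh, kw = kern.shape
    out_h, out_w = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kern)
    return out

def conv1d_valid(sig, kern):
    """Naive 'valid' 1D cross-correlation (stand-in for a 1D conv layer)."""
    k = len(kern)
    return np.array([np.dot(sig[i:i + k], kern) for i in range(len(sig) - k + 1)])

# Spatial branch: 16x16 b-value map patch -> 2D conv -> pooled scalar feature
b_map = rng.normal(1.0, 0.1, size=(16, 16))
spatial_feat = conv2d_valid(b_map, rng.normal(size=(3, 3))).mean()

# Temporal branch: b-value time series -> 1D conv -> pooled scalar feature
b_series = rng.normal(1.0, 0.1, size=64)
temporal_feat = conv1d_valid(b_series, rng.normal(size=5)).mean()

# Fusion: concatenate the branch features, apply a linear head and a sigmoid
fused = np.array([spatial_feat, temporal_feat])
weights = rng.normal(size=2)
prob = 1.0 / (1.0 + np.exp(-(fused @ weights)))   # probability-like output
print(prob)
```

A trained model would of course learn the kernels and head weights rather than sample them randomly; the point here is only the structure of the fused spatial-temporal branches.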

Progressive Training: Adapting to Evolving Seismic Patterns

The forecasting model employs a progressive training scheme, meaning the model is not trained on a static dataset but is continuously updated as new earthquake data becomes available. This is achieved through a rolling window approach, where the model is retrained periodically using the most recent data, effectively incorporating new information and adapting to evolving seismic patterns. Each retraining cycle utilizes a defined window of historical data, and the model’s weights are adjusted to minimize prediction errors on this window. This continuous refinement process allows the model to improve its predictive accuracy over time and maintain relevance in a dynamic geological environment. The frequency of retraining and the size of the rolling window are determined through hyperparameter optimization to balance model responsiveness and stability.
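A minimal sketch of such a rolling-window retraining loop, in plain Python with illustrative names and a trivial stand-in "model" (the study's actual retraining schedule and model are far more involved):

```python
from collections import deque

def progressive_training(stream, window_size, retrain_every, fit):
    """Rolling-window retraining sketch: consume time-ordered samples,
    keep only the most recent `window_size`, and refit periodically.
    All names here are illustrative, not taken from the study."""
    window = deque(maxlen=window_size)   # old samples fall out automatically
    models = []
    for t, sample in enumerate(stream, start=1):
        window.append(sample)
        if t % retrain_every == 0 and len(window) == window_size:
            models.append(fit(list(window)))   # retrain on recent data only
    return models

# Toy usage: the "model" is just the event rate over the current window.
data = [(i, i % 3 == 0) for i in range(100)]          # (feature, label) pairs
fit = lambda w: sum(label for _, label in w) / len(w)
models = progressive_training(iter(data), window_size=20, retrain_every=10, fit=fit)
print(len(models))   # 9: one refit at t = 20, 30, ..., 100
```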

The forecasting model incorporates b-value fields, calculated from data in the ISC Earthquake Catalog, to represent spatial seismicity patterns. The b-value is the slope of the Gutenberg-Richter frequency-magnitude relation: it quantifies the relative proportion of small to large earthquakes in a region, rather than the raw rate of activity. Specifically, the ISC catalog, a globally comprehensive record of earthquake events, is used to estimate the b-value from events within a specified radius around each grid point. Because spatiotemporal variations in the b-value are thought to reflect changes in the local stress regime, these fields serve as key input features for the forecasting algorithm, allowing the model to leverage the historical record to assess the relative likelihood of future events based on spatial context.

The forecasting model generates an Anomaly Score, a dimensionless value intended to represent the relative likelihood of seismic activity within a defined spatial and temporal window. This raw Anomaly Score is then subjected to a Logit Transformation, the function <span class="katex-eq" data-katex-display="false">f(x) = \log\left(\frac{x}{1-x}\right)</span>, which maps the bounded score onto an unbounded scale and improves interpretability. The Logit Transformation effectively converts the probability-like Anomaly Score into log-odds, enhancing the model’s ability to discriminate between low- and high-risk areas and facilitating its integration with other probabilistic hazard assessments.
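A minimal sketch of the transform, with a small epsilon clip added (an assumption on our part, not stated in the study) to keep scores of exactly 0 or 1 finite:

```python
import math

def logit(p, eps=1e-6):
    """Map a probability-like score in (0, 1) to log-odds; scores are
    clipped away from 0 and 1 (an illustrative choice) to stay finite."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))

print(logit(0.5))   # 0.0: even odds
print(logit(0.9))   # about 2.197: strongly elevated odds
```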

The Brier Skill Score (BSS) map, indicating forecast accuracy, excludes areas with no earthquake activity (black) and highlights locations of <span class="katex-eq" data-katex-display="false">M_{W} \geq 5</span> earthquakes in the test set, with overall mean BSS values reported in the top left.

Evaluating Forecast Accuracy and Assessing Broader Implications

The accuracy of probabilistic earthquake forecasts hinges on robust evaluation metrics, and the Brier Score serves as a critical tool in this assessment. This score quantifies the difference between predicted probabilities and observed outcomes, effectively measuring the reliability of a forecast. Unlike simple yes-or-no accuracy, the Brier Score considers the confidence of each prediction; a forecast consistently assigning high probability to events that don’t occur is penalized more heavily than one with lower confidence but greater overall correctness. A lower Brier Score indicates a more accurate probabilistic forecast, providing a nuanced understanding of predictive skill beyond basic hit rates and false alarms. Its widespread adoption in meteorological and climatological forecasting underscores its value, and its application to earthquake prediction allows for meaningful comparisons between different forecasting methodologies and a clearer picture of their potential for mitigating seismic risk.
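In code, the Brier Score and the derived Brier Skill Score against a reference forecast can be sketched as follows (toy numbers, not the study's data):

```python
import numpy as np

def brier_score(probs, outcomes):
    """Mean squared difference between forecast probabilities and outcomes."""
    probs, outcomes = np.asarray(probs, float), np.asarray(outcomes, float)
    return float(np.mean((probs - outcomes) ** 2))

def brier_skill_score(probs_model, probs_ref, outcomes):
    """BSS = 1 - BS_model / BS_ref; positive values beat the reference."""
    return 1.0 - brier_score(probs_model, outcomes) / brier_score(probs_ref, outcomes)

outcomes = [0, 0, 1, 0, 1]                 # earthquake / no earthquake per cell
probs_ref = [0.4] * 5                      # base-rate reference (2 events in 5 cells)
probs_model = [0.1, 0.2, 0.7, 0.1, 0.6]    # a sharper toy forecast
print(round(brier_skill_score(probs_model, probs_ref, outcomes), 4))   # 0.7417
```

The study's reported BSS values near zero correspond to a model only marginally sharper than its reference, in contrast to the deliberately easy toy case above.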

Molchan diagrams offer a compelling visual assessment of alarm-based earthquake forecasting skill, moving beyond simple accuracy metrics. These diagrams plot the percentage of earthquakes captured against the fraction of alarms issued, revealing the trade-off between maximizing earthquake detection and minimizing false alarms. A truly skillful forecast will exhibit a curve that lies consistently above the baseline of random chance, indicating a higher probability of capturing events for any given alarm rate. By analyzing the area under the curve, researchers can quantify the overall performance of the forecasting model and compare it to other approaches or benchmark forecasts, providing a nuanced understanding of its reliability and practical utility in earthquake early warning systems.
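The trade-off a Molchan diagram plots can be sketched as follows: rank cells by forecast score, declare alarms in the top fraction τ of cells, and record the miss rate among observed events (illustrative synthetic data, not the study's):

```python
import numpy as np

def molchan_miss_rates(scores, events, alarm_fractions):
    """For each alarm fraction tau, declare alarms in the top tau of cells
    ranked by forecast score and return the miss rate among observed events."""
    scores, events = np.asarray(scores, float), np.asarray(events, bool)
    order = np.argsort(scores)[::-1]               # highest scores first
    miss_rates = []
    for tau in alarm_fractions:
        n_alarm = max(1, int(round(tau * scores.size)))
        alarmed = np.zeros(scores.size, dtype=bool)
        alarmed[order[:n_alarm]] = True             # nested alarm sets as tau grows
        hits = np.sum(alarmed & events)
        miss_rates.append(1.0 - hits / events.sum())
    return np.array(miss_rates)

# Synthetic data in which events are only weakly related to the scores.
rng = np.random.default_rng(1)
scores = rng.random(1000)
events = rng.random(1000) < 0.05 * (1.0 + scores)
miss_rates = molchan_miss_rates(scores, events, [0.01, 0.05, 0.20])
print(miss_rates)   # non-increasing: more alarms capture more events
```

Because the alarm sets are nested, the miss rate can only fall as τ grows; a skillful forecast pushes the whole curve below the random-guessing diagonal 1 − τ.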

Evaluation of the deep learning model reveals a Brier Skill Score (BSS) of 0.000682 when averaged across all assessed grid cells. This metric quantifies the model’s predictive ability relative to a reference forecast based solely on the spatial frequency of earthquakes, in which each grid cell is assigned its long-term rate of activity. While the achieved BSS is a modest improvement, it indicates that the model, despite its complexity, provides some incremental skill in forecasting earthquake probabilities beyond this spatial base rate.

The deep learning model demonstrated a nuanced predictive capability when evaluated specifically across regions that experienced magnitude 5 or greater earthquakes during the testing phase, achieving a Brier Skill Score (BSS) of 0.000197. This localized performance metric suggests the model possesses a slight ability to refine predictions in seismically active zones, even if the overall improvement averaged across all grid cells remains marginal. While a BSS close to zero indicates limited skill, this value highlights the potential for further optimization tailored to areas with higher earthquake frequency, offering a pathway towards more effective forecasting in critical regions. The score signifies a small, yet measurable, advantage over simply predicting earthquakes based on their overall spatial occurrence rate.

The forecasting model demonstrates a quantifiable ability to identify a portion of significant seismic events, specifically magnitude 5 or greater earthquakes. When triggered only 1% of the time – representing a conservative alarm rate – the model successfully flags 5.88% of these events. Increasing the alarm fraction to 5% – meaning the model issues alerts more frequently – enhances its capture rate to 15.29%. This relationship between alarm frequency and event capture is crucial for practical application, suggesting a trade-off between minimizing false alarms and maximizing the detection of potentially damaging earthquakes, and highlighting the model’s capacity to provide actionable, albeit imperfect, warnings.

The Molchan diagram for Model 4.9 demonstrates alarm rates of 1% and 5%, alongside the area under the curve (AUC) with corresponding <span class="katex-eq" data-katex-display="false">95%</span> confidence intervals.

The pursuit of earthquake forecasting, as detailed in this study, demands rigorous evaluation metrics. While deep learning models offer a potential avenue for improvement by analyzing spatiotemporal variations in the Gutenberg-Richter b-value, the probabilistic skill remains constrained. This echoes a fundamental truth about predictive endeavors: mere correlation is insufficient. As Galileo Galilei observed, “You cannot teach a man anything; you can only help him discover it himself.” The model doesn’t create forecasting ability; it reveals patterns within the data, highlighting the enduring need for mathematical discipline in interpreting the chaos of seismic activity and validating any proposed predictive capability. The Brier Score and Molchan Diagram, employed in the study, serve as crucial tools in this process of discovery.

The Road Ahead

The marginal gains observed in forecasting skill, while statistically demonstrable, compel a rigorous re-evaluation of the underlying assumptions. The pursuit of improved earthquake prediction is not merely an exercise in statistical refinement; it is a search for the deterministic signals obscured within a fundamentally chaotic system. The b-value, as a proxy for stress regime, remains an inherently coarse descriptor. Future work must prioritize models capable of incorporating higher-order spatial and temporal dependencies, ideally grounded in physically plausible representations of fault mechanics, rather than relying on purely data-driven approaches.

A critical path forward lies in minimizing algorithmic redundancy. The current paradigm often favors complexity in the hope of capturing subtle nuances, but each added parameter introduces a potential for overfitting and abstraction leaks. A truly elegant solution will likely be parsimonious, deriving maximum predictive power from minimal input – a difficult, but not impossible, goal. The Molchan diagram, in its stark depiction of forecasting limits, serves as a constant reminder of the inherent difficulty, and the need for intellectual honesty in evaluating progress.

Ultimately, the value of this research resides not in achieving definitive prediction, but in forcing a more precise articulation of the problem itself. The quest for earthquake forecasting is, at its core, a mathematical challenge – a test of humanity’s ability to extract order from noise. Further exploration must relentlessly pursue solutions that are not merely ‘good enough’, but demonstrably, provably, correct.


Original article: https://arxiv.org/pdf/2603.03079.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/


2026-03-04 10:30