News related to Science, Technology, Environment, Agriculture and Medicine in India

Predicting Induced Seismicity

Accuracy of machine learning 

There have been many minor earthquakes in the Groningen Gas Field, a natural gas reservoir in the Netherlands. These are attributed to stress changes during natural gas extraction that lead to geodynamic rearrangements. It is important to predict such induced seismicity, lest the exploitation of natural gas or other resources lead to destructive earthquakes.

To predict such events, two machine learning methods have been reported: logistic regression and deep learning using convolutional neural networks. But results from the two kinds of model cannot be compared directly, because the existing studies used different datasets drawn from the seismic detection network in the region.

So Akshat Goel of Rocket Learning, Delhi, collaborated with Denise Gorse from the UK to overcome the problem. They used the raw seismic data and metadata of 2300 earthquake events and 4000 non-events in the region.

The seismograms, each covering a 30-second window, were pre-processed and down-sampled for compatibility. The duo split the data in a 60:20:20 ratio for training, validation and testing.
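
As a rough illustration of the 60:20:20 split, here is a minimal Python sketch. The function name `split_indices`, the fixed random seed and the plain (non-stratified) shuffle are assumptions made for this example; the report does not say how the researchers assigned samples to each part.

```python
import random


def split_indices(n_samples, ratios=(60, 20, 20), seed=0):
    """Shuffle sample indices and cut them into train/validation/test parts.

    The ratios are percentages. Integer arithmetic avoids rounding surprises.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # hypothetical fixed seed, for repeatability
    n_train = n_samples * ratios[0] // 100
    n_val = n_samples * ratios[1] // 100
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # whatever remains goes to the test part
    return train, val, test


# Sample counts taken from the report: 2300 events and 4000 non-events.
n_events, n_non_events = 2300, 4000
train_idx, val_idx, test_idx = split_indices(n_events + n_non_events)
print(len(train_idx), len(val_idx), len(test_idx))
```

With 6300 samples in total, this gives 3780 for training and 1260 each for validation and testing.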

For their logistic regression model, they initially selected the same four uncorrelated, statistically derived features used in the earlier logistic regression model. These features had been discovered using the Highly Comparative Time Series Analysis package, which compares time series data. A correlation matrix revealed the relationships, and the strengths of the correlations, between the variables in the dataset.
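
To show how a correlation matrix can help pick out uncorrelated features, here is a small Python sketch. The feature names, the toy values and the 0.9 threshold are all hypothetical. The greedy selection rule is one simple possibility, not the procedure used in the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)  # assumes neither column is constant


def correlation_matrix(columns):
    """Matrix of pairwise correlations between feature columns."""
    return [[pearson(a, b) for b in columns] for a in columns]


def select_uncorrelated(columns, max_abs_r=0.9):
    """Keep a feature only if it is weakly correlated with those already kept."""
    kept = []
    for j, column in enumerate(columns):
        if all(abs(pearson(column, columns[k])) < max_abs_r for k in kept):
            kept.append(j)
    return kept


# Hypothetical feature values, one list per feature.
names = ["f_a", "f_b", "f_c"]
columns = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.0],   # exact multiple of f_a, so redundant
    [1.0, -1.0, 0.0, -1.0, 1.0],  # unrelated to f_a
]
matrix = correlation_matrix(columns)
kept = select_uncorrelated(columns)
print([names[j] for j in kept])
```

Here `f_b` is dropped because it carries the same information as `f_a`, while `f_c` is kept.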

But when they built a four-input model, as in the earlier work, they noted that the Highly Comparative Time Series Analysis package provides more than 7000 features, of which the earlier logistic regression model had used only four!

With Catch-22, a MATLAB package containing the 22 most useful features from the Highly Comparative Time Series Analysis package, they identified four more key earthquake-detecting features using elastic net regularisation. These features were added to the initial logistic regression model.
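
Elastic net regularisation combines an L1 penalty, which pushes the weights of unhelpful features to exactly zero, with an L2 penalty, which shrinks the remaining weights. The Python sketch below fits a tiny logistic regression with this penalty by proximal gradient descent. It is an illustration only: the toy data and the values of `lam`, `l1_ratio`, `lr` and `n_iter` are assumptions, and the researchers worked in MATLAB, not with this code.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def soft_threshold(value, threshold):
    """Proximal step for the L1 penalty: shrink towards zero, clip small values."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_elastic_net_logistic(X, y, lam=0.1, l1_ratio=0.5, lr=0.1, n_iter=2000):
    """Logistic regression with an elastic net penalty (hypothetical settings)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(d):
                grad_w[j] += err * xi[j] / n
        b -= lr * grad_b  # the bias is not penalised
        for j in range(d):
            # Gradient step on log-loss plus the L2 part, then the L1 proximal step.
            step = w[j] - lr * (grad_w[j] + lam * (1.0 - l1_ratio) * w[j])
            w[j] = soft_threshold(step, lr * lam * l1_ratio)
    return w, b


# Toy data: feature 0 separates the classes, feature 1 carries no information.
X = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
y = [1, 1, 0, 0]
w, b = fit_elastic_net_logistic(X, y)
selected = [j for j, wj in enumerate(w) if wj != 0.0]
print(selected)
```

Features whose weights survive the penalty are the ones worth keeping. In this toy case only feature 0 is selected.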

They compared the performance of their feature-enhanced model with that of the original logistic regression model. The model with eight features performed better than the one with four features.

The researchers then compared the enhanced logistic regression model with the deep learning model, which uses a convolutional neural network. They found that, at the signal-to-noise ratio of the earlier work, the performance of the enhanced logistic regression model was comparable to that of the deep learning model and displayed no false negative errors. At lower signal-to-noise ratios, the number of false positive errors made by the logistic regression model increased, but the number of undetected earthquakes remained zero.
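
A false negative is an earthquake that the model misses, and a false positive is a non-event that it flags as an earthquake. The Python sketch below counts both from a list of true labels and predictions. The labels are made up for illustration and are not results from the study.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives (1 = earthquake, 0 = non-event)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            counts["tp"] += 1
        elif truth == 0 and pred == 1:
            counts["fp"] += 1  # non-event wrongly flagged as an earthquake
        elif truth == 0 and pred == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1  # earthquake that went undetected
    return counts


def recall(counts):
    """Fraction of real earthquakes that were detected."""
    return counts["tp"] / (counts["tp"] + counts["fn"])


def false_positive_rate(counts):
    """Fraction of non-events wrongly flagged as earthquakes."""
    return counts["fp"] / (counts["fp"] + counts["tn"])


# Made-up labels: every earthquake is caught, one non-event is flagged.
y_true = [1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 0]
counts = confusion_counts(y_true, y_pred)
print(counts, recall(counts), false_positive_rate(counts))
```

In this toy case the recall is 1.0, meaning no undetected earthquakes, while a quarter of the non-events are false alarms.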

Though the convolutional neural network used 283,700 free parameters, the logistic regression model, which used only eight features, performed better at the highest signal-to-noise ratio and had the potential to perform competitively at lower ratios.

“A logistic regression model is more easily interpretable. In some situations, it could prove more useful than deep learning that works like a black box, preventing insights into why decisions were made,” says Denise Gorse, University College London.

“If we identify more relevant and suitable input features to use in the logistic regression model, perhaps the performance will become even better,” adds Akshat Goel, Rocket Learning – Ekho Foundation, Delhi. 

The demonstrated potential of logistic regression models for predicting induced seismicity due to gas extraction may make it easier to predict induced seismicity due to other mechanisms, such as rain-induced seismicity in several parts of India and reservoir-induced seismicity in the Koyna-Warna region. Indian researchers now need to test this eight-feature logistic regression model for the purpose.

DOI: 10.1111/1365-2478.13386;
Geophysical Prospecting Online Version: 10 July 2023

*Reported by Chhotu Kumar Keshri
CSIR-NGRI, Hyderabad

*This report was written during the 4th online workshop on science writing organised by Current Science.
All reports on this site, except those in the Archives, are free-to-use for all Indian media outlets.

STEAMindiaReports: providing energy to advance scientific research in India

——-


Categorised in: Delhi, Earth Sciences, Tectonics
