SEISMOLOGY AND GEOLOGY ›› 2021, Vol. 43 ›› Issue (5): 1326-1338. DOI: 10.3969/j.issn.0253-4967.2021.05.017

• Application of new technique

OUTLIER DETECTION METHOD AND APPLICATION OF SATELLITE GRAVITY DATA TRAINED BY LONG SHORT-TERM MEMORY NETWORK

YANG Yu1), WU Yun-long1,2),*, YAO Yun-sheng1), SHAN Wei-feng1)

  1) Institute of Disaster Prevention, Sanhe 065201, China
  2) Key Laboratory of Earthquake Geodesy, Institute of Seismology of China Earthquake Administration, Wuhan 430071, China
  • Received: 2021-01-11 Revised: 2021-05-27 Online: 2021-10-20 Published: 2021-12-06
  • Contact: WU Yun-long

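Chinese title (translated): Outlier detection method for satellite gravity data trained with a long short-term memory network, and its application

YANG Yu1), WU Yun-long1,2),*, YAO Yun-sheng1), SHAN Wei-feng1)

  1) Institute of Disaster Prevention, Sanhe 065201
  2) Institute of Seismology, China Earthquake Administration; Key Laboratory of Earthquake Geodesy, China Earthquake Administration, Wuhan 430071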
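  • Corresponding author: WU Yun-long
  • About the author: YANG Yu, female, born in 1997, received a bachelor's degree in applied chemistry from Tianjin University of Commerce in 2019 and is currently a master's student in resources and environment at the Institute of Disaster Prevention, working mainly on disaster information processing technology. Tel: 13163113620, E-mail: yuyang_2021@126.com
  • Supported by:
    National Natural Science Foundation of China (41974096); National Natural Science Foundation of China (41931074); National Key Research and Development Program of China (2018YFC1503503)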

Abstract:

Outlier detection is a key step in satellite gravity data preprocessing. As the theory and practice of GOCE satellite gravity gradiometry have matured, satellite gravity data can reach a gravity anomaly accuracy on the order of 1 mGal and a geoid accuracy of 1~2 cm. However, because of various sources of interference and the sheer volume of observations, satellite gravity gradient data often contain outliers. Simulation studies have shown that outliers adversely affect the interpretation of physical phenomena. Existing outlier detection methods are also time-consuming and not very accurate, which reduces the reliability of data analysis and the accuracy of the results. Outliers therefore need to be eliminated. In recent years, as artificial intelligence has been applied more deeply in earth science research, many new methods and results have been obtained in geoscience at home and abroad. Motivated by the ability of long short-term memory networks to capture both long-term and short-term information in data sequences, this paper proposes a long short-term memory (LSTM) network for outlier detection in gravity gradient data. The LSTM is a special type of recurrent neural network that avoids the long-term dependency problem. Using its gate structure, the network learns the sample features through the forget gate, input gate and output gate, and selectively updates or discards elements of the neuron state vector so as to preserve the long-term state of the neurons, which makes the LSTM perform better on long time series. To demonstrate the reliability of outlier extraction with the LSTM network, simulated satellite gravity data are used for the analysis.
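The gate mechanism described above can be illustrated with one forward step of a textbook LSTM cell. This is a minimal sketch of the standard formulation, not the authors' implementation; the function name `lstm_step`, the `params` layout and the gate names are assumptions made for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x_t, h_prev, c_prev, params):
    """One forward step of a standard LSTM cell (illustrative, not the paper's code).

    params maps each gate name to (W, b): W is a list of rows, one per hidden
    unit, each row having len(x_t) + len(h_prev) weights; b is the bias list.
    """
    v = list(x_t) + list(h_prev)  # concatenated input and previous hidden state

    def affine(name):
        W, b = params[name]
        return [sum(w * a for w, a in zip(row, v)) + b_k for row, b_k in zip(W, b)]

    f = [sigmoid(z) for z in affine("forget")]       # how much old state to keep
    i = [sigmoid(z) for z in affine("input")]        # how much new candidate to admit
    g = [math.tanh(z) for z in affine("candidate")]  # candidate state values
    o = [sigmoid(z) for z in affine("output")]       # how much state to expose

    # The cell state carries the long-term memory: keep part of the old, add part of the new.
    c_t = [f_k * c_k + i_k * g_k for f_k, c_k, i_k, g_k in zip(f, c_prev, i, g)]
    h_t = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c_t)]
    return h_t, c_t
```

Because the cell state is updated additively through the forget and input gates, information from early time steps can persist across many steps, which is the property the abstract relies on for long series.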
First, gravity gradient data with a sampling interval of 5 s and a length of 1 day are simulated from the EGM96 model up to degree and order 300, with GRS80 as the normal ellipsoid. A white noise sequence with an expected value of 0 and a standard deviation of 0.01σ is generated and added to the gravity gradient data, and outliers with a magnitude of 2σ are then added to 1% of the samples, giving a gravity gradient data set that contains both white noise and outliers. Second, the data are normalized to a standard interval in preprocessing, which helps in reaching the optimal solution. The gravity gradient data set is then divided into a training set and a test set in the proportion 8:2. After the data are grouped, the network is trained in a way that avoids over-fitting and improves the adaptability of the model to the samples. The data are processed with a sliding time window so that the neural network can learn from the data set more easily. The LSTM network then builds the training module: data enter through the input layer, and the forward and backward propagation of the training parameters updates the neuron information. After many iterations, the loss function of the LSTM network stabilizes. Finally, the model is evaluated on the test set to obtain the final recognition result. To obtain higher accuracy, the LSTM continuously updates its parameters, the complexity and depth of the network are increased, and the output at the current time step is calculated, so that the positions of outliers are identified effectively. Compared with the traditional recurrent neural network, the LSTM unit records all cumulative historical information and can capture dependencies between widely separated time steps in the gravity gradient data.
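The data preparation steps above (noise and outlier injection, normalization, sliding windows, 8:2 split) might be sketched as follows. Several details are assumptions, since the abstract does not specify them: a sine wave stands in for the EGM96/GRS80 simulation, σ is treated as a free amplitude parameter, outlier signs are random, normalization is min-max to [0, 1], the window width of 20 is hypothetical, each window is labelled by its last sample, and the split is chronological.

```python
import math
import random

def make_series(n, sigma, noise_scale=0.01, outlier_frac=0.01, outlier_scale=2.0, seed=0):
    """Placeholder series: smooth signal + white noise + sparse outliers, with 0/1 labels."""
    rng = random.Random(seed)
    # Assumption: a sine wave replaces the gradient signal simulated from EGM96/GRS80.
    signal = [sigma * math.sin(2 * math.pi * k / 1000) for k in range(n)]
    noisy = [s + rng.gauss(0.0, noise_scale * sigma) for s in signal]
    labels = [0] * n
    for k in rng.sample(range(n), int(outlier_frac * n)):
        noisy[k] += rng.choice((-1, 1)) * outlier_scale * sigma  # assumed random sign
        labels[k] = 1
    return noisy, labels

def min_max_normalize(values):
    """Scale values linearly to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def sliding_windows(values, labels, width):
    """Pair each window of `width` samples with the label of its last sample."""
    count = len(values) - width + 1
    X = [values[k:k + width] for k in range(count)]
    y = [labels[k + width - 1] for k in range(count)]
    return X, y

def split_train_test(X, y, train_frac=0.8):
    """Chronological split, so the test windows come after the training windows."""
    cut = int(train_frac * len(X))
    return (X[:cut], y[:cut]), (X[cut:], y[cut:])

series, labels = make_series(n=17280, sigma=1.0)  # 1 day at 5 s sampling = 17280 samples
X, y = sliding_windows(min_max_normalize(series), labels, width=20)
(train_X, train_y), (test_X, test_y) = split_train_test(X, y)
```

The windows in `train_X` and their labels in `train_y` would then be fed to an LSTM classifier; the training loop itself is omitted here because the abstract does not give the network configuration.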
On this basis, because the number and distribution of outliers in measured satellite gravity gradient data are unknown, two indicators, the success rate and the failure rate, are introduced to evaluate the outlier detection and to verify the effectiveness and accuracy of the method. The method in this paper achieves outlier detection in long time series of observation data. The results show that when the trained LSTM model is applied to the test set, the prediction accuracy reaches 99.4% and the processing takes only 4.26 seconds, with no manual intervention. In the prediction process, increasing the amount of training data or the number of LSTM neurons improves the prediction, and the loss function, learning rate and number of iterations are the main model parameters affecting it. The outlier recognition experiments show that the LSTM model can extract features and effectively solve the outlier recognition problem. The complexity of the original, time-consuming outlier recognition techniques is reduced, and the network can be retrained with new synthetic data to identify new features. The method adapts well to anomaly removal and provides a new way to remove various kinds of anomalous interference from actual satellite gravity observation data.
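The abstract names the success rate and failure rate but does not define them. The sketch below assumes one common reading: the success rate is the fraction of true outliers that are flagged, and the failure rate is the fraction of valid samples that are wrongly flagged. The function name and the toy labels are hypothetical, and the paper's own definitions may differ.

```python
def success_failure_rates(true_labels, predicted_labels):
    """Return (success_rate, failure_rate) under the assumed definitions.

    success_rate = flagged true outliers / all true outliers
    failure_rate = wrongly flagged valid samples / all valid samples
    Labels are 1 for an outlier and 0 for a valid sample.
    """
    pairs = list(zip(true_labels, predicted_labels))
    n_outliers = sum(1 for t, _ in pairs if t == 1)
    n_valid = len(pairs) - n_outliers
    hits = sum(1 for t, p in pairs if t == 1 and p == 1)
    false_alarms = sum(1 for t, p in pairs if t == 0 and p == 1)
    success = hits / n_outliers if n_outliers else float("nan")
    failure = false_alarms / n_valid if n_valid else float("nan")
    return success, failure

# Toy example: 3 true outliers, 2 of them detected, plus 1 false alarm among 7 valid samples.
truth     = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]
predicted = [0, 0, 1, 0, 0, 0, 1, 1, 0, 0]
success, failure = success_failure_rates(truth, predicted)
```

Reporting both rates matters when outliers are rare: with only 1% outliers, a detector that flags nothing would still score 99% plain accuracy while having a success rate of zero.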

Key words: satellite gravity gradiometry, outlier detection, long short-term memory neural network, model parameter

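Abstract (Chinese version):

Outlier detection is a key step in the preprocessing of satellite gravity data. For massive observation data sets such as satellite gravity gradient data, existing outlier detection methods are time-consuming and have relatively low accuracy. Based on the long short-term memory (LSTM) network, this paper proposes a machine learning method for outlier detection in gravity gradient data, which identifies outliers in long time series of observations and avoids their influence on the observation data. The results show that the prediction accuracy of the trained LSTM model reaches 99.4%. In the prediction process, enlarging the training data or increasing the number of LSTM neurons improves the prediction, and the loss function, learning rate and number of iterations are the main model parameters affecting it. Outlier identification experiments with the trained model show that the LSTM model can be applied well to outlier detection in satellite gravity gradiometry observation data.

Key words (Chinese version): satellite gravity gradient, outlier detection, long short-term memory network, model parameter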

CLC Number: