
CN110197307B - Regional sea surface temperature prediction method combined with attention mechanism - Google Patents


Info

Publication number
CN110197307B
Authority
CN
China
Prior art keywords
matrix
sst
convlstm
model
sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910477316.2A
Other languages
Chinese (zh)
Other versions
CN110197307A (en)
Inventor
贺琪
查铖
宋巍
赵丹枫
黄冬梅
胡泽煜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Ocean University
Original Assignee
Shanghai Ocean University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Ocean University
Priority to CN201910477316.2A
Publication of CN110197307A
Application granted
Publication of CN110197307B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Human Resources & Organizations (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • General Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Marketing (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Development Economics (AREA)
  • Computer Hardware Design (AREA)
  • Evolutionary Computation (AREA)
  • Geometry (AREA)
  • General Engineering & Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a regional sea surface temperature (SST) prediction method combined with an attention mechanism. The steps are: 1) the daily SST data of a region is processed into a matrix, and the matrices are arranged in chronological order to form a matrix sequence that serves as the input of the CA-ConvLSTM model; 2) the SST matrices are processed by a convolutional layer, which extracts the distribution features of the recording points; 3) an attention mechanism assigns attention weights to the resulting matrix features, and each attention weight is multiplied by its corresponding matrix feature to obtain weighted features; 4) finally, the weighted features are used as the input of a ConvLSTM model, which is trained as the prediction model to produce the prediction results. The invention organizes the SST of a region into a matrix and feeds it into the model as a whole, which makes it easier for the model to extract the temporal and spatial correlations of SST.

Description

A Regional Sea Surface Temperature Prediction Method Combined with an Attention Mechanism

Technical Field

The invention relates to the field of sea surface temperature prediction, and in particular to a regional sea surface temperature prediction method combined with an attention mechanism.

Background

In recent years, sea surface temperature prediction has attracted growing attention in ocean-related fields such as marine meteorology, navigation, marine disaster prevention and mitigation, and marine fisheries. Many methods have been proposed to predict sea surface temperature (SST), with good results. These methods fall into three main categories: statistical forecasting, numerical forecasting, and empirical forecasting. As data collection technology has improved, more and more SST data has been collected, stored, and processed, and researchers have also analyzed the spatiotemporal characteristics of SST. SST is typical time series data, so many scholars treat sea surface temperature prediction (SSTP) as a time series regression problem and apply time series forecasting methods to it in the hope of better prediction results.

SSTP is important in theory and has practical applications in many ocean-related fields. The historical temperature record of a single recording point on the sea surface is a typical long time series, which is why many researchers treat SSTP as a time series regression problem and apply time series forecasting methods to it. Traditional time series models such as the autoregressive (AR) model, the moving average (MA) model, and the autoregressive moving average (ARMA) model are all linear and cannot meet the needs of practical applications, so researchers have proposed many nonlinear forecasting methods. Neural networks are one such method, with very strong nonlinear fitting ability, and several different forms of neural network have been used for SSTP. Aparna et al. proposed an artificial neural network that uses the current SST to predict the next day's SST in the northeastern Arabian Sea, with a model error within ±0.5°. The long short-term memory (LSTM) network is a variant of the recurrent neural network (RNN) that addresses the RNN vanishing gradient problem and models time series data well; it has also been applied to SSTP with notable results. Besides neural networks, support vector machines (SVM) are also used for time series forecasting. Lins et al. took into account the seasonality of SST changes and the regularity of intra-seasonal changes, derived the curvature and slope of SST from buoy data, and used an SVM to predict SST.

However, all of the above methods consider only the temporal correlation of SST and ignore the spatial correlation between SST values, so a large amount of information is lost during prediction. In addition, they do not reflect the fact that different historical SST values influence the SST to be predicted to different degrees, so the models are not comprehensive enough and the final prediction accuracy is limited.

Summary of the Invention

The object of the invention is to address the deficiencies of the prior art described above by providing a regional sea surface temperature prediction method combined with an attention mechanism.

The technical problem addressed by the invention can be solved by the following technical solution:

A regional sea surface temperature prediction method combined with an attention mechanism, comprising the following steps:

1) The daily SST data of the region is processed into a matrix, and the matrices are arranged in chronological order to form a matrix sequence, which is the input of the CA-ConvLSTM model;

2) The SST matrices are processed by a convolutional layer, which extracts the distribution features of the recording points;

3) An attention mechanism assigns attention weights to the resulting matrix features, and each attention weight is multiplied by its corresponding matrix feature to obtain the weighted features;

4) Finally, the weighted features are used as the input of the ConvLSTM model, ConvLSTM is used to train the prediction model, and the prediction results are obtained.

3. The regional sea surface temperature prediction method combined with an attention mechanism according to claim 1, characterized in that: in the SST matrix convolution, after the SST data has been arranged into a sequence of numerical matrices of length $|F|$, convolution is applied to the matrices to extract local features; the convolution is carried out by moving a convolution kernel over the input, and each value of the output matrix is the sum of the products of the values in one 3×3 region of the input matrix and the values at the corresponding positions of the 3×3 convolution kernel.

Further, the daily SST data of the region is processed into a $W \times H$ matrix. The SST sequence of the region is $F = F_1, F_2, \ldots, F_{|F|}$, where $|F|$ is the time length of the SST sequence and $F_i$ ($1 \le i \le |F|$, $i \in \mathbb{Z}$) is the SST of all recording points in the region on day $i$, that is, a $W \times H$ matrix. The sequence formed by these matrices is the input of the CA-ConvLSTM model.

Further, the convolution yields a sequence of matrix features. In prediction, $k$ days of SST are used to predict the SST of the next one day or the next five days, with a different value of $k$ in each case. Starting from day $n$, the matrix feature sequence of these $k$ days is $M = M_{n+1}, M_{n+2}, \ldots, M_{n+k-1}, M_{n+k}$, where $M_i$ ($n+1 \le i \le n+k$, $i \in \mathbb{Z}$) is the $i$-th matrix feature, a $(W-2) \times (H-2)$ matrix.

Further, the CA-ConvLSTM model learns the attention weights automatically as follows:

First, a pooling layer computes the mean of each matrix feature, giving $V = [v_{n+1}, v_{n+2}, \ldots, v_{n+k-1}, v_{n+k}]$, where $v_i$ is the average of all values in the matrix $M_i$. The attention model $\Phi$ is defined as:

$$D = \tanh(wV + b) \tag{1}$$

$$A = \mathrm{softmax}(W'D + b') \tag{2}$$

where $w$ and $b$ are the weight and bias of the fully connected layer, $D$ is the output of the fully connected layer and the input of the softmax, $W'$ and $b'$ are the weight and bias of the softmax, and $A$ is the output of the softmax, that is, the attention weights: a vector of length $k$ written $A = [a_{n+1}, a_{n+2}, \ldots, a_{n+k-1}, a_{n+k}]$. Each attention weight is then multiplied by its corresponding matrix feature to give a new feature sequence, the weighted feature sequence $N = N_{n+1}, N_{n+2}, \ldots, N_{n+k-1}, N_{n+k}$, where $N_i = a_i M_i$ is the $i$-th weighted feature, a $(W-2) \times (H-2)$ matrix;

Before the matrix feature sequence $M$ enters the ConvLSTM model, the attention mechanism weights it automatically to produce the weighted feature sequence $N$.

Further, the ConvLSTM is computed as follows:

where * denotes the convolution operation, ○ denotes element-wise multiplication, and σ is the sigmoid activation function.

Compared with the prior art, the invention has the following beneficial effects:

The invention organizes the SST of a region into a matrix and feeds it into the model as a whole, which makes it easier for the model to extract the temporal and spatial correlations of SST. The attention mechanism assigns weights to the features so that the differing influence of historical SST on the SST to be predicted along the time dimension is reflected: key information is emphasized and irrelevant information is suppressed, which helps improve SSTP accuracy. ConvLSTM, which models spatiotemporal information well, is used to model the spatiotemporal information of SST, so prediction takes into account both the temporal and the spatial correlation of SST and the model captures more complete information.

The CA-ConvLSTM model is practical: it combines temporal and spatial information and improves the accuracy of SSTP. Different methods were compared on the same data set for predicting SST one day ahead and five days ahead. For CA-ConvLSTM the mean daily deviation is about 0.18° for one-day prediction and about 0.33° for five-day prediction. The experimental results show that CA-ConvLSTM achieved the best SSTP results among the compared methods, which supports the effectiveness of the method. In addition, the method performs regional SSTP directly and does not need a separate model for the SST of each station, which greatly improves forecasting efficiency.

Brief Description of the Drawings

Fig. 1 is a flowchart of the CA-ConvLSTM model of the invention.

Fig. 2 is a schematic diagram of the learning process of the attention mechanism of the invention.

Fig. 3 is a schematic diagram of the working principle of the ConvLSTM of the invention.

Detailed Description

To make the technical means, inventive features, objectives, and effects of the invention easy to understand, the invention is described further below with reference to specific embodiments.

Figs. 1 to 3 illustrate the regional sea surface temperature prediction method combined with an attention mechanism according to the invention. The daily SST data of the region is processed into a $W \times H$ matrix. The SST sequence of the region is $F = F_1, F_2, \ldots, F_{|F|}$, where $|F|$ is the time length of the SST sequence and $F_i$ ($1 \le i \le |F|$, $i \in \mathbb{Z}$) is the SST of all recording points in the region on day $i$, that is, a $W \times H$ matrix. The sequence formed by these matrices is the input of the CA-ConvLSTM model. To take full account of the temporal and spatial correlation of SST, and of the differing influence of historical SST on the SST to be predicted along the time dimension, the CA-ConvLSTM model is proposed here for the first time; it has been applied successfully to SSTP with good results.

Fig. 1 is a flowchart of the CA-ConvLSTM model, in which $|F|$ is the time length of the SST matrix sequence, $W$ is the width of the region, $H$ is the height of the region, and $k$ is the number of days of historical SST used for prediction. The method comprises the following steps:

1) The daily SST data of the region is processed into a matrix, and the matrices are arranged in chronological order to form a matrix sequence, which is the input of the CA-ConvLSTM model.

2) The SST matrices are processed by a convolutional layer, which extracts the distribution features of the recording points.

3) An attention mechanism assigns attention weights to the resulting matrix features, and each attention weight is multiplied by its corresponding matrix feature to obtain the weighted features.

4) Finally, the weighted features are used as the input of the ConvLSTM model, ConvLSTM is used to train the prediction model, and the prediction results are obtained.

In these steps, the attention weights and the ConvLSTM are the two key components with which CA-ConvLSTM predicts SST. A reasonable allocation of attention weights and a well-trained ConvLSTM are what allow CA-ConvLSTM to perform well.

SST matrix convolution:

After the SST data has been arranged into a sequence of numerical matrices of length $|F|$, convolution is applied to the matrices to extract local features. The convolution is carried out by moving a convolution kernel over the input: each value of the output matrix is the sum of the products of the values in one 3×3 region of the input matrix and the values at the corresponding positions of the 3×3 convolution kernel. Table 1 lists the parameters of the convolutional layer:

Table 1: Convolutional layer parameters

The input is a $W \times H$ matrix and the kernel moves with a stride of 1×1, so the kernel visits $(W-2) \cdot (H-2)$ positions and the output is a $(W-2) \times (H-2)$ matrix. This both preserves the spatial information of the SST and extracts local features.
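The output size stated above can be checked with a small NumPy sketch of a 3×3, stride-1 convolution without padding. The function name `conv2d_valid`, the constant 30×30 field, and the averaging kernel are illustrative assumptions; in the actual model the kernel values are learned and the layer is built in Keras.

```python
import numpy as np

def conv2d_valid(sst, kernel):
    """Slide `kernel` over `sst` with stride 1 and no padding."""
    kh, kw = kernel.shape
    out_h = sst.shape[0] - kh + 1
    out_w = sst.shape[1] - kw + 1
    out = np.empty((out_h, out_w), dtype=float)
    for r in range(out_h):
        for c in range(out_w):
            # Sum of element-wise products between one window and the kernel.
            out[r, c] = np.sum(sst[r:r + kh, c:c + kw] * kernel)
    return out

# Hypothetical inputs: a constant 30x30 SST field and an averaging kernel.
sst_day = np.full((30, 30), 25.0)
kernel = np.full((3, 3), 1.0 / 9.0)
features = conv2d_valid(sst_day, kernel)
print(features.shape)  # (28, 28)
```

With a 30×30 input the feature map is 28×28, which matches the $(W-2) \times (H-2)$ size given in the text.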

Attention mechanism:

The convolution yields a sequence of matrix features. In prediction, $k$ days of SST are used to predict the SST of the next one day or the next five days, with a different value of $k$ in each case. Starting from day $n$, the matrix feature sequence of these $k$ days can be written $M = M_{n+1}, M_{n+2}, \ldots, M_{n+k-1}, M_{n+k}$, where $M_i$ ($n+1 \le i \le n+k$, $i \in \mathbb{Z}$) is the $i$-th matrix feature, a $(W-2) \times (H-2)$ matrix.

Fig. 2 is a schematic diagram of the attention learning process. So that the model can learn the attention weights automatically, a pooling layer first computes the mean of each matrix feature, giving $V = [v_{n+1}, v_{n+2}, \ldots, v_{n+k-1}, v_{n+k}]$, where $v_i$ is the average of all values in the matrix $M_i$. The attention model $\Phi$ is defined as:

$$D = \tanh(wV + b) \tag{1}$$

$$A = \mathrm{softmax}(W'D + b') \tag{2}$$

where $w$ and $b$ are the weight and bias of the fully connected layer, $D$ is the output of the fully connected layer and the input of the softmax, $W'$ and $b'$ are the weight and bias of the softmax, and $A$ is the output of the softmax, that is, the attention weights: a vector of length $k$ written $A = [a_{n+1}, a_{n+2}, \ldots, a_{n+k-1}, a_{n+k}]$. Each attention weight is then multiplied by its corresponding matrix feature to give a new feature sequence, the weighted feature sequence $N = N_{n+1}, N_{n+2}, \ldots, N_{n+k-1}, N_{n+k}$, where $N_i = a_i M_i$ is the $i$-th weighted feature, a $(W-2) \times (H-2)$ matrix.

Before the matrix feature sequence $M$ enters the ConvLSTM model, the attention mechanism weights it automatically to produce the weighted feature sequence $N$. Important information in the feature sequence receives larger attention weights, while irrelevant or unimportant information receives smaller ones. This gives the subsequent prediction more complete information, so the prediction results are more accurate.
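The attention computation of Eqs. (1) and (2) can be sketched in NumPy as below. The text does not give the shapes of the weights, so the square $k \times k$ matrices `w` and `w2` (standing in for $w$ and $W'$), the random initial values, and the function names are assumptions made for illustration; in the model these parameters are learned during training.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))  # subtract the max for numerical stability
    return e / e.sum()

def attention_weights(M, w, b, w2, b2):
    """M has shape (k, rows, cols): one feature matrix per historical day."""
    V = M.mean(axis=(1, 2))       # pooling: mean of each matrix feature
    D = np.tanh(w @ V + b)        # Eq. (1), fully connected layer
    A = softmax(w2 @ D + b2)      # Eq. (2), attention weights of length k
    N = A[:, None, None] * M      # N_i = a_i * M_i
    return A, N

# Hypothetical example: k = 7 days of 28x28 feature matrices.
rng = np.random.default_rng(0)
k = 7
M = rng.normal(25.0, 1.0, size=(k, 28, 28))
w, b = rng.normal(0.0, 0.1, size=(k, k)), np.zeros(k)    # assumed k x k shape
w2, b2 = rng.normal(0.0, 0.1, size=(k, k)), np.zeros(k)  # assumed k x k shape
A, N = attention_weights(M, w, b, w2, b2)
print(A.shape, N.shape)  # (7,) (7, 28, 28)
```

Because the softmax output sums to one, the weights act as a relative importance over the $k$ historical days, and each day's feature matrix is scaled by a single scalar.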

ConvLSTM:

LSTM models time series data very well. When the data are matrices, however, they carry rich spatial information and strong local features. Using LSTM directly to predict SST neither accounts for spatial correlation nor extracts local features, which limits SSTP and keeps prediction accuracy low. Convolutional neural networks (CNN) have strong feature extraction ability, so adding convolution operations to LSTM improves the feature extraction ability of the model; this is the idea behind ConvLSTM. It can express both the spatial and the temporal correlation of the data. Since SST data exhibits both temporal and spatial correlation in prediction, modeling these correlations with ConvLSTM is feasible in principle. The working principle of ConvLSTM is shown in Fig. 3.

The ConvLSTM is computed as follows:

where * denotes the convolution operation, ○ denotes element-wise multiplication, and σ is the sigmoid activation function. The model is very good at capturing spatiotemporal relationships, and the weighted feature sequence obtained earlier is used as its input. Predicting SST in this way takes into account both the temporal and the spatial correlation of SST, which improves prediction accuracy to some extent in practice. When CA-ConvLSTM predicts one day of SST, the result is a single SST matrix; when it predicts five days of SST, the result is five SST matrices, one for each of the next five consecutive days.
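The formula itself does not appear in this text. For orientation, the ConvLSTM formulation commonly given in the literature is reproduced below; treating it as the formula meant here is an assumption, although it uses the same three symbols that the text explains. The names are illustrative: $X_t$ is the input at step $t$ (in this method, the weighted feature $N_t$), $H_t$ is the hidden state, $C_t$ is the cell state, and the subscripted $W$ and $b$ are convolution kernels and biases (unrelated to the region width $W$).

```latex
\begin{aligned}
i_t &= \sigma\left(W_{xi} * X_t + W_{hi} * H_{t-1} + W_{ci} \circ C_{t-1} + b_i\right) \\
f_t &= \sigma\left(W_{xf} * X_t + W_{hf} * H_{t-1} + W_{cf} \circ C_{t-1} + b_f\right) \\
C_t &= f_t \circ C_{t-1} + i_t \circ \tanh\left(W_{xc} * X_t + W_{hc} * H_{t-1} + b_c\right) \\
o_t &= \sigma\left(W_{xo} * X_t + W_{ho} * H_{t-1} + W_{co} \circ C_t + b_o\right) \\
H_t &= o_t \circ \tanh\left(C_t\right)
\end{aligned}
```

The terms of the form $W_{c\cdot} \circ C$ are peephole connections; some implementations omit them, and the text does not say whether they are used here.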

The invention is implemented in practice as follows:

Experimental environment and data:

To verify the performance of CA-ConvLSTM on SSTP, the model was built with the Keras framework and trained with the Adam optimizer at an initial learning rate of 0.001. The experimental environment was Windows 10 with an Intel Core i5 at 2.6 GHz and 8 GB of RAM, and the algorithm was implemented in Python 3.

The data used in the experiment is SST data covering a period of time, collected near 30°N, 130°E. Each day's data is arranged as an 80×40 grid of rows and columns and contains invalid values, which are represented by -999.0.

Experimental performance metrics:

To verify the effectiveness of the CA-ConvLSTM model, the experiment uses root mean square error (RMSE) and prediction accuracy (PACC) to evaluate the performance of the different prediction methods. The formulas for RMSE and PACC are as follows:

where Y_real is the true value and Y_pred is the predicted value, i indexes the i-th value in row-and-column order, and n is the product of the width and height of the region to be predicted.

A smaller RMSE indicates better model performance, whereas a larger PACC indicates better model performance. The best model is therefore the one with the smallest RMSE and the largest PACC. During training, a suitable model structure and suitable model parameters can be chosen by comparing RMSE and PACC values.
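The two formulas do not appear in this text. The sketch below uses the standard definition of RMSE; for PACC it assumes one minus the mean absolute relative error, which is a guess consistent with the percentages reported later but is not confirmed by the source. The function names and sample values are illustrative.

```python
import numpy as np

def rmse(y_real, y_pred):
    """Root mean square error over all n grid points (standard definition)."""
    return float(np.sqrt(np.mean((y_real - y_pred) ** 2)))

def pacc(y_real, y_pred):
    """ASSUMED definition: 1 - mean(|real - pred| / |real|).

    The source omits its PACC formula, so this is only a plausible stand-in.
    """
    return float(1.0 - np.mean(np.abs(y_real - y_pred) / np.abs(y_real)))

# Hypothetical 30x30 field with a uniform 0.25 degree prediction error.
y_real = np.full((30, 30), 25.0)
y_pred = y_real + 0.25
print(rmse(y_real, y_pred), pacc(y_real, y_pred))
```

Under this assumed definition, a uniform error of 0.25° on a 25° field gives an RMSE of 0.25 and a PACC of about 99%, the same order of magnitude as the values reported in the experiments.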

Analysis of experimental results:

The SST data has gaps: some SST recording points hold only invalid values for the entire 4749 days, so the data must be preprocessed. To guarantee complete data, the 80×40 SST grid is cropped to 30×30 so that the invalid values are removed and every value in the cropped 30×30 data is valid. The 4749 days of SST data are then split: 75% is used as the training set to fit the parameters of the CA-ConvLSTM prediction model, and the remaining 25% is used as the validation set to check how well the model has learned.
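The preprocessing described above can be sketched as follows. The crop offsets, the synthetic data, and the helper names are hypothetical, and the chronological 75/25 split is an assumption, since the text does not say how the days were assigned to the two sets. The windowing helper pairs $k$ historical days with the following one or five days, as used later in the experiments.

```python
import numpy as np

INVALID = -999.0  # marker for missing SST values, as stated in the text

def crop_valid(data, top, left, size=30):
    """Cut a size x size window out of every day and check it is fully valid."""
    window = data[:, top:top + size, left:left + size]
    if np.any(window == INVALID):
        raise ValueError("crop still contains invalid values")
    return window

def make_windows(seq, k, horizon):
    """Pair k consecutive days (input) with the next `horizon` days (target)."""
    X, Y = [], []
    for s in range(len(seq) - k - horizon + 1):
        X.append(seq[s:s + k])
        Y.append(seq[s + k:s + k + horizon])
    return np.stack(X), np.stack(Y)

# Synthetic stand-in for the real data: 40 days of an 80x40 grid.
data = np.full((40, 80, 40), 25.0)
data[:, :10, :] = INVALID                 # pretend the top rows are missing
sst = crop_valid(data, top=20, left=5)    # hypothetical crop position

split = int(0.75 * len(sst))              # assumed chronological 75/25 split
train, val = sst[:split], sst[split:]
X_train, Y_train = make_windows(train, k=7, horizon=1)
print(X_train.shape, Y_train.shape)  # (23, 7, 30, 30) (23, 1, 30, 30)
```

For five-day prediction the same helper would be called with `k=15, horizon=5`, matching the seven-day and fifteen-day histories used in the experiments.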

Determining the model structure: the purpose of the convolutional layer is local feature extraction, and ConvLSTM already has this ability, so it is an open question whether a convolutional layer is needed in front of A-ConvLSTM. To settle the model structure, the 30×30 SST data was used as the data set, with seven days of SST used to predict the SST one day ahead and fifteen days of SST used to predict the SST five days ahead. The predictions of A-ConvLSTM and CA-ConvLSTM were compared, where A-ConvLSTM has no added convolutional layer and CA-ConvLSTM has one.

Table 2: Performance comparison of A-ConvLSTM and CA-ConvLSTM

Table 2 gives the results of A-ConvLSTM and CA-ConvLSTM for one-day and five-day SST prediction with identical parameters and the same data set. On PACC, CA-ConvLSTM reaches 99.33% for one-day and 98.78% for five-day prediction, while A-ConvLSTM reaches 99.16% and 98.74%, both lower than CA-ConvLSTM. On RMSE, CA-ConvLSTM has errors of 0.2347 for one-day and 0.4312 for five-day prediction, while A-ConvLSTM has 0.3273 and 0.4470, both higher than CA-ConvLSTM. CA-ConvLSTM therefore outperforms A-ConvLSTM on both RMSE and PACC. The results show that adding the convolutional layer helps improve SSTP accuracy, so the model structure is fixed as CA-ConvLSTM. The main reason is that the local features that ConvLSTM extracts from the SST data through its own convolution are not pronounced enough. Adding a convolutional layer in front strengthens the feature extraction ability of the model and makes the features more pronounced inside the ConvLSTM model, which helps improve SSTP accuracy.

Determining the input size: when CA-ConvLSTM is used to predict SST, the data set region is 30×30. Because the input size affects SSTP accuracy, the border of the 30×30 SST is cropped away to give the central 20×20 SST, and the border is cropped again to give the central 10×10 SST, producing SST data sets of three sizes. If the input is too small, the SST data carries little information, which hinders prediction; if the input is too large, prediction is disturbed by other information and accuracy suffers. The three data sets of different sizes are therefore each used as the input of the CA-ConvLSTM model, and a suitable input size is chosen by comparing the evaluation metrics. Because predicting one day and predicting five days of SST require different amounts of information, different input sizes may have different effects. Seven days and fifteen days of SST are therefore again used to predict the SST one day and five days ahead, respectively, and RMSE and PACC are compared to find the suitable input size for each case. Since the largest size available in the data set is 30×30, sizes larger than 30×30 are not considered in the experiment.
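The nested central crops described above can be sketched as below. The helper name `center_crop` and the placeholder field are illustrative assumptions; the code only shows that cropping the 20×20 centre again gives the same 10×10 centre as cropping the 30×30 field directly.

```python
import numpy as np

def center_crop(field, size):
    """Keep the central size x size block of the last two axes."""
    h, w = field.shape[-2:]
    top, left = (h - size) // 2, (w - size) // 2
    return field[..., top:top + size, left:left + size]

# Hypothetical 30x30 field filled with distinct values so crops can be traced.
sst = np.arange(30 * 30, dtype=float).reshape(30, 30)
crop20 = center_crop(sst, 20)      # rows and columns 5..24 of the original
crop10 = center_crop(crop20, 10)   # rows and columns 10..19 of the original
print(crop20.shape, crop10.shape)  # (20, 20) (10, 10)
```

Because the helper works on the last two axes, it can also be applied to a whole `(days, 30, 30)` sequence at once.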

Table 3: Comparison of evaluation indicators for CA-ConvLSTM predicting SST within one day and within five days under different input sizes

Table 3 compares the evaluation indicators of CA-ConvLSTM predicting SST within one day and within five days under different input sizes, where k is the number of days of historical SST used for prediction. When predicting SST within one day, the RMSE with 10*10 and 20*20 inputs is 0.3688 and 0.2426 respectively, both higher than the RMSE with a 30*30 input; the PACC with 10*10 and 20*20 inputs is 98.78% and 99.27% respectively, both lower than the PACC with a 30*30 input. This comparison shows that an input size of 30*30 gives the best result when predicting one-day SST. When predicting SST within five days, comparing the RMSE and PACC of the CA-ConvLSTM model across sizes shows that the prediction is again best with an input size of 30*30. SSTP needs sufficient information to improve the accuracy of regional SST prediction, and the 10*10 and 20*20 regions contain less information, so the model is trained on too little data and the training results are poor. A larger input size might give still better results, but because SST data for a sufficiently large region are not available, this is not discussed further here. In summary, the input size of the model is set to 30*30, and SST data of this size are used as the model input in all subsequent experiments.

Determining the k value: In the preceding experiments on model structure and input size, seven days and fifteen days of SST were used, based on expert knowledge, to predict SST within one day and within five days respectively. Because the value of k affects SSTP accuracy to a greater or lesser degree, these choices may not give the best results. Therefore k = 4, 7 and 15 are used to predict SST within one day, and k = 10, 15 and 25 are used to predict SST within five days. The RMSE and PACC indicators are compared experimentally to determine the appropriate k for each horizon.
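
One way to picture the role of k is as the length of a sliding input window over the daily SST sequence. The sketch below builds (input, target) pairs for a given k and forecast horizon. It assumes that "within five days" means the five days following the window; the function name `make_samples` and the integer stand-ins for daily matrices are illustrative only.

```python
def make_samples(sequence, k, horizon):
    """Pair each run of k consecutive daily fields with the next `horizon` fields."""
    samples = []
    for start in range(len(sequence) - k - horizon + 1):
        inputs = sequence[start:start + k]
        targets = sequence[start + k:start + k + horizon]
        samples.append((inputs, targets))
    return samples


# Integers 0..29 stand in for 30 daily SST matrices.
days = list(range(30))

one_day = make_samples(days, k=7, horizon=1)
five_day = make_samples(days, k=15, horizon=5)

print(len(one_day), one_day[0])      # 23 ([0, 1, 2, 3, 4, 5, 6], [7])
print(len(five_day), five_day[-1][1])  # 11 [25, 26, 27, 28, 29]
```

A larger k gives each sample more temporal context but leaves fewer samples from the same record, which is one reason an intermediate k can perform best.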

Table 4: Comparison of evaluation indicators for CA-ConvLSTM predicting SST within one day and within five days under different k values

Table 4 compares the evaluation indicators of CA-ConvLSTM predicting SST within one day and within five days under different k values. The input size represents information in the spatial dimension and k represents information in the temporal dimension; both affect model performance. The input size was fixed in the experiment above, so the next step is to determine k. For one-day prediction, comparing RMSE and PACC across k values gives an RMSE of 0.2347 and a PACC of 99.33% at k=7, better than the indicators at k=4 and k=15, so k=7 is used to predict SST within one day. For five-day prediction, the PACC at k=15 is slightly higher than at k=10 and k=25, and the RMSE at k=15 is also better than at k=10 and k=25, so k=15 is used to predict SST within five days. This shows that the amount of information in both the spatial and the temporal dimension should be moderate when predicting SST: too much or too little degrades model performance, which is why the experiments above are needed. In summary, the present invention uses k=7 to predict SST within one day and k=15 to predict SST within five days.

Comparison of prediction methods: Once the model structure, input size and k value are fixed, the prediction model is essentially determined. To verify the effectiveness of the CA-ConvLSTM model, its performance is compared with current advanced SSTP methods: SVR, LSTM and ConvLSTM. ConvLSTM can perform regional SSTP, so the same experiments as above are carried out for it; its suitable input size is found to be 30*30, and seven days and fifteen days of SST are used to predict SST within the next one day and five days. SVR and LSTM are single-point predictors and can only build a separate model for each recording point, so the SST of all 900 recording points in the 30*30 region must be predicted in the experiment. SVR uses the radial basis function (RBF) kernel, which provides a nonlinear mapping with few parameters. For SVR and LSTM, seven days and fifteen days of SST are likewise used to predict SST within one day and within five days. The RMSE and PACC indicators of the different methods are then compared to assess the difference in prediction performance between CA-ConvLSTM and the SVR, LSTM and ConvLSTM models.
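
Because SVR and LSTM operate on one recording point at a time, the regional data must first be split into per-point time series. The sketch below shows one way to do that in plain Python; the function name `point_series`, the dictionary layout and the synthetic values are assumptions made for illustration, and the fitting of the per-point SVR or LSTM models themselves is not shown.

```python
def point_series(fields):
    """Turn T daily H x W grids into H*W per-point time series of length T."""
    height, width = len(fields[0]), len(fields[0][0])
    return {
        (r, c): [field[r][c] for field in fields]
        for r in range(height)
        for c in range(width)
    }


# Seven synthetic daily 30 x 30 fields; each value encodes day and position
# so the result is easy to check. Real inputs would be SST values.
fields = [
    [[day * 1000 + r * 30 + c for c in range(30)] for r in range(30)]
    for day in range(7)
]

series = point_series(fields)

print(len(series))              # 900 recording points
print(len(series[(0, 0)]))      # 7 days per point
print(series[(2, 3)][:3])       # [63, 1063, 2063]
```

Each of the 900 series would then be fed to its own single-point model, which is why these baselines cannot exploit the spatial correlation between neighbouring points.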

Table 5: Performance comparison of different prediction methods

Table 5 compares the evaluation indicators of CA-ConvLSTM with those of the SVR, LSTM and ConvLSTM models. For one-day prediction, SVR and LSTM have PACC values of 98.92% and 98.86% and RMSE values of 0.3908 and 0.4655; ConvLSTM has a PACC of 99.04% and an RMSE of 0.3602. ConvLSTM is better than SVR and LSTM on both indicators, which confirms the importance of spatial correlation for SSTP. ConvLSTM in turn is worse than CA-ConvLSTM on both indicators, so for one-day prediction the CA-ConvLSTM model performs best among the four methods. For five-day prediction, CA-ConvLSTM likewise performs best compared with SVR, LSTM and ConvLSTM. CA-ConvLSTM not only considers the temporal and spatial correlation of SST but also expresses the differing influence of historical SST on the SST to be predicted by assigning different weights. This brings the model closer to reality, makes the information it uses more complete, and ultimately improves SSTP accuracy. In summary, compared with SVR, LSTM and ConvLSTM, CA-ConvLSTM performs best in predicting SST within both one day and five days, which verifies the effectiveness of the method.

The basic principles, main features and advantages of the present invention have been shown and described above. Those skilled in the art should understand that the present invention is not limited by the above embodiments. The embodiments and the description only illustrate the principles of the invention; various changes and improvements may be made without departing from its spirit and scope, and all such changes and improvements fall within the scope of the claimed invention. The scope of protection of the present invention is defined by the appended claims and their equivalents.

Claims (1)

1. A regional sea surface temperature prediction method combined with an attention mechanism, characterized in that it comprises the following steps:

1) processing the daily SST data in the region into a matrix and arranging the matrices in chronological order to form a matrix sequence, which serves as the input of the CA-ConvLSTM model;

2) processing the SST matrices and extracting the distribution features of the recording points through a convolutional layer;

3) using the attention mechanism to assign attention weights to the obtained matrix features, then multiplying each attention weight by its corresponding matrix feature to obtain weighted features;

4) finally, taking the weighted features as the input of the ConvLSTM model, training the prediction model with ConvLSTM, and obtaining the prediction result;

in the SST matrix convolution, after the SST data are organized into a sequence of numeric matrices of length |F|, convolution is applied to the matrices to extract local features; the convolution operation is carried out by moving the convolution kernel, and each value of the output matrix is the sum of the products of the values of a 3×3 region of the input matrix and the values at the corresponding positions of the 3×3 convolution kernel;

the daily SST data in the region are processed into a W·H matrix; the SST sequence in the region is $F = F_1, F_2, \ldots, F_{|F|}$, where |F| denotes the time length of the SST sequence and $F_i$ is the SST of all recording points in the region on the i-th day (1≤i≤|F|, i∈Z), that is, a W·H matrix; the sequence formed by these matrices is used as the input of the CA-ConvLSTM model;

a matrix feature sequence is obtained through the convolution operation; in prediction, k days of SST are used to predict the SST of the next one day or five days, with k taking different values; starting from day n, the matrix feature sequence of these k days is $M = M_{n+1}, M_{n+2}, \ldots, M_{n+k-1}, M_{n+k}$, where $M_i$ is the i-th (n≤i≤n+k, i∈Z) matrix feature, that is, a (W-2)·(H-2) matrix;

the CA-ConvLSTM model learns the attention weights automatically as follows:

first, the mean of each matrix feature is obtained through a pooling layer, giving $V = [v_{n+1}, v_{n+2}, \ldots, v_{n+k-1}, v_{n+k}]$, where $v_i$ is the average of all values in the matrix $M_i$; the attention model Φ is defined as:

$$D = \tanh(w * V + b) \tag{1}$$

$$A = \mathrm{softmax}(W' * D + b') \tag{2}$$

where w and b are the weight and bias of the fully connected layer, D is the output of the fully connected layer and the input of softmax, W' and b' are the weight and bias of softmax, and A is the output of softmax, that is, the attention weights, a vector of length k written as $A = [a_{n+1}, a_{n+2}, \ldots, a_{n+k-1}, a_{n+k}]$; the attention weights are then multiplied in turn by the corresponding matrix features to obtain a new feature sequence, the weighted feature sequence, written as $N = N_{n+1}, N_{n+2}, \ldots, N_{n+k-1}, N_{n+k}$, where $N_i$ is the i-th weighted feature, $N_i = a_i * M_i$, a (W-2)·(H-2) matrix;

before the matrix feature sequence M enters the ConvLSTM model, it is weighted automatically by the attention mechanism to obtain the weighted feature sequence N;

the ConvLSTM is computed as follows:

$$i_t = \sigma(w_{xi} * X_t + w_{hi} * H_{t-1} + w_{ci} \cdot C_{t-1} + b_i) \tag{3}$$

$$f_t = \sigma(w_{xf} * X_t + w_{hf} * H_{t-1} + w_{cf} \cdot C_{t-1} + b_f) \tag{4}$$

$$o_t = \sigma(w_{xo} * X_t + w_{ho} * H_{t-1} + w_{co} \cdot C_t + b_o) \tag{6}$$

$$H_t = o_t \cdot \tanh(C_t) \tag{7}$$

where * denotes the convolution operation, · denotes element-wise multiplication, and σ is the sigmoid activation function.
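
The convolution and attention steps of the claim can be illustrated with a small dependency-free sketch. It follows the claim's description of a 3×3 convolution without padding (so a W·H input yields a (W-2)·(H-2) feature), mean pooling, a tanh fully connected layer and a softmax layer. Several details are assumptions made only for this example: both layers are taken to be k×k, the weights are fixed by hand rather than learned, and all function names are hypothetical.

```python
import math


def conv3x3_valid(matrix, kernel):
    """Slide a 3 x 3 kernel over the matrix with no padding and stride 1."""
    h, w = len(matrix), len(matrix[0])
    return [
        [
            sum(matrix[r + i][c + j] * kernel[i][j]
                for i in range(3) for j in range(3))
            for c in range(w - 2)
        ]
        for r in range(h - 2)
    ]


def mean_pool(matrix):
    """Average of all values in one matrix feature (v_i in the claim)."""
    values = [v for row in matrix for v in row]
    return sum(values) / len(values)


def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(V, w, b, w2, b2):
    """Eq. (1) and (2): D = tanh(w V + b), A = softmax(W' D + b').

    Assumes k x k weight matrices; the claim does not state layer sizes.
    """
    k = len(V)
    D = [math.tanh(sum(w[i][j] * V[j] for j in range(k)) + b[i]) for i in range(k)]
    scores = [sum(w2[i][j] * D[j] for j in range(k)) + b2[i] for i in range(k)]
    return softmax(scores)


def weight_features(features, w, b, w2, b2):
    """Scale each matrix feature M_i by its attention weight a_i."""
    V = [mean_pool(M) for M in features]
    A = attention_weights(V, w, b, w2, b2)
    weighted = [[[a * v for v in row] for row in M] for a, M in zip(A, features)]
    return A, weighted


# Toy data: k = 3 days of constant 5 x 5 fields with values 1, 2 and 3.
k = 3
fields = [[[float(d)] * 5 for _ in range(5)] for d in (1, 2, 3)]
kernel = [[1 / 9] * 3 for _ in range(3)]  # averaging kernel, illustrative only
identity = [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
zeros = [0.0] * k

features = [conv3x3_valid(F, kernel) for F in fields]
A, weighted = weight_features(features, identity, zeros, identity, zeros)

print(len(features[0]), len(features[0][0]))  # 3 3
print([round(a, 3) for a in A])
```

With identity weights the day with the largest mean receives the largest attention weight; in the claimed method the weights w, b, W' and b' are learned during training, so the resulting emphasis depends on the data. The weighted sequence would then be passed to the ConvLSTM, which is not implemented here.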

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910477316.2A CN110197307B (en) 2019-06-03 2019-06-03 Regional sea surface temperature prediction method combined with attention mechanism

Publications (2)

Publication Number Publication Date
CN110197307A CN110197307A (en) 2019-09-03
CN110197307B true CN110197307B (en) 2023-07-25

Family

ID=67753760

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910477316.2A Active CN110197307B (en) 2019-06-03 2019-06-03 Regional sea surface temperature prediction method combined with attention mechanism

Country Status (1)

Country Link
CN (1) CN110197307B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110909508B (en) * 2019-10-29 2024-11-26 中国石油化工股份有限公司 Real-time prediction method of heating furnace temperature field based on convolutional long short-term memory network
CN111079998B (en) * 2019-12-03 2020-12-01 华东师范大学 Traffic prediction method based on long and short time series correlation attention mechanism model
CN111209968B (en) * 2020-01-08 2023-05-12 浙江师范大学 Multi-meteorological factor model forecast temperature correction method and system based on deep learning
CN112330029A (en) * 2020-11-08 2021-02-05 上海海洋大学 Fishing ground prediction calculation method based on multilayer convLSTM
CN114417693A (en) * 2021-11-26 2022-04-29 中国石油大学(华东) Ocean three-dimensional temperature field inversion method based on deep learning
CN114399073A (en) * 2021-11-26 2022-04-26 中国石油大学(华东) A deep learning-based prediction method for ocean surface temperature field
CN114819338B (en) * 2022-04-25 2024-07-19 上海海洋大学 Multi-element sea surface temperature prediction method based on dual-attention mechanism
CN114936691B (en) * 2022-05-06 2025-02-11 河北工业大学 A temperature prediction method integrating correlation weighting and spatiotemporal attention
CN115510767B (en) * 2022-11-21 2023-10-27 四川省气象服务中心(四川省专业气象台 四川省气象影视中心) Regional air temperature prediction method based on depth space-time network

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109492830A (en) * 2018-12-17 2019-03-19 杭州电子科技大学 A kind of mobile pollution source concentration of emission prediction technique based on space-time deep learning

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9965705B2 (en) * 2015-11-03 2018-05-08 Baidu Usa Llc Systems and methods for attention-based configurable convolutional neural networks (ABC-CNN) for visual question answering
US10755174B2 (en) * 2017-04-11 2020-08-25 Sap Se Unsupervised neural attention model for aspect extraction
CN108932480B (en) * 2018-06-08 2022-03-15 电子科技大学 Distributed optical fiber sensing signal feature learning and classifying method based on 1D-CNN
CN108510132A (en) * 2018-07-03 2018-09-07 华际科工(北京)卫星通信科技有限公司 A kind of sea-surface temperature prediction technique based on LSTM

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Samsung. R1-135232 "Prediction accuracy of link abstraction method for SLML receiver". 3GPP tsg_ran\WG1_RL1. 2013, (No. TSGR1_75), full text. *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant