
CN110069756B - A Resource or Service Recommendation Method Considering User Evaluation - Google Patents


Info

Publication number
CN110069756B
CN110069756B
Authority
CN
China
Prior art keywords
vector
node
attention
encoder
resource
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910321968.7A
Other languages
Chinese (zh)
Other versions
CN110069756A (en)
Inventor
李建强
赵亮
赵青
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Ruiyun Haohai Technology Co.,Ltd.
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Technology filed Critical Beijing University of Technology
Priority to CN201910321968.7A priority Critical patent/CN110069756B/en
Publication of CN110069756A publication Critical patent/CN110069756A/en
Application granted granted Critical
Publication of CN110069756B publication Critical patent/CN110069756B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/12Use of codes for handling textual entities
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

A resource or service recommendation method considering user evaluation belongs to the field of computer artificial intelligence. The method comprises a user feature encoding module, a resource or service encoding module, an Encoder module, and a Decoder module. The multi-level encoding method learns not only the original features of resources or services but also the effect of user evaluations on them; the attention mechanism captures the relative importance of user features to resources or services; and the residual network reduces model complexity, yielding recommendation results faster and more accurately.

Description

A resource or service recommendation method considering user evaluation

Technical Field

The invention belongs to the field of computer artificial intelligence and relates to a resource or service recommendation method considering user evaluation.

Background Art

A resource or service is a platform-independent, loosely coupled, self-contained, programmable application that can be described, published, discovered, coordinated, and configured using open standards, for developing distributed, interoperable applications.

As the number of resources or services grows, many services remain unknown because of low traffic. Resource or service recommendation addresses the problem that rarely visited services are never recommended and thus never accessed, as well as the cold-start problem. Traditional recommendation algorithms, which depend on abundant historical interactions, therefore cannot be applied directly to resource or service recommendation.

What makes resources or services special is that user evaluations of them change dynamically and cannot simply be computed from raw data; evaluation distinguishes resource or service recommendation from the recommendation of other items. Multi-level encoding lets the model learn from user feedback and evaluations, improving recommendation accuracy. The attention mechanism lets the neural network learn the importance of different user features for resource or service recommendation; an intermediate vector is obtained by weighted addition, and the final recommendation output is produced by encoding and decoding this intermediate vector.

The BiLSTM model proposed in the prior art learns the relationship between user features and resources or services, but it uses only those features: it considers neither the special role of user evaluations in resource or service recommendation nor the varying importance of different user features. Although imperfect, this approach points to the idea of using deep neural networks to recommend resources or services.

Summary of the Invention

A resource or service recommendation method considering user evaluation, the method comprising:

① The present invention proposes a multi-level encoding method for user features and for resources or services. For the semantic input of user features, the method applies two encodings, word-embedding and position-embedding, and adds the two results element-wise to obtain the final user-feature input vector; for resource or service data, it applies three encodings, onehot-embedding, QOE-embedding, and QOS-embedding, and combines the three results element-wise to obtain the final resource or service input vector.

② The present invention proposes an Encoder-Decoder structure based on the Multi-Head Self-Attention mechanism. The user-feature input vector is fed to the Encoder, the resource or service input vector is fed to the Decoder, and the Encoder output is also fed to the Decoder.

③ An Encoder node consists of an Encoder Multi-Head Self-Attention layer and a fully connected layer, each layer using a residual structure; the Encoder end is composed of several identical Encoder nodes in series.

④ A Decoder node consists of a Decoder Multi-Head Self-Attention layer, an Encoder-Decoder Self-Attention layer, and a fully connected layer, each layer using a residual structure; the Decoder end is composed of several identical Decoder nodes in series.

⑤ After the Decoder output vector passes through Softmax, the top K entries with the highest probability are returned as the recommendation result for this resource or service.

The principle of the invention is a resource or service recommendation method considering user evaluation: the multi-level encoding method learns not only the original features of resources or services but also the effect of user evaluations on them; the attention mechanism captures the relative importance of user features to resources or services; and the residual network reduces model complexity, yielding recommendation results faster and more accurately.

To achieve the above object, the invention adopts the following technical solution:

A resource or service recommendation method considering user evaluation comprises a user feature encoding module, a resource or service encoding module, an Encoder module, and a Decoder module. The Encoder and Decoder modules use the Multi-Head Self-Attention model proposed by this invention.

User feature encoding module: after word segmentation, user feature sentences are encoded at multiple levels. Word-embedding uses the Skip-Gram form of word2vec; position-embedding uses sin and cos functions to obtain position codes; the two encodings are added element-wise to obtain the user-feature input vector.

Resource or service encoding module: the resource or service data corresponding to the user's feature sentence is encoded at multiple levels. Onehot-embedding sets the positions of all resources or services the user has used to 1 and all others to 0; QOE-embedding uses softmax to obtain the importance probability of the current resource or service vector; QOS-embedding likewise uses softmax to obtain an importance probability; the three encodings are multiplied element-wise to obtain the final resource or service input vector. The Encoder module takes the user feature vector as input and produces the intermediate hidden vector. An Encoder node consists of an Encoder Multi-Head Self-Attention layer and a fully connected layer, each layer using a residual structure; the Encoder end is composed of several identical Encoder nodes in series.

Decoder module: takes the resource or service vector and the Encoder's intermediate hidden vector as input; the final output vector is passed through softmax to obtain the recommendation result. A Decoder node consists of a Decoder Multi-Head Self-Attention layer, an Encoder-Decoder Self-Attention layer, and a fully connected layer, each layer using a residual structure; the Decoder end is composed of several identical Decoder nodes in series.

The proposed method learns the correlation between user features and resources or services through the encoder-decoder model with the Multi-Head Self-Attention mechanism, and the residual network reduces model complexity and improves recommendation accuracy.

Brief Description of the Drawings

Figure 1: overall framework of resource or service recommendation considering user evaluation;

Figure 2: Encoder node structure based on the Multi-Head Self-Attention mechanism;

Figure 3: Decoder node structure based on the Multi-Head Self-Attention mechanism.

Detailed Description of the Embodiments

Features and exemplary embodiments of various aspects of the invention are described in detail below.

The invention takes user features and resource or service feature data as input and generates resource or service recommendation results through a neural network model based on Multi-Head Self-Attention, improving accuracy, reducing model complexity, and improving parallel computing capability. Figure 1 shows the overall framework, Figure 2 the Encoder node structure, and Figure 3 the Decoder node structure.

User feature encoding module (1): this module converts the sentence input by the user into the Encoder-end input vector. The input sentence is first word-embedded at the word level; the word-embedding model uses Google's open-source word-vector library. Position-embedding is then applied to the input sentence according to:

PE(pos, 2k) = sin(pos / 10000^(2k/d_model))    (1)
PE(pos, 2k+1) = cos(pos / 10000^(2k/d_model))    (2)

where pos is the position of the word in the sentence, k is the dimension index of the word, d_model is the model constant, set to 100, and PE is the position encoding result at the k-th dimension of the pos-th word.

The word-embedding vector and the position-embedding vector computed for each word are added element-wise to obtain the user-feature input vector X_embedding.
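As an illustration, the sketch below renders this encoding step in NumPy; the sinusoidal form follows the standard Transformer position code implied by the sin/cos description above, and the Skip-Gram lookup is replaced by a random placeholder array.

```python
import numpy as np

def position_embedding(seq_len, d_model=100):
    # Sinusoidal position codes: sin on even dimensions, cos on odd ones,
    # following formulas (1)-(2) with d_model = 100.
    pe = np.zeros((seq_len, d_model))
    pos = np.arange(seq_len)[:, None]
    k = np.arange(0, d_model, 2)[None, :]
    angle = pos / np.power(10000.0, k / d_model)
    pe[:, 0::2] = np.sin(angle)
    pe[:, 1::2] = np.cos(angle)
    return pe

# Placeholder for the Skip-Gram (word2vec) lookup of a 12-word sentence;
# a real system would fetch these vectors from the pretrained library.
word_vectors = np.random.randn(12, 100)
X_embedding = word_vectors + position_embedding(12)  # element-wise addition
```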

Resource or service encoding module (2): this module converts the resource or service input into the Decoder-end input vector. Onehot-embedding is applied to the user's corresponding resource or service data by presence: when a resource or service exists for the user, the encoding position corresponding to that resource or service is set to 1, otherwise to 0. QOE-embedding encodes quality of experience, computing an importance-weighted score over all resources or services used by the user in softmax form:

s_m = exp(QOE_m) / Σ_n exp(QOE_n)    (3)

where m is the index of the m-th resource or service and QOE_m is the user quality score of the m-th resource or service; the importance-weighted scores of all resources or services obtained through softmax give the QOE-embedding vector.

QOS-embedding encodes quality of service, computing an importance-weighted score over all resources or services used by the user in softmax form:

s_p = exp(QOS_p) / Σ_n exp(QOS_n)    (4)

where p is the index of the p-th resource or service and QOS_p is the quality-of-service score of the p-th resource or service; the importance-weighted scores of all resources or services obtained through softmax give the QOS-embedding vector.

The obtained onehot-embedding, QOE-embedding, and QOS-embedding vectors are multiplied element-wise to obtain the resource or service input vector Y_embedding.
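A minimal sketch of the three-way encoding follows, assuming the per-service QOE and QOS scores are already available as arrays; the service count of 2000 comes from the text, while the example usage indices are hypothetical.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

num_services = 2000                 # output vocabulary size from the text
used = [3, 17, 42]                  # hypothetical services this user has used
QOE = np.random.rand(num_services)  # user-experience scores (assumed given)
QOS = np.random.rand(num_services)  # service-quality scores (assumed given)

onehot = np.zeros(num_services)
onehot[used] = 1.0                  # onehot-embedding: 1 where used, else 0

# QOE-/QOS-embedding: importance-weighted scores via softmax (formulas (3)-(4)).
Y_embedding = onehot * softmax(QOE) * softmax(QOS)  # element-wise product
```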

Encoder module (3): this module takes the user-feature input vector as input and outputs the intermediate vector to the Decoder module. Figure 2 shows the specific Encoder node structure. The Encoder end consists of num_Encoder Encoder nodes, set to 6. An Encoder node has a two-layer structure, a Multi-Head Self-Attention layer and a fully connected layer, each layer using a residual network structure. The Multi-Head Self-Attention layer contains num_Attention attention matrices, set to 6. The user-feature input vector is split element-wise into 6 short vectors; each short vector is fed into its corresponding attention matrix to obtain the corresponding short attention vector; the 6 short attention vectors are concatenated, and the node's final attention vector is obtained through the residual network. The attention vector is passed through the fully connected layer with a residual network to obtain the node's output vector. The output vector of the previous Encoder node is input to the next Encoder node. The output vector of the last Encoder node is input to the Decoder end as the Encoder-end output vector.

The Encoder Multi-Head Self-Attention layer is computed as follows:

K_E^i = X_embedding^i W_K^(E,i)    (5)
Q_E^i = X_embedding^i W_Q^(E,i)    (6)
V_E^i = X_embedding^i W_V^(E,i)    (7)
head_E^i = softmax(Q_E^i (K_E^i)^T / √d_k) V_E^i,  Attention_E = Concat(head_E^1, …, head_E^6)    (8)
X_attention = Attention_E + X_embedding    (9)

where X_embedding^i is the i-th short vector after splitting at the Encoder end, and i indexes the Encoder-end short vectors and their corresponding attention matrices. In formula (5), W_K^(E,i) is the Key matrix in the Encoder node and K_E^i is the key vector obtained at the Encoder node. In formula (6), W_Q^(E,i) is the Query matrix in the Encoder node and Q_E^i is the query vector. In formula (7), W_V^(E,i) is the Value matrix in the Encoder node and V_E^i is the value vector. In formula (8), head_E^i is the attention short vector obtained from the i-th attention matrix of the Encoder node; all attention short vectors are concatenated to obtain the node's final attention vector Attention_E. In formula (9), the final attention vector Attention_E is added element-wise to the input vector X_embedding to obtain the layer's output vector X_attention.

The fully connected layer is computed as:

X_out = X_attention W_x + b_x + X_attention    (10)

where W_x is the fully connected layer matrix, b_x is the bias in this formula, and X_out is the node's output vector (the residual term X_attention reflects the residual structure used in every layer). The output vector of each Encoder node is input to the next Encoder node; the output vector of the last Encoder node is input to the Decoder end as the intermediate vector.

All matrices and biases at the Encoder end are initialized from a truncated Gaussian distribution, and training stops when the parameters converge or the maximum of 50 iterations is reached. All vector lengths at the Encoder end are set to 500.
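The Encoder node described above can be sketched in NumPy as follows. This is an illustrative rendering rather than the patent's reference code: the stated sizes (vector length 500, 6 heads) do not divide evenly, so the sketch uses a length of 504, truncated-Gaussian initialization is approximated by clipping a normal sample, and the 6 chained nodes share weights only for brevity.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - np.max(x, axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
d_model, num_heads = 504, 6        # text: length 500, 6 heads; 504 splits evenly
d_head = d_model // num_heads

def trunc_init(*shape, std=0.02):
    # Stand-in for truncated Gaussian initialization: clip at 2 std.
    return np.clip(rng.normal(0.0, std, shape), -2 * std, 2 * std)

W_K = [trunc_init(d_head, d_head) for _ in range(num_heads)]
W_Q = [trunc_init(d_head, d_head) for _ in range(num_heads)]
W_V = [trunc_init(d_head, d_head) for _ in range(num_heads)]
W_x, b_x = trunc_init(d_model, d_model), np.zeros(d_model)

def encoder_node(X):
    # Split the input into num_heads short vectors (formulas (5)-(7)).
    heads = []
    for i in range(num_heads):
        Xi = X[:, i * d_head:(i + 1) * d_head]
        K, Q, V = Xi @ W_K[i], Xi @ W_Q[i], Xi @ W_V[i]
        heads.append(softmax(Q @ K.T / np.sqrt(d_head)) @ V)
    # Concatenate the short attention vectors and add the residual (8)-(9).
    X_attention = np.concatenate(heads, axis=-1) + X
    # Fully connected layer with residual connection (10).
    return X_attention @ W_x + b_x + X_attention

X = rng.normal(size=(10, d_model))  # a user-feature input of 10 positions
for _ in range(6):                  # num_Encoder = 6 identical nodes in series
    X = encoder_node(X)
```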

Decoder module (4): this module takes the resource or service input vector and the Encoder-end output vector as input, and outputs the final hidden vector from which the final recommendation is obtained. The Decoder end consists of num_Decoder Decoder nodes, set to 6. A Decoder node consists of a Decoder Multi-Head Self-Attention layer, an Encoder-Decoder Self-Attention layer, and a fully connected layer, each layer using a residual structure. The Decoder Multi-Head Self-Attention layer contains num_Attention attention matrices, set to 6. The resource or service input vector is split element-wise into 6 short vectors; each short vector is fed into its corresponding attention matrix to obtain the corresponding short attention vector; the 6 short attention vectors are concatenated, and the node's final attention vector is obtained through the residual network. The attention vector is passed through a fully connected layer with a residual network to obtain the node's output vector. The output vector of the previous Decoder node is input to the next Decoder node. The output vector of the last Decoder node is the final output vector; the final classification vector is obtained through a softmax layer, and the top K resources or services with the largest values in the classification vector are the user's recommended resources or services, with K set to 6.

The Decoder Multi-Head Self-Attention layer is computed as follows:

K_D^j = Y_embedding^j W_K^(D,j)    (11)
Q_D^j = Y_embedding^j W_Q^(D,j)    (12)
V_D^j = Y_embedding^j W_V^(D,j)    (13)
head_D^j = softmax(Q_D^j (K_D^j)^T / √d_k) V_D^j,  Attention_D = Concat(head_D^1, …, head_D^6)    (14)
Y_attention = Attention_D + Y_embedding    (15)

where Y_embedding^j is the j-th short vector after splitting at the Decoder end, and j indexes the Decoder-end short vectors and their corresponding attention matrices. In formula (11), W_K^(D,j) is the Key matrix in the Decoder node and K_D^j is the key vector obtained at the Decoder node. In formula (12), W_Q^(D,j) is the Query matrix in the Decoder node and Q_D^j is the query vector. In formula (13), W_V^(D,j) is the Value matrix in the Decoder node and V_D^j is the value vector. In formula (14), head_D^j is the attention short vector obtained from the j-th attention matrix of the Decoder node; all attention short vectors are concatenated to obtain the node's final attention vector Attention_D. In formula (15), the final attention vector Attention_D is added element-wise to the input vector Y_embedding to obtain the layer's output vector Y_attention.

The Encoder-Decoder Self-Attention layer is computed as:

Y_ED = softmax(Y_attention X_out^T / √d) X_out W_y + b_y + Y_attention    (16)

where W_y is the layer's weight matrix, b_y is the layer's bias, Y_attention is the final attention vector output by the previous layer, X_out is the output vector of the Encoder node, and Y_ED is the layer's output vector.
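Since formula (16) names only a single weight matrix and bias, the sketch below shows one assumed realization: decoder positions attend over the Encoder output, the result is projected by W_y and b_y, and the residual adds Y_attention back. It should be read as an interpretation consistent with the named variables, not as the patent's exact formula.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - np.max(x, axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def encoder_decoder_attention(Y_attention, X_out, W_y, b_y):
    # Decoder positions attend over the Encoder output; the attended result
    # is projected by the layer's weight matrix and bias, then the residual
    # connection adds Y_attention back (assumed realization of formula (16)).
    scores = softmax(Y_attention @ X_out.T / np.sqrt(Y_attention.shape[-1]))
    return scores @ X_out @ W_y + b_y + Y_attention

d = 504                               # matching the encoder sketch above
Y_attention = np.random.randn(8, d)   # output of the Decoder self-attention layer
X_out = np.random.randn(10, d)        # output of the Encoder end
W_y, b_y = np.random.randn(d, d) * 0.02, np.zeros(d)
Y_ED = encoder_decoder_attention(Y_attention, X_out, W_y, b_y)  # shape (8, 504)
```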

The fully connected layer is computed as:

Y_out = Y_ED W_f + b_y + Y_ED    (17)

where W_f is the weight matrix of this layer, Y_ED is the output vector of the previous layer, b_y is the bias of this layer, and Y_out is the output vector of this layer.

The output vector of each Decoder node is the input vector of the next Decoder node, and the output vector Y_outfinally of the last Decoder node is the final output vector. A softmax operation is applied to Y_outfinally, and the top K resources or services with the largest values in the classification vector are the user's recommended resources or services, with K set to 6.

All matrices and biases at the Decoder end are initialized with a truncated Gaussian distribution; training ends when the parameters converge or the maximum of 50 iterations is reached. All vector lengths at the Decoder end are set to 500. The final output vector length equals the number of resources or services, set to 2000.
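The final step reduces to a softmax followed by top-K selection; a minimal sketch with the stated sizes (output length 2000, K = 6):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

Y_outfinally = np.random.randn(2000)  # final Decoder output over 2000 services
probs = softmax(Y_outfinally)
K = 6
top_k = np.argsort(probs)[::-1][:K]   # indices of the K most probable services
print("recommended resource/service indices:", top_k.tolist())
```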

Claims (4)

1. A resource or service recommendation method considering user evaluation comprises four modules: a user feature encoding module (1), a resource or service encoding module (2), an Encoder end (3), and a Decoder end (4);
(1) User feature encoding module:
the module performs multi-level encoding of user feature sentences after word segmentation, where word-embedding uses the Skip-Gram form of word2vec and position-embedding uses cos and sin functions to obtain position codes; the two encodings are added element-wise to obtain the user feature vector;
(2) Resource or service encoding module:
the module performs multi-level encoding of the resource or service data corresponding to the user feature sentence, where onehot-embedding sets all resource or service information used by the user to 1 and the rest to 0, QOE-embedding uses softmax to obtain the importance probability of the current resource or service vector, QOS-embedding uses softmax to obtain the importance probability of the current resource or service vector, and the three encodings are multiplied element-wise to obtain the input vector of the final resource or service;
(3) Encoder end:
the module takes the user feature vector as input to obtain an intermediate hidden vector; an Encoder node consists of an Encoder Multi-Head Self-Attention layer and a fully connected layer, each layer using a residual structure; the Encoder end consists of several identical Encoder nodes in series;
(4) Decoder end:
the module takes the resource or service vector and the intermediate hidden vector from the Encoder end as input to obtain the final output vector, and the recommendation result is obtained through softmax; a Decoder node consists of a Decoder Multi-Head Self-Attention layer, an Encoder-Decoder Self-Attention layer, and a fully connected layer, each layer using a residual structure; the Decoder end consists of several identical Decoder nodes in series;
step (4) specifically comprises:
the module takes the input vector of the resource or service and the output vector of the Encoder end as inputs and outputs the final hidden-layer vector to obtain the final recommendation result; the Decoder end consists of num_Decoder Decoder nodes, set to 6; a Decoder node consists of a Decoder Multi-Head Self-Attention layer, an Encoder-Decoder Self-Attention layer, and a fully connected layer, each layer using a residual structure; the Decoder Multi-Head Self-Attention layer contains num_Attention attention matrices, set to 6; the resource or service input vector is split element-wise into 6 short vectors, each short vector is fed into its corresponding attention matrix to obtain the corresponding short attention vector, the 6 short attention vectors are concatenated, and the node's final attention vector is obtained through the residual network; the attention vector is passed through a fully connected layer with a residual network to obtain the node's output vector; the output vector of the previous Decoder node is input to the next Decoder node; the output vector of the last Decoder node is the final output vector, the final classification vector is obtained through a softmax layer, and the top K resources or services with the largest values in the classification vector are the user's recommended resources or services, with K set to 6;
the Decoder Multi-Head Self-Attention layer is computed as follows:

K_D^j = Y_embedding^j W_K^(D,j)    (11)
Q_D^j = Y_embedding^j W_Q^(D,j)    (12)
V_D^j = Y_embedding^j W_V^(D,j)    (13)
head_D^j = softmax(Q_D^j (K_D^j)^T / √d_k) V_D^j,  Attention_D = Concat(head_D^1, …, head_D^6)    (14)
Y_attention = Attention_D + Y_embedding    (15)

where Y_embedding^j is the j-th short vector after splitting at the Decoder end, and j indexes the Decoder-end short vectors and their corresponding attention matrices; in formula (11), W_K^(D,j) is the Key matrix in the Decoder node and K_D^j is the key vector obtained at the Decoder node; in formula (12), W_Q^(D,j) is the Query matrix in the Decoder node and Q_D^j is the query vector; in formula (13), W_V^(D,j) is the Value matrix in the Decoder node and V_D^j is the value vector; in formula (14), head_D^j is the attention short vector obtained from the j-th attention matrix of the Decoder node, and all attention short vectors are concatenated to obtain the node's final attention vector Attention_D;
in formula (15), the final attention vector Attention_D is added element-wise to the input vector Y_embedding to obtain the layer's output vector Y_attention;
the Encoder-Decoder Self-Attention layer is computed as:

Y_ED = softmax(Y_attention X_out^T / √d) X_out W_y + b_y + Y_attention    (16)

where W_y is the layer's weight matrix, b_y is the layer's bias, Y_attention is the final attention vector output by the previous layer, X_out is the output vector of the Encoder node, and Y_ED is the layer's output vector;
the fully connected layer is computed as:

Y_out = Y_ED W_f + b_y + Y_ED    (17)

where W_f is the weight matrix of this layer, Y_ED is the output vector of the previous layer, b_y is the bias of this layer, and Y_out is the output vector of this layer;
the output vector of each Decoder node is the input vector of the next Decoder node, and the output vector Y_outfinally of the last Decoder node is the final output vector; a softmax operation is applied to Y_outfinally, and the top K resources or services with the largest values in the classification vector are the user's recommended resources or services, with K set to 6;
all matrices and biases in the Decoder end are initialized with a truncated Gaussian distribution, and training ends when the parameters converge or reach the maximum of 50 iterations; all vector lengths in the Decoder end are set to 500; the final output vector length equals the number of resources or services, set to 2000.
2. The resource or service recommendation method considering user evaluation according to claim 1, wherein step (1) is specifically as follows:
the module converts the sentence input by the user into the input vector of the Encoder end; word-embedding is first performed on the input sentence at the word level, the word-embedding model using Google's open-source word-vector library; position-embedding is applied to the input sentence according to:

PE(pos, 2k) = sin(pos / 10000^(2k/d_model))    (1)
PE(pos, 2k+1) = cos(pos / 10000^(2k/d_model))    (2)

where pos is the position of the word in the sentence, k is the dimension index of the word, d_model is the model constant, set to 100, and PE is the position encoding result at the k-th dimension of the pos-th word;
the word-embedding vector and the position-embedding vector computed for each word are added element-wise to obtain the user-feature input vector X_embedding.
3. The resource or service recommendation method considering user evaluation according to claim 1, wherein step (2) is specifically as follows:
the module converts the resource or service input into the input vector of the Decoder end; onehot-embedding is applied to the user's corresponding resource or service data by presence, the encoding position of an existing resource or service being set to 1 and otherwise to 0; QOE-embedding encodes the user's quality of experience, computing the importance-weighted score over all resources or services used by the user in softmax form:

s_m = exp(QOE_m) / Σ_n exp(QOE_n)    (3)

where m is the index of the m-th resource or service and QOE_m is the user quality score of the m-th resource or service; the importance-weighted scores of all resources or services obtained through softmax give the QOE-embedding vector;
QOS-embedding encodes quality of service, computing the importance-weighted score over all resources or services used by the user in softmax form:

s_p = exp(QOS_p) / Σ_n exp(QOS_n)    (4)

where p is the index of the p-th resource or service and QOS_p is the quality-of-service score of the p-th resource or service; the importance-weighted scores of all resources or services obtained through softmax give the QOS-embedding vector;
the obtained onehot-embedding, QOE-embedding, and QOS-embedding are multiplied element-wise to obtain the resource or service input vector Y_embedding.
4. The resource or service recommendation method considering user evaluation according to claim 1, wherein step (3) is specifically as follows:
the module takes the user-feature input vector as input and outputs an intermediate vector to the Decoder end; the Encoder end has num_Encoder Encoder nodes, set to 6; an Encoder node has a two-layer structure, a Multi-Head Self-Attention layer and a fully connected layer, each layer using a residual network structure; the Multi-Head Self-Attention layer contains num_Attention attention matrices, set to 6; the user-feature input vector is split element-wise into 6 vectors, each vector is fed into its corresponding attention matrix to obtain the corresponding short attention vector, the 6 short attention vectors are concatenated, and the node's final attention vector is obtained through the residual network; the attention vector is passed through a fully connected layer with a residual network to obtain the node's output vector; the output vector of the previous Encoder node is input to the next Encoder node; the output vector of the last Encoder node is input to the Decoder end as the Encoder-end output vector;
the Encoder Multi-Head Self-Attention layer is computed as follows:

K_E^i = X_embedding^i W_K^(E,i)    (5)
Q_E^i = X_embedding^i W_Q^(E,i)    (6)
V_E^i = X_embedding^i W_V^(E,i)    (7)
head_E^i = softmax(Q_E^i (K_E^i)^T / √d_k) V_E^i,  Attention_E = Concat(head_E^1, …, head_E^6)    (8)
X_attention = Attention_E + X_embedding    (9)

where X_embedding^i is the short vector after the i-th split at the Encoder end, and i indexes the Encoder-end short vectors and their corresponding attention matrices; in formula (5), W_K^(E,i) is the Key matrix in the Encoder node and K_E^i is the key vector obtained at the Encoder node; in formula (6), W_Q^(E,i) is the Query matrix in the Encoder node and Q_E^i is the query vector; in formula (7), W_V^(E,i) is the Value matrix in the Encoder node and V_E^i is the value vector; in formula (8), head_E^i is the attention short vector obtained from the i-th attention matrix of the Encoder node, and all attention short vectors are concatenated to obtain the node's final attention vector Attention_E; in formula (9), the final attention vector Attention_E is added element-wise to the input vector X_embedding to obtain the layer's output vector X_attention;
the fully connected layer is computed as:

X_out = X_attention W_x + b_x + X_attention    (10)

where W_x is the fully connected layer matrix, b_x is the bias in this formula, and X_out is the node's output vector; the output vector of each Encoder node is input to the next Encoder node as the input vector, and the output vector of the last Encoder node is input to the Decoder end as the intermediate vector;
all matrices and biases in the Encoder end are initialized from a truncated Gaussian distribution, and training stops when the parameters converge or the maximum of 50 iterations is reached; all vector lengths in the Encoder end are set to 500.
CN201910321968.7A 2019-04-22 2019-04-22 A Resource or Service Recommendation Method Considering User Evaluation Active CN110069756B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910321968.7A CN110069756B (en) 2019-04-22 2019-04-22 A Resource or Service Recommendation Method Considering User Evaluation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910321968.7A CN110069756B (en) 2019-04-22 2019-04-22 A Resource or Service Recommendation Method Considering User Evaluation

Publications (2)

Publication Number Publication Date
CN110069756A CN110069756A (en) 2019-07-30
CN110069756B (en) 2023-07-21

Family

ID=67368364

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910321968.7A Active CN110069756B (en) 2019-04-22 2019-04-22 A Resource or Service Recommendation Method Considering User Evaluation

Country Status (1)

Country Link
CN (1) CN110069756B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110880141A (en) * 2019-12-04 2020-03-13 中国太平洋保险(集团)股份有限公司 A kind of intelligent matching algorithm and device for deep double tower model
CN116911955B (en) * 2023-09-12 2024-01-05 深圳须弥云图空间科技有限公司 Training method and device for target recommendation model

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106933996A (en) * 2017-02-28 2017-07-07 广州大学 A kind of recommendation method of use depth characteristic matching

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8996530B2 (en) * 2012-04-27 2015-03-31 Yahoo! Inc. User modeling for personalized generalized content recommendations
EP3079116A1 (en) * 2015-04-10 2016-10-12 Tata Consultancy Services Limited System and method for generating recommendations
US20190042952A1 (en) * 2017-08-03 2019-02-07 Beijing University Of Technology Multi-task Semi-Supervised Online Sequential Extreme Learning Method for Emotion Judgment of User
CN109241424B (en) * 2018-08-29 2019-08-27 陕西师范大学 a recommended method
CN109299396B (en) * 2018-11-28 2020-11-06 东北师范大学 Convolutional neural network collaborative filtering recommendation method and system fusing attention model
CN109598586B (en) * 2018-11-30 2022-11-15 哈尔滨工程大学 Recommendation method based on attention model

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106933996A (en) * 2017-02-28 2017-07-07 广州大学 A kind of recommendation method of use depth characteristic matching

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Deep learning movie recommendation system based on a dual-layer attention mechanism; 肖青秀, 汤鲲; Computer and Modernization (11); full text *

Also Published As

Publication number Publication date
CN110069756A (en) 2019-07-30

Similar Documents

Publication Publication Date Title
Zhang et al. A unified multi-task semantic communication system for multimodal data
CN107632987B (en) A kind of dialogue generation method and device
CN109522403B (en) A Method of Abstract Text Generation Based on Fusion Coding
CN110188182A (en) Model training method, dialogue generation method, device, equipment and medium
CN111061847A (en) Dialogue generation and corpus expansion method and device, computer equipment and storage medium
CN112560456B (en) Method and system for generating generated abstract based on improved neural network
CN113254616B (en) Intelligent question-answering system-oriented sentence vector generation method and system
CN111401081B (en) Neural network machine translation method, model and model formation method
CN114065033A (en) Training method of graph neural network model for recommending Web service combination
CN111401003B (en) Method for generating humor text with enhanced external knowledge
WO2023231513A1 (en) Conversation content generation method and apparatus, and storage medium and terminal
CN110069756B (en) A Resource or Service Recommendation Method Considering User Evaluation
CN114048301B (en) Satisfaction-based user simulation method and system
CN108540267A (en) A kind of multi-user data information detecting method and device based on deep learning
CN113111190A (en) Knowledge-driven dialog generation method and device
CN112100486A (en) Deep learning recommendation system and method based on graph model
CN114861885A (en) Model training method, related equipment and readable storage medium based on knowledge distillation
CN114912441A (en) Text error correction model generation method, error correction method, system, device and medium
CN111158640A (en) One-to-many demand analysis and identification method based on deep learning
CN117313716A (en) Memory enhancement method, device and storage medium for natural language association features
CN118210929A (en) Knowledge graph structure optimization method based on graph comparison learning in self-supervision scene
CN112445899B (en) Attribute matching method in knowledge base question and answer based on neural network
CN116308219A (en) Generated RPA flow recommendation method and system based on Tranformer
CN113377907B (en) An end-to-end task-based dialogue system based on memory mask self-attention network
CN114564568A (en) Dialogue state tracking method and system based on knowledge enhancement and context awareness

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20241212

Address after: 518000 1002, Building A, Zhiyun Industrial Park, No. 13, Huaxing Road, Henglang Community, Longhua District, Shenzhen, Guangdong Province

Patentee after: Shenzhen Wanzhida Technology Co.,Ltd.

Country or region after: China

Address before: 100124 No. 100 Chaoyang District Ping Tian Park, Beijing

Patentee before: Beijing University of Technology

Country or region before: China

TR01 Transfer of patent right

Effective date of registration: 20250425

Address after: 100000 floor 3, No. 13, Yanqi street, Yanqi Economic Development Zone, Huairou District, Beijing (cluster registration)

Patentee after: Beijing Ruiyun Haohai Technology Co.,Ltd.

Country or region after: China

Address before: 518000 1002, Building A, Zhiyun Industrial Park, No. 13, Huaxing Road, Henglang Community, Longhua District, Shenzhen, Guangdong Province

Patentee before: Shenzhen Wanzhida Technology Co.,Ltd.

Country or region before: China