
CN109033463B - Community question-answer content recommendation method based on end-to-end memory network - Google Patents

Community question-answer content recommendation method based on end-to-end memory network

Info

Publication number
CN109033463B
CN109033463B
Authority
CN
China
Prior art keywords
title
memory
vector
layer
matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201811008620.4A
Other languages
Chinese (zh)
Other versions
CN109033463A (en)
Inventor
陈细玉
林穗
孙为军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong University of Technology filed Critical Guangdong University of Technology
Priority to CN201811008620.4A priority Critical patent/CN109033463B/en
Publication of CN109033463A publication Critical patent/CN109033463A/en
Application granted granted Critical
Publication of CN109033463B publication Critical patent/CN109033463B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract


The invention discloses a community question-answer content recommendation method based on an end-to-end memory network. First, titles are obtained as a data set, the data set is preprocessed and divided into a training set, a validation set and a test set; then an end-to-end memory network model is built from the data set; finally, the model is optimized using stochastic gradient descent (SGD) with the AdaGrad update rule.


Description

Community question-answer content recommendation method based on end-to-end memory network
Technical Field
The invention relates to the field of content recommendation, and in particular to a community question-answer content recommendation method based on an end-to-end memory network.
Background
Online community question answering (Zhihu, for example) is currently a major platform for solving problems and sharing knowledge and experience. The range of information on such a platform is wide, but not all of it interests every user, so the content a user is interested in needs to be recommended to that user, which increases user stickiness.
Disclosure of Invention
The invention aims to remedy one or more of the above defects by providing a community question-answer content recommendation method based on an end-to-end memory network.
In order to achieve the above purpose, the technical solution is as follows:
A community question-answer content recommendation method based on an end-to-end memory network comprises the following steps:
S1: obtaining titles as a data set, preprocessing the data set, and dividing it into a training set, a validation set and a test set;
S2: building an end-to-end memory network model from the data set;
S3: optimizing the model using stochastic gradient descent (SGD) with the AdaGrad update rule.
Preferably, the data set in step S1 is divided equally into a training set, a validation set and a test set.
Preferably, the titles in step S1 are the content titles of the user's current browsing and historical behaviors in the community question-answer platform.
Preferably, the end-to-end memory model comprises a single-layer model and a multi-layer model, wherein the single-layer model comprises a memory component, an input component and an output component;
The memory component: the title set of historical behaviors D = {x_1, x_2, ..., x_n} is stored; using a matrix A of size dim × |V|, each word w_ij ∈ x_i is embedded into a d-dimensional memory vector {a_ij}, such that a_ij = A·w_ij, and the entire sentence set {x_i} is converted by the matrix A into d-dimensional memory vectors {a_i};
The input component: the title q currently being browsed is converted by a matrix B into a vector b, and the matching degree between b and each memory a_i is computed as p_i = Softmax(b^T a_i), where Softmax(z_i) = e^{z_i} / Σ_j e^{z_j} and p is the probability vector over the inputs;
The output component: the title set of historical behaviors D = {x_1, x_2, ..., x_n} is converted by a matrix C into d-dimensional output vectors c_i, and the output o is the weighted sum of the output vectors c_i and the probability vector:
o = Σ_i p_i c_i;
The final prediction is f = Softmax(W(o + b));
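For concreteness, the single-layer forward pass just described can be written as a short numpy sketch. This is an illustrative reading of the formulas above, not code from the patent; the bag-of-words input encoding and all shape choices are assumptions.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())      # shift for numerical stability
    return e / e.sum()

def single_layer_forward(q_bow, D_bow, A, B, C, W):
    """One pass of the single-layer model.
    q_bow : (V,)   0-1 bag-of-words vector of the title being browsed
    D_bow : (n, V) 0-1 bag-of-words vectors of the n memory titles
    A, B, C : (d, V) embedding matrices; W : (num_labels, d)
    """
    b = B @ q_bow                # input title embedding b, shape (d,)
    a = D_bow @ A.T              # memory vectors a_i, shape (n, d)
    c = D_bow @ C.T              # output vectors c_i, shape (n, d)
    p = softmax(a @ b)           # p_i = Softmax(b^T a_i)
    o = p @ c                    # o = sum_i p_i * c_i
    return softmax(W @ (o + b))  # f = Softmax(W(o + b))
```

Note that multiplying a 0-1 bag-of-words row by A.T sums the embeddings of the words in the sentence, which is exactly the sentence representation the patent describes.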
In the multi-layer model, the title q of the input component is the sum of the previous hop's input b and output o, i.e. the input of layer k+1 is the sum of the output o^k and the input b^k of layer k: b^{k+1} = o^k + b^k;
Each layer has its own embedding matrices A^k, C^k for embedding the input {x_i}.
Preferably, the multi-layer model further comprises a sentence representation: for each sentence x_i = {x_i1, x_i2, ..., x_in}, each word is embedded and the resulting vectors are summed, and a temporal representation is added; each word vector is a 0-1 vector of length |V|, such that a_i = Σ_j A x_ij + T_A(i), where T_A(i) is the i-th row of a special matrix T_A that encodes temporal information; similarly, for the output embedding, c_i = Σ_j C x_ij + T_C(i); both T_A and T_C are learned during training.
Preferably, the multi-layer model further comprises word similarity: for the currently browsed title q in the first layer, keywords in memory whose similarity to keywords in q exceeds 0.8 are added to q, which avoids assigning too low a weight to memory titles that have the same or similar meaning as q but use different words;
from the corpus consisting of all preprocessed titles, the keywords of the title being browsed are selected, and pairwise word similarity is computed against the remaining keywords, using the formula:
[similarity formula given as an image in the original document]
where y_i is the coefficient for the layer i at which the words w1 and w2 begin to branch.
Preferably, the evaluation criteria of the model are precision, recall and F1 score.
Compared with the prior art, the invention has the beneficial effects that:
the end-to-end memory network can remember a large amount of user behaviors and add time, so that the interest prediction of the user is more accurate and reliable. And reducing supervision items by adopting end-to-end training. The attention mechanism is included, so that different titles have different weights, the predicted interest points can be sequenced, the recommended emphasis points are different, the interest points with large weights are ranked highly, and the recommended content of the interest points is more than that of other interest points. And the word similarity is added, so that the prediction is more accurate.
Drawings
FIG. 1 is a flow chart of the present invention.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the patent;
the invention is further illustrated below with reference to the figures and examples.
Example 1
Referring to fig. 1, a community question-answer content recommendation method based on an end-to-end memory network includes the following steps:
S1: obtaining titles as a data set, preprocessing the data set, and dividing it into a training set, a validation set and a test set;
S2: building an end-to-end memory network model from the data set;
S3: optimizing the model using stochastic gradient descent (SGD) with the AdaGrad update rule.
For example, questions and their answers on Zhihu tend toward sharing knowledge, rather than simply being answered as on Baidu Zhidao. Each question is short and descriptive, so the question itself serves as the title. All acquired titles are preprocessed: each title is first segmented into words, and stop words and special characters are deleted; because Zhihu questions contain many words such as '原因' (reason), '怎么' (how) and '经历' (experience), these words are also deleted, which prevents common irrelevant words from gaining too much weight and drowning out the required keywords. The maximum sentence length is set to 50, and content beyond that is truncated. The data set is divided equally into a training set, a validation set and a test set.
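A sketch of this preprocessing pipeline, assuming the jieba segmenter for Chinese word segmentation; the stop-word and filtered-word sets below are illustrative stand-ins for the lists the text alludes to.

```python
import jieba

STOPWORDS = {"的", "了", "啊"}                # illustrative stop words
FILTERED = {"原因", "怎么", "如何", "经历"}    # common question words to drop
MAX_LEN = 50                                  # maximum title length in words

def preprocess_title(title: str) -> list:
    """Segment a title, drop stop/filtered words, truncate to MAX_LEN."""
    tokens = [w for w in jieba.lcut(title)
              if w.strip() and w not in STOPWORDS and w not in FILTERED]
    return tokens[:MAX_LEN]

def split_evenly(titles: list):
    """Divide the preprocessed data set equally into train/val/test."""
    n = len(titles) // 3
    return titles[:n], titles[n:2 * n], titles[2 * n:]
```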
Titles from the user's historical behaviors are selected as the memory in the model. Historical behaviors include recently browsed titles (excluding the one currently being browsed) as well as titles the user has upvoted, answered or followed. The 5 most recent titles are selected by time, since the recommended content should relate to the user's latest interests, and the selected titles are sorted by the user's operation time to form the title set D. Tests show better results when the embedding dimension of each title is 300 to 500.
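The memory-selection step might look like the following sketch; the record format (a dict with 'title' and 'timestamp' keys) is an assumption made for illustration.

```python
def build_memory(history, current_title, k=5):
    """Select the k most recent historical titles (browsed, upvoted,
    answered or followed), excluding the title currently being browsed,
    and keep them ordered by operation time to form the title set D."""
    records = [r for r in history if r["title"] != current_title]
    records.sort(key=lambda r: r["timestamp"])  # oldest -> newest
    return [r["title"] for r in records[-k:]]
```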
In this embodiment, the end-to-end memory model includes a single-layer model and a multi-layer model; wherein the single-layer model comprises a memory component, an input component, and an output component;
The memory component: the title set of historical behaviors D = {x_1, x_2, ..., x_n} is stored; using a matrix A of size dim × |V|, each word w_ij ∈ x_i is embedded into a d-dimensional memory vector {a_ij}, such that a_ij = A·w_ij, and the entire sentence set {x_i} is converted by the matrix A into d-dimensional memory vectors {a_i};
The input component represents: the forward browsing title q is converted into vector B by B matrix, B is calculated and a is memorizediThe matching degree between the two formulas is as follows: p is a radical ofi=Softmax(bTai) (ii) a Wherein Softmax (z)i)=eZi/∑jeZjP is the probability vector on the input;
The output component: the title set of historical behaviors D = {x_1, x_2, ..., x_n} is converted by a matrix C into d-dimensional output vectors c_i, and the output o is the weighted sum of the output vectors c_i and the probability vector:
o = Σ_i p_i c_i;
The final prediction is f = Softmax(W(o + b));
In the multi-layer model, the title q of the input component is the sum of the previous hop's input b and output o, i.e. the input of layer k+1 is the sum of the output o^k and the input b^k of layer k: b^{k+1} = o^k + b^k;
Each layer has its own embedding matrices A^k, C^k for embedding the input {x_i}.
In this embodiment, the multi-layer model further includes a sentence representation: for each sentence x_i = {x_i1, x_i2, ..., x_in}, each word is embedded and the resulting vectors are summed, and a temporal representation is added; each word vector is a 0-1 vector of length |V|, such that a_i = Σ_j A x_ij + T_A(i), where T_A(i) is the i-th row of a special matrix T_A that encodes temporal information; similarly, for the output embedding, c_i = Σ_j C x_ij + T_C(i); both T_A and T_C are learned during training.
Each of the matrices A, B, C and W is obtained by training. To reduce the number of parameters and ease training, the first-hop matrix satisfies A^1 = B, the last-hop matrix satisfies W^T = C^K, and the memory matrix of every other hop equals the output matrix of the previous hop, that is, A^{k+1} = C^k; the temporal matrices T_A and T_C have their parameters reduced in the same way.
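Putting the multi-layer pieces together (the hop recursion b^{k+1} = o^k + b^k, the adjacent weight tying A^1 = B, A^{k+1} = C^k, W^T = C^K, and the temporal rows T_A(i) and T_C(i)), a forward pass might be sketched as follows; shapes and the bag-of-words encoding are again assumptions.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def multi_hop_forward(q_bow, D_bow, E, T, K):
    """K-hop forward pass with adjacent weight tying.
    E : list of K+1 matrices, each (d, V); A^k = E[k-1] and C^k = E[k],
        so A^1 = B = E[0], A^{k+1} = C^k, and W^T = C^K = E[K].
    T : list of K+1 temporal matrices, each (n, d), tied like E.
    q_bow : (V,) 0-1 vector; D_bow : (n, V) memory titles.
    """
    b = E[0] @ q_bow                       # b^1, the hop-1 input embedding
    for k in range(K):
        a = D_bow @ E[k].T + T[k]          # a_i = sum_j A x_ij + T_A(i)
        c = D_bow @ E[k + 1].T + T[k + 1]  # c_i = sum_j C x_ij + T_C(i)
        p = softmax(a @ b)                 # attention over the memory
        o = p @ c                          # o^k = sum_i p_i c_i
        b = o + b                          # b^{k+1} = o^k + b^k
    return softmax(E[K].T @ b)             # f = Softmax(W(o + b)), W^T = C^K
```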
In this embodiment, the multi-layer model further includes word similarity: for the currently browsed title q in the first layer, keywords in memory whose similarity to keywords in q exceeds 0.8 are added to q, which avoids assigning too low a weight to memory titles that have the same or similar meaning as q but use different words;
from the corpus consisting of all preprocessed titles, the keywords of the title being browsed are selected, and pairwise word similarity is computed against the remaining keywords, using the formula:
[similarity formula given as an image in the original document]
where y_i is the coefficient for the layer i at which the words w1 and w2 begin to branch.
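Because the patent's similarity formula survives only as an image (the text says it involves a coefficient y_i for where two words branch in some layered hierarchy), the sketch below substitutes cosine similarity over pretrained word vectors purely to illustrate the query-expansion step; only the 0.8 threshold comes from the text.

```python
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def expand_query(q_words, memory_keywords, vecs, threshold=0.8):
    """Add to q every memory keyword whose similarity to some word in q
    exceeds the threshold, so memory titles that share meaning with q
    but not surface words still receive a reasonable attention weight."""
    expanded = list(q_words)
    for m in memory_keywords:
        if m in expanded or m not in vecs:
            continue
        if any(w in vecs and cosine(vecs[w], vecs[m]) > threshold
               for w in q_words):
            expanded.append(m)
    return expanded
```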
The model's prediction is taken as the user's most recent interest points; for each browsed title, the top 5 predicted interest points are selected by rank. These interest points are used as tags, and hot content corresponding to the tags is recommended; for example, if a predicted result's tags include 'friend', hot content carrying the 'friend' tag is recommended.
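A sketch of this tag-selection step, assuming the prediction f is a score vector over a tag vocabulary:

```python
import numpy as np

def top_interest_tags(f, tag_vocab, k=5):
    """Return the k highest-scoring entries of the prediction vector f
    as the user's most recent interest tags."""
    top = np.argsort(f)[::-1][:k]
    return [tag_vocab[i] for i in top]
```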
In this embodiment, the evaluation criteria of the model are precision, recall and F1 score.
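These metrics could be computed per recommendation list as in the following sketch, using set-based definitions:

```python
def precision_recall_f1(recommended, relevant):
    """Set-based precision, recall and F1 for one recommendation list."""
    rec, rel = set(recommended), set(relevant)
    hits = len(rec & rel)
    p = hits / len(rec) if rec else 0.0
    r = hits / len(rel) if rel else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1
```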
It should be understood that the above-described embodiments of the present invention are merely examples for clearly illustrating the invention and are not intended to limit its embodiments. Other variations and modifications will be apparent to persons skilled in the art in light of the above description; it is neither necessary nor possible to list all embodiments exhaustively here. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the claims of the present invention.

Claims (4)

1. A community question-answer content recommendation method based on an end-to-end memory network, characterized by comprising the following steps:
S1: obtaining titles as a data set, preprocessing the data set, and dividing it into a training set, a validation set and a test set;
S2: building an end-to-end memory network model from the data set;
S3: optimizing the model using stochastic gradient descent (SGD) with the AdaGrad update rule;
wherein the end-to-end memory model comprises a single-layer model and a multi-layer model, the single-layer model comprising a memory component, an input component and an output component;
the memory component: the title set of historical behaviors D = {x_1, x_2, ..., x_n} is stored; using a matrix A of size dim × |V|, each word w_ij ∈ x_i is embedded into a d-dimensional memory vector {a_ij}, such that a_ij = A·w_ij, and the entire sentence set {x_i} is converted by the matrix A into d-dimensional memory vectors {a_i};
the input component: the title q currently being browsed is converted by a matrix B into a vector b, and the matching degree between b and each memory a_i is computed as p_i = Softmax(b^T a_i), where Softmax(z_i) = e^{z_i} / Σ_j e^{z_j} and p is the probability vector over the inputs;
the output component: the title set of historical behaviors D = {x_1, x_2, ..., x_n} is converted by a matrix C into d-dimensional output vectors c_i, and the output o is the weighted sum of the output vectors c_i and the probability vector:
o = Σ_i p_i c_i;
the final prediction is f = Softmax(W(o + b));
in the multi-layer model, the title q of the input component is the sum of the previous hop's input b and output o, i.e. the input of layer k+1 is the sum of the output o^k and the input b^k of layer k: b^{k+1} = o^k + b^k;
each layer has its own embedding matrices A^k, C^k for embedding the input {x_i};
the multi-layer model further comprises a sentence representation: for each sentence x_i = {x_i1, x_i2, ..., x_in}, each word is embedded and the resulting vectors are summed, and a temporal representation is added; each word vector is a 0-1 vector of length |V|, such that a_i = Σ_j A x_ij + T_A(i), where T_A(i) is the i-th row of a special matrix T_A that encodes temporal information; similarly, for the output embedding, c_i = Σ_j C x_ij + T_C(i); both T_A and T_C are learned during training;
the multi-layer model further comprises word similarity: for the currently browsed title q in the first layer, keywords in memory whose similarity to keywords in q exceeds 0.8 are added to q, which avoids assigning too low a weight to memory titles that have the same or similar meaning as q but use different words;
from the corpus consisting of all preprocessed titles, the keywords of the title being browsed are selected and pairwise word similarity is computed against the remaining keywords, using the formula:
[similarity formula given as an image in the original document]
where y_i is the coefficient for the layer i at which the words w1 and w2 begin to branch.
2. The community question-answer content recommendation method based on an end-to-end memory network according to claim 1, characterized in that the data set in step S1 is divided equally into a training set, a validation set and a test set.
3. The community question-answer content recommendation method based on an end-to-end memory network according to claim 1, characterized in that the titles in step S1 are the content titles of the user's current browsing and historical behaviors in the community question-answer platform.
4. The community question-answer content recommendation method based on an end-to-end memory network according to claim 1, characterized in that the evaluation criteria of the model are precision, recall and F1 score.
CN201811008620.4A 2018-08-28 2018-08-28 Community question-answer content recommendation method based on end-to-end memory network Expired - Fee Related CN109033463B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811008620.4A CN109033463B (en) 2018-08-28 2018-08-28 Community question-answer content recommendation method based on end-to-end memory network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811008620.4A CN109033463B (en) 2018-08-28 2018-08-28 Community question-answer content recommendation method based on end-to-end memory network

Publications (2)

Publication Number Publication Date
CN109033463A CN109033463A (en) 2018-12-18
CN109033463B true CN109033463B (en) 2021-11-26

Family

ID=64625982

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811008620.4A Expired - Fee Related CN109033463B (en) 2018-08-28 2018-08-28 Community question-answer content recommendation method based on end-to-end memory network

Country Status (1)

Country Link
CN (1) CN109033463B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110134771B (en) * 2019-04-09 2022-03-04 广东工业大学 Implementation method of multi-attention-machine-based fusion network question-answering system
CN110188272B (en) * 2019-05-27 2023-04-21 南京大学 A tag recommendation method for community Q&A websites based on user background

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106126596A (en) * 2016-06-20 2016-11-16 中国科学院自动化研究所 A kind of answering method based on stratification memory network
CN106407316A (en) * 2016-08-30 2017-02-15 北京航空航天大学 Topic model-based software question and answer recommendation method and device
CN108133038A (en) * 2018-01-10 2018-06-08 重庆邮电大学 A kind of entity level emotional semantic classification system and method based on dynamic memory network

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9063975B2 (en) * 2013-03-15 2015-06-23 International Business Machines Corporation Results of question and answer systems
US20140030688A1 (en) * 2012-07-25 2014-01-30 Armitage Sheffield, Llc Systems, methods and program products for collecting and displaying query responses over a data network
US20180165361A1 (en) * 2016-12-09 2018-06-14 At&T Intellectual Property I, L.P. Mapping service and resource abstractions to network inventory graph database nodes and edges
CN107330130B (en) * 2017-08-29 2020-10-20 北京易掌云峰科技有限公司 Method for realizing conversation robot recommending reply content to manual customer service

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106126596A (en) * 2016-06-20 2016-11-16 中国科学院自动化研究所 A kind of answering method based on stratification memory network
CN106407316A (en) * 2016-08-30 2017-02-15 北京航空航天大学 Topic model-based software question and answer recommendation method and device
CN108133038A (en) * 2018-01-10 2018-06-08 重庆邮电大学 A kind of entity level emotional semantic classification system and method based on dynamic memory network

Also Published As

Publication number Publication date
CN109033463A (en) 2018-12-18

Similar Documents

Publication Publication Date Title
CN110825879B (en) Decide a case result determination method, device, equipment and computer readable storage medium
US11275895B1 (en) Generating author vectors
CN106815252B (en) Searching method and device
CN111708873A (en) Intelligent question answering method and device, computer equipment and storage medium
US20160378863A1 (en) Selecting representative video frames for videos
CN110263235A (en) Information pushes object updating method, device and computer equipment
CN112487291B (en) Big data-based personalized news recommendation method and device
US10997703B1 (en) Methods and systems for automated attractiveness prediction
WO2014160282A1 (en) Classifying resources using a deep network
CN113505307B (en) Social network user region identification method based on weak supervision enhancement
CN111309887B (en) Method and system for training text key content extraction model
CN109992674B (en) Recommendation method fusing automatic encoder and knowledge graph semantic information
CN114443956B (en) Content recommendation method and related equipment
CN113609248B (en) Word weight generation model training method and device, and word weight generation method and device
CN110321421B (en) Expert recommendation method for website knowledge community system and computer storage medium
US20110219299A1 (en) Method and system of providing completion suggestion to a partial linguistic element
CN112380421A (en) Resume searching method and device, electronic equipment and computer storage medium
CN113011172A (en) Text processing method and device, computer equipment and storage medium
WO2020135642A1 (en) Model training method and apparatus employing generative adversarial network
KR102612805B1 (en) Method, device and system for providing media curation service based on artificial intelligence model according to company information
CN109033463B (en) Community question-answer content recommendation method based on end-to-end memory network
CN116205700A (en) Recommendation method and device for target product, computer equipment and storage medium
CN117556004A (en) A knowledge question and answer method, device and storage medium based on food engineering
CN111159242B (en) Client reordering method and system based on edge calculation
CN115329085B (en) A social robot classification method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20211126