
CN118965037B - Automatic training method for authoring model based on user behavior preference data - Google Patents

Automatic training method for authoring model based on user behavior preference data

Info

Publication number
CN118965037B
Authority
CN
China
Prior art keywords
interest
model
feature
data set
behavior
Prior art date
Legal status
Active
Application number
CN202411450328.3A
Other languages
Chinese (zh)
Other versions
CN118965037A (en)
Inventor
吴立军
王晓龙
赖永炫
Current Assignee
Xiamen 20000 Li Culture Media Co ltd
Original Assignee
Xiamen 20000 Li Culture Media Co ltd
Priority date
Filing date
Publication date
Application filed by Xiamen 20000 Li Culture Media Co ltd filed Critical Xiamen 20000 Li Culture Media Co ltd
Priority to CN202411450328.3A priority Critical patent/CN118965037B/en
Publication of CN118965037A publication Critical patent/CN118965037A/en
Application granted granted Critical
Publication of CN118965037B publication Critical patent/CN118965037B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/10Pre-processing; Data cleansing
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/10Pre-processing; Data cleansing
    • G06F18/15Statistical pre-processing, e.g. techniques for normalisation or restoring missing data
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides an automatic training method for an authoring model based on user behavior preference data, and relates to the technical field of model training. The method comprises: preprocessing a multidimensional initial behavior data set to obtain a behavior data set; carrying out time sequence analysis and situation association analysis on the behavior data set to obtain time sequence situation characteristic data; segmenting the time sequence situation characteristic data according to time sequence to obtain a plurality of interest state fragments; carrying out clustering analysis on the interest state fragments to obtain a dynamic interest group data set of the user; constructing a dynamic interest model according to the dynamic interest group data set; monitoring newly added behavior data of the user in real time and carrying out local updating and global optimization on the dynamic interest model; generating a training data set according to the dynamic interest model; and optimizing a creation model through the training data set to obtain the trained creation model. The invention improves the accuracy of the content recommended by the creation model.

Description

Automatic training method for authoring model based on user behavior preference data
Technical Field
The invention relates to the technical field of model training, and in particular to an automatic training method for an authoring model based on user behavior preference data.
Background
In existing authoring model training methods based on user behavior preference data, the user's interest tags or preference portraits are usually extracted by analyzing clicking, browsing, collecting and other behaviors, and are then used to train a content generation model. However, these methods often fail to accurately capture dynamic changes in user interests. Current techniques commonly adopt a static data processing approach, labeling preferences based on the user's behavior over a fixed period, which ignores the timeliness and changing trends of the user's interests. For example, a user may show high-frequency attention to a certain class of content for a period of time, but their interests gradually shift or change over time. If model training fails to update the user behavior data in time, the generated content cannot reflect the user's latest interests, and the relevance and accuracy of the recommendation results are reduced.
This static data processing approach is particularly problematic in high-frequency content recommendation scenarios. For example, on short video platforms, users' content preferences often change rapidly under the influence of hot topics and breaking news. If the system cannot dynamically adjust the model's training data to reflect these changes in time, the recommended content becomes disconnected from the user's current interests, degrading the user experience. Accordingly, the prior art leaves considerable room for improvement in capturing dynamic changes in user preferences.
Disclosure of Invention
The invention aims to provide an automatic training method for an authoring model based on user behavior preference data, so as to solve the problems identified in the background art.
In order to solve the technical problems, the technical scheme of the invention is as follows:
An authoring model automated training method based on user behavior preference data, the method comprising:
Acquiring behavior data of a user, and combining context-related information when the data occur to obtain a multidimensional initial behavior data set;
carrying out multi-level preprocessing on the multi-dimensional initial behavior data set to obtain a behavior data set;
Performing time sequence analysis and context correlation analysis on the behavior data set, and extracting time sequence features and context features to obtain time sequence context feature data;
the time sequence situation characteristic data are segmented according to the time sequence, and each segment of data is mapped into multiple interest states of a user in a corresponding time period respectively, so that multiple interest state segments are obtained;
Performing multistage cluster analysis on the interest state fragments, and dividing the interest state fragments into a short-term interest group and a long-term interest group to obtain a dynamic interest group data set of a user;
Constructing a dynamic interest model of the user according to the dynamic interest group data set;
monitoring newly-added behavior data of a user in real time, and carrying out local updating and global optimization on a dynamic interest model, wherein the local updating is used for adjusting a short-term interest group, and the global optimization is used for adjusting a long-term interest group;
generating a training data set according to the dynamic interest model, wherein the training data set comprises short-term interest feature vectors, long-term interest feature vectors and situation weight vectors of the user;
and performing incremental training and transfer learning on a preliminary creation model through the training data set, and optimizing the model parameters to obtain the creation model.
Preferably, the multi-level preprocessing is performed on the multi-dimensional initial behavior data set to obtain a behavior data set, including:
Screening the click, browse and collection data in the multidimensional initial behavior data set according to a preset behavior frequency or preset duration to obtain a first preprocessed data set, and assigning time decay factors to the data of different behavior types in the first preprocessed data set according to the context information;
Performing multistage noise removal on the first preprocessed data to obtain second preprocessed data;
And weighting each type of behavior data in the second preprocessed data to obtain a weight coefficient for each behavior type, where the weight coefficient of the i-th behavior type is computed from its behavior frequency or duration, its priority coefficient, its regulatory factor, the time interval, its time decay factor, a smoothing parameter and a bias term, normalized over all n behavior types (j indexes the behavior types and e is the base of the natural logarithm);
And carrying out normalization processing on the second preprocessing data according to the weight coefficient to obtain a behavior data set.
Preferably, performing time sequence analysis and context correlation analysis on the behavior data set, extracting time sequence features and context features, and obtaining time sequence context feature data, including:
Slicing the behavior data set according to the time window to obtain a plurality of time window segments;
the time window segment length is computed from the time differences between adjacent behavior data, the amount of behavior data within the window, an adjustment factor of the time window and an adjustment parameter, where v indexes the time differences and y is the number of time differences;
Counting the behavior data in each time window segment, and extracting time sequence characteristic values of click times, browsing duration and collection frequency;
And carrying out association analysis on the context information of each time window, extracting corresponding situation characteristics, and splicing the time sequence characteristic values and the situation characteristics to obtain time sequence situation characteristic vectors.
Preferably, the time sequence situation characteristic data is segmented according to time sequence, each segment of data is mapped into multiple interest states of the user in a corresponding time period, and a plurality of interest state segments are obtained, including:
Calculating the similarity between the time sequence situation characteristic vectors of two adjacent time windows, where the similarity is computed as a weighted comparison over the m dimensions of the feature vectors, with each dimension b assigned a feature weight and the feature values of the two adjacent time windows compared on that dimension;
When the similarity is greater than a preset threshold, merging the time sequence situation characteristic vectors of the two adjacent time windows into a merged interest state fragment; otherwise, marking them as different interest state fragments;
And carrying out smoothing treatment on the combined interest state fragments to obtain the interest state fragments.
Preferably, the multi-level clustering analysis is performed on the interest state segments, and the interest state segments are divided into short-term interest groups and long-term interest groups, so as to obtain a dynamic interest group data set of the user, which comprises the following steps:
Short-term interest similarity calculation is carried out on each interest state segment to obtain a similarity matrix, where each element of the matrix compares the feature values of the c-th and d-th interest state segments over the f feature dimensions, with each dimension e assigned a feature weight, together with a regularization coefficient and a scaling parameter;
performing density clustering on the similarity matrix to obtain a short-term interest group;
and reclustering the central points of the short-term interest groups, combining the short-term interest groups into long-term interest groups, and obtaining a dynamic interest group data set.
Preferably, constructing a dynamic interest model of the user according to the dynamic interest group dataset comprises:
extracting features of the dynamic interest group data set to obtain a short-term interest feature data set and a long-term interest feature data set;
the short-term interest feature dataset is obtained from the input feature matrix through a multi-head self-attention computation: the short-term interest feature vector is produced from the input feature matrix using a weight matrix of the input features, an offset vector and an adjusting coefficient, where s relates to the input feature matrix, q is the index of the attention head and h is the number of attention heads; each head computes attention from a query matrix, a key matrix and a value matrix scaled by the dimension of the key, with a softmax function converting the dot-product result into a probability distribution and a ReLU or Swish activation function applied to the output;
The long-term interest feature dataset is obtained by concatenating the outputs of a multi-head self-attention mechanism, where r is the number of attention heads, and projecting the concatenation with an output transformation matrix and a weight parameter to produce the long-term interest feature vector;
and carrying out weighted fusion on the short-term interest characteristic data set and the long-term interest characteristic data set to obtain the dynamic interest model.
Preferably, the method for monitoring the newly added behavior data of the user in real time, and carrying out local updating and global optimization on the dynamic interest model comprises the following steps:
The newly added behavior data of the user is monitored in real time, and the difference between the newly added behavior data and the historical behavior data is calculated to obtain a differential vector;
The short-term interest group is locally updated and the long-term interest group is globally optimized, where both updates are driven by the differential vector scaled by a first learning rate, with an attenuation coefficient applied over the time interval t.
Preferably, generating the training data set according to the dynamic interest model comprises:
fusing the feature vectors in the short-term interest feature data set and the long-term interest feature data set to obtain a multidimensional feature data set;
Performing dimension reduction on the multidimensional feature data set to obtain the feature data set, where the output feature matrix is computed from the input feature matrix using an identity matrix, a regularization parameter and a graph Laplacian matrix;
And labeling the characteristic data set to generate a user behavior label, thereby obtaining a training data set.
Preferably, incremental training and transfer learning are performed on the authoring model through a training data set, parameters of the model are optimized, and the authoring model is obtained, including:
acquiring a preliminary creation model, and adjusting the weight of an input layer of the preliminary creation model through a training data set to obtain an input layer optimization model;
optimizing hidden layer parameters of the input layer optimization model to obtain the hidden layer optimization model, where the updated parameters are obtained from the original parameters by a gradient step on a first objective function with a fourth learning rate;
freezing the feature extraction layer of the hidden layer optimization model to obtain a frozen feature model;
thawing the feature extraction layers of the frozen feature model that meet preset conditions and fine-tuning their parameters to obtain a fine-tuned feature model, where the new parameters of each thawed feature extraction layer are obtained from its current parameters by a gradient step on a second objective function with a second learning rate;
carrying out fusion treatment on the unfreezing layer and the freezing layer of the fine tuning feature model to obtain a fusion model;
and performing global adjustment by weighting the tasks of the fusion model to obtain the creation model.
Preferably, the training data set is used for adjusting the weight of the input layer of the preliminary creation model to obtain an input layer optimization model, which comprises the following steps:
Acquiring the input layer weight matrix of the preliminary creation model and calculating a loss function over the training data set;
Updating the input layer weights according to the gradient information through a back propagation algorithm to obtain an input layer adjustment weight matrix, where the adjustment subtracts the gradient of the loss function E, scaled by a third learning rate, from the current input layer weight matrix;
Re-evaluating the training data set using the input layer adjustment weight matrix; if the loss function does not reach the preset convergence condition, repeating the input layer weight update step until the convergence condition is met, thereby obtaining the input layer optimization model.
The scheme of the invention at least comprises the following beneficial effects:
Firstly, the invention constructs a complete initial behavior data set by acquiring multidimensional behavior data of a user and combining context information. The traditional method only processes behavior data with a single dimension, ignores situation information of user behaviors, and causes larger deviation of model prediction results. By context association, the invention can more comprehensively capture the behavior motivation and interest change of the user, and improves the accuracy of model prediction.
Secondly, through multi-level preprocessing steps, the method not only can clean noise in the data, but also improves the representativeness of the data through dynamic weighting and normalization processing. Compared with the traditional model which only depends on a simple screening mechanism, the method can give different weights to the behavior data according to factors such as the behavior frequency, duration time, priority and the like, and ensure reasonable distribution of the influence of the data along with the time by combining time attenuation factors. The dynamic adjustment mode avoids the defect of a static weight mechanism, so that the model can still maintain accurate prediction capability in a scene with rapid user behavior change.
In addition, the invention extracts the time sequence characteristics and the situation characteristics of the user behaviors through time sequence analysis and situation association analysis, thereby being capable of accurately capturing the behavior pattern changes of the user. The traditional model is often insufficient in capturing the dynamic interests of the user, and the behavior change trend of the user is difficult to reflect in real time. By processing the time sequence data in a segmentation way, the invention can identify multiple interest states of the user according to the behavior characteristics of the user in different time periods and form a plurality of interest state fragments. This mechanism ensures that the authoring model can adjust the model in time when the interests of the user change, and does not rely on static interest state modeling any more.
The invention further introduces the division of short-term interest groups and long-term interest groups, so that the authoring model can simultaneously process short-term interest fluctuation and long-term attention points of users. The dynamic interest model can be locally updated according to the new behavior of the user so as to respond to the instant requirement of the user, meanwhile, the stability of the long-term interest group is kept, and the accuracy of long-term interest prediction is ensured through global optimization.
Finally, the invention optimizes a plurality of sub-modules of the creation model through incremental training and transfer learning. The incremental training can effectively avoid resource waste of retraining the whole model, and is particularly suitable for large-scale user data scenes. The introduction of transfer learning enables the creation model to be quickly adapted to new tasks or new data, and the generalization capability and expansibility of the creation model are further improved. By optimizing the parameters of the creation model in real time, the method can always keep high consistency with the interests of the user, and individuation and accuracy of the recommendation result are ensured.
Drawings
FIG. 1 is a flow diagram of an authoring model automation training method based on user behavior preference data provided by an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
As shown in fig. 1, an embodiment of the present invention proposes an authoring model automatic training method based on user behavior preference data, the method comprising the steps of:
Acquiring behavior data of a user, and combining context-related information when the data occur to obtain a multidimensional initial behavior data set;
carrying out multi-level preprocessing on the multi-dimensional initial behavior data set to obtain a behavior data set;
Performing time sequence analysis and context correlation analysis on the behavior data set, and extracting time sequence features and context features to obtain time sequence context feature data;
the time sequence situation characteristic data are segmented according to the time sequence, and each segment of data is mapped into multiple interest states of a user in a corresponding time period respectively, so that multiple interest state segments are obtained;
Performing multistage cluster analysis on the interest state fragments, and dividing the interest state fragments into a short-term interest group and a long-term interest group to obtain a dynamic interest group data set of a user;
Constructing a dynamic interest model of the user according to the dynamic interest group data set;
monitoring newly-added behavior data of a user in real time, and carrying out local updating and global optimization on a dynamic interest model, wherein the local updating is used for adjusting a short-term interest group, and the global optimization is used for adjusting a long-term interest group;
generating a training data set according to the dynamic interest model, wherein the training data set comprises short-term interest feature vectors, long-term interest feature vectors and situation weight vectors of the user;
and performing incremental training and transfer learning on a preliminary creation model through the training data set, and optimizing the model parameters to obtain the creation model.
In an embodiment of the invention, the method comprises the steps of acquiring user behavior data and generating a multi-dimensional initial behavior data set by combining context-related information. By performing multi-level preprocessing on the multi-dimensional data set, non-representative low-frequency behavior or noise data can be effectively removed, and a more accurate behavior data set is formed. Next, the dataset extracts time-series features and contextual features of the user behavior through a temporal analysis and a contextual correlation analysis. By the method, the dynamic change of the user behavior and the context environment of the user behavior can be deeply understood, so that the prediction of the user interest by the authoring model is more accurate.
The time sequence situation characteristic data are segmented according to time sequence and mapped into a plurality of interest state segments. This process provides a basis for subsequent interest state modeling by distinguishing user interest changes over different time periods. The plurality of interest state segments are further analyzed by multi-level clustering to form short-term and long-term interest groups. The short-term interest group reflects the user's immediate interest preferences, while the long-term interest group captures the user's long-term points of interest. The user dynamic interest model constructed by the method can update interest preferences of users in real time, and perform global optimization while locally adjusting short-term interest groups so as to ensure the stability of long-term interest groups.
Compared with the prior art, the method can monitor the behavior change of the user in real time and quickly adjust the model so as to improve the recommendation effect of the personalized content of the user. The method can adapt to complex and changeable user behaviors, and avoids the limitation of the traditional static model. For example, when a user suddenly shows a strong interest in a certain class of content in a short period, the recommendation strategy can be timely adjusted through a local update mechanism, meanwhile, the stability of a long-term interest group of the user is reserved, and the long-term prediction effect of the overall creation model is not affected. The dynamic updating mechanism greatly improves the response speed of the authoring model to the user behavior, and is particularly suitable for scenes of large-scale user data, such as electronic commerce and content recommendation systems.
In a preferred embodiment of the present invention, the multi-level preprocessing is performed on the multi-dimensional initial behavior data set to obtain a behavior data set, including:
Screening the click, browse and collection data in the multidimensional initial behavior data set according to a preset behavior frequency or preset duration to obtain a first preprocessed data set, and assigning time decay factors to the data of different behavior types in the first preprocessed data set according to the context information;
Performing multistage noise removal on the first preprocessing data, firstly removing low-variance behavior data through a variance screening method, and then removing abnormal data points through a bilateral filtering algorithm to obtain second preprocessing data;
And weighting each type of behavior data in the second preprocessed data to obtain a weight coefficient for each behavior type, where the weight coefficient of the i-th behavior type is computed from its behavior frequency or duration, its priority coefficient, its regulatory factor, the time interval, its time decay factor, a smoothing parameter and a bias term, normalized over all n behavior types (j indexes the behavior types and e is the base of the natural logarithm);
And carrying out normalization processing on the second preprocessing data according to the weight coefficient to obtain a behavior data set.
In the embodiment of the invention, when the multidimensional initial behavior data set is processed, click, browse and collection data are firstly screened through preset behavior frequency or duration, and non-representative low-frequency behavior data are removed. The method effectively reduces the interference of noise on the training of the creation model, and ensures that the data processed later is more representative. The screened data is also combined with context information to give a time attenuation factor, so that dynamic weighting processing of the user behavior data is realized.
The multistage noise removal step adopts a variance screening and bilateral filtering algorithm, and can effectively remove low-variance behavior data and abnormal data points. The process enhances the accuracy and reliability of the data, and particularly for behavior data with stronger noise interference, the training effect of the created model can be remarkably improved. After noise removal, the behavior data are weighted according to the frequency and the priority of the behavior data to obtain a weight coefficient, and the calculation formula of the weight coefficient is further dynamically adjusted through a time attenuation factor to ensure that the weight coefficient is matched with the actual influence of the behavior data.
Compared with the prior art, the invention can ensure the quality and the representativeness of the data through more complex and fine multilevel preprocessing process. For example, when a user browses a large amount of irrelevant contents, the preprocessing process can identify noise data and remove the noise data in time, so that the interference of invalid data on the authoring model is avoided, and the subsequent authoring model training is more accurate and efficient.
Wherein the data of different behavior types in the first preprocessed data set are given time decay factors based on the context information, with the aim of reflecting the change in importance of the user behavior over time by introducing a time decay mechanism.
The time decay factor is a parameter used to describe how the importance of a behavior changes over time. It is often used to reduce the impact of past behavior, making the model more focused on recent behavior. By assigning time attenuation factors to different types of behaviors, the weight of each behavior can be dynamically adjusted to reflect the influence degree of the behavior at different time points.
The context information includes background information of time, place, device, user status, etc. when the user behavior occurs. Such information may help to better determine the importance of a certain behavior. For example, a user's behavior during a particular period of time (e.g., night viewing content) may have a different decay law than the behavior during the day.
Depending on the context information, the temporal decay factor may be adjusted for a particular scene. For example, the behavior during the high frequency access period may be given a smaller time decay factor to keep its weight high.
The attenuation factor is calculated through an exponential function of the interval between the time point of the behavior and the current time; alternatively, a linear function can be adopted so that the behavior weight decreases linearly over time.
Different types of user behavior (e.g., clicking, browsing, collecting, etc.) affect the model to different extents. For example, the collection behavior may reflect the depth of interest of the user more than the click behavior, and thus may be given a smaller decay factor.
For different users or situations, the attenuation factors of different behavior types can be adjusted in a personalized mode according to the behavior habits of the users, so that the capturing capability of the model on the user preference is improved.
In summary, the time attenuation factor is given to the data of different behavior types according to the context, so that the model can adaptively adjust the weight of the behavior data, thereby more accurately reflecting the interest variation trend of the user.
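As an illustration of this mechanism, the following is a minimal sketch of a behavior-type-dependent, context-adjusted exponential time decay; the decay rates, the `context_boost` parameter and the function name are illustrative assumptions rather than values taken from the patent:

```python
import math
import time

# Illustrative decay rates per behavior type (hypothetical values): collection
# decays slowest because it is assumed to reflect deeper interest than a click.
DECAY_RATE = {"click": 0.20, "browse": 0.10, "collect": 0.03}  # per day

def decayed_weight(behavior_type: str, event_ts: float, now: float,
                   context_boost: float = 1.0) -> float:
    """Exponential time-decay weight for one behavior record.

    context_boost stands in for the context-dependent adjustment described
    above: a high-frequency access period can pass a value > 1 so that its
    behaviors effectively decay more slowly and keep a higher weight.
    """
    age_days = max(0.0, (now - event_ts) / 86400.0)
    lam = DECAY_RATE.get(behavior_type, 0.15) / context_boost
    return math.exp(-lam * age_days)

if __name__ == "__main__":
    now = time.time()
    two_days_ago = now - 2 * 86400
    print(decayed_weight("click", two_days_ago, now))       # decays faster
    print(decayed_weight("collect", two_days_ago, now))     # decays slower
    print(decayed_weight("click", two_days_ago, now, 2.0))  # context-boosted
```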
In a preferred embodiment of the present invention, performing time sequence analysis and context correlation analysis on a behavior data set, extracting time sequence features and context features, and obtaining time sequence context feature data, including:
Slicing the behavior data set according to the time window to obtain a plurality of time window segments;
the time window segment length is computed from the time differences between adjacent behavior data, the amount of behavior data within the window, an adjustment factor of the time window and an adjustment parameter, where v indexes the time differences and y is the number of time differences;
Counting the behavior data in each time window segment, and extracting time sequence characteristic values of click times, browsing duration and collection frequency;
And carrying out association analysis on the context information of each time window, extracting corresponding situation characteristics, and splicing the time sequence characteristic values and the situation characteristics to obtain time sequence situation characteristic vectors.
In the embodiment of the invention, when the time sequence analysis is carried out on the behavior data set, a time window slicing method is adopted, and the user behavior data is dynamically divided through the set time window length. The length of the time window is calculated according to the time difference of the user behaviors, so that the window length can be adaptively adjusted under different scenes to adapt to the diversity of the behaviors. The behavior data in each time window segment is extracted to obtain the characteristics of clicking times, browsing duration, collection frequency and the like, and the associated analysis is carried out by combining the corresponding context to form a time sequence context characteristic vector.
By extracting the time sequence features of the user behavior, the time-varying trend of the behavior can be captured. By combining the context, not only can the behavior of the user at a certain moment be understood, but also the motivation behind the behavior can be analyzed. This makes the prediction of user behavior by the authoring model more intelligent and accurate. For example, users frequently browse certain types of content at specific times and places, and the potential needs of the users can be predicted by combining time and geographic information, so that more targeted recommendation services are provided.
Compared with the prior art, the method adds consideration of the situation information when processing the behavior data, and improves the analysis capability of complex behavior patterns. For complex application scenes such as an e-commerce platform or social media, customized services can be provided according to the behaviors and the situations of the users, and user experience is further improved.
Performing association analysis on the context information of each time window, extracting corresponding situation features, and splicing the time sequence feature values and the situation features to obtain time sequence situation feature vectors, wherein the method comprises the following steps:
In the time series analysis, the context information may include additional information related to the environment, device, location, time, etc. when the user behavior occurs. The correlation analysis is to relate this contextual information to the user's behavioral data to better understand the context of the user's behavior. For example, the user's browsing behavior may have different intentions at different time periods (e.g., work time versus rest time) or at different locations (home, office), and such contextual information may be used to adjust the predictions of the model.
Based on the association analysis, features related to the user behavior are extracted from the context information. For example, the time information may be processed into characteristics of hours, days of the week, etc., the location information may be converted into category characteristics, and the device type may be added as a characteristic to the model. These features reflect the context in which the behavior occurs.
The time sequence features (such as clicking times, browsing time lengths and the like) extracted from the behavior data are combined with the situation features to form a more comprehensive feature vector. This stitching process combines the temporal and contextual features into a temporal context feature vector representing the user's behavior under a particular time window and its background information.
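A minimal sketch of this windowing and splicing step is given below, assuming a fixed window length for simplicity (the patent computes an adaptive window length) and hypothetical behavior and context fields:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Behavior:
    ts: float          # unix timestamp of the event
    kind: str          # "click" | "browse" | "collect"
    duration: float    # seconds spent (0 for clicks and collects)
    hour: int          # context: hour of day when the event occurred
    device: str        # context: device type, e.g. "mobile" or "desktop"

def slice_windows(events: List[Behavior], window_s: float) -> List[List[Behavior]]:
    """Group time-ordered events into consecutive time windows of length window_s."""
    events = sorted(events, key=lambda e: e.ts)
    windows, current, start = [], [], None
    for e in events:
        if start is None or e.ts - start >= window_s:
            if current:
                windows.append(current)
            current, start = [], e.ts
        current.append(e)
    if current:
        windows.append(current)
    return windows

def window_feature_vector(window: List[Behavior]) -> List[float]:
    """Timing features (clicks, browse time, collects) spliced with simple
    contextual features (mean hour of day, share of mobile events)."""
    clicks = sum(e.kind == "click" for e in window)
    browse_time = sum(e.duration for e in window if e.kind == "browse")
    collects = sum(e.kind == "collect" for e in window)
    mean_hour = sum(e.hour for e in window) / len(window)
    mobile_share = sum(e.device == "mobile" for e in window) / len(window)
    return [clicks, browse_time, collects, mean_hour, mobile_share]
```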
In a preferred embodiment of the present invention, the processing of segmenting the time-series contextual characteristic data according to a time sequence, and mapping each segment of data into multiple interest states of the user in a corresponding time period, to obtain multiple interest state segments, includes:
Calculating the similarity between the time sequence situation characteristic vectors of two adjacent time windows, where the similarity is computed as a weighted comparison over the m dimensions of the feature vectors, with each dimension b assigned a feature weight and the feature values of the two adjacent time windows compared on that dimension;
When the similarity is greater than a preset threshold, merging the time sequence situation characteristic vectors of the two adjacent time windows into a merged interest state fragment; otherwise, marking them as different interest state fragments;
And carrying out smoothing treatment on the combined interest state fragments to obtain the interest state fragments.
In the embodiment of the invention, after the time sequence situation characteristic data is acquired, the time sequence situation characteristic data is segmented according to time sequence, and the data is mapped into multiple interest states of the user. Through similarity calculation between adjacent time windows, the interest change trend of the user in the adjacent time periods can be accurately judged, and similar time segments are combined, so that unnecessary data splitting is reduced.
The computation of similarity combines multidimensional feature weights so that the authoring model can comprehensively evaluate the interest states in multiple dimensions. In order to prevent data from excessively splitting, a smoothing processing method is introduced, and the continuity and stability of the interesting state fragments are further ensured. The smoothing process avoids excessive adjustment caused by behavior fluctuation in a short time, and ensures smoother generation of the interest segments.
Compared with the prior art, the method and the device can reduce data redundancy and simultaneously maintain sensitivity to user interest changes. For example, the user may alternately show interests to various contents in a period of time, and correlation of the interests can be identified through similarity calculation, so that the interest state of the user is prevented from being excessively segmented, and consistency of recommended contents is ensured.
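The following sketch illustrates similarity-based merging of adjacent window vectors; since the patent's exact similarity formula is not reproduced here, a weighted cosine similarity and a threshold of 0.8 are assumptions:

```python
import math
from typing import List

def weighted_similarity(v1: List[float], v2: List[float], w: List[float]) -> float:
    """Weighted cosine similarity between two time-window feature vectors."""
    dot = sum(wi * a * b for wi, a, b in zip(w, v1, v2))
    n1 = math.sqrt(sum(wi * a * a for wi, a in zip(w, v1)))
    n2 = math.sqrt(sum(wi * b * b for wi, b in zip(w, v2)))
    return dot / (n1 * n2) if n1 and n2 else 0.0

def merge_segments(vectors: List[List[float]], weights: List[float],
                   threshold: float = 0.8) -> List[List[List[float]]]:
    """Merge adjacent time-window vectors into interest state fragments
    whenever their weighted similarity exceeds the threshold."""
    segments: List[List[List[float]]] = []
    for v in vectors:
        if segments and weighted_similarity(segments[-1][-1], v, weights) > threshold:
            segments[-1].append(v)   # same interest state: extend the fragment
        else:
            segments.append([v])     # interest changed: start a new fragment
    return segments
```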
In a preferred embodiment of the present invention, the multi-level clustering analysis is performed on the interest state segments, and the interest state segments are divided into short-term interest groups and long-term interest groups, so as to obtain a dynamic interest group data set of the user, which includes:
Short-term interest similarity calculation is carried out on each interest state segment to obtain a similarity matrix, where each element of the matrix compares the feature values of the c-th and d-th interest state segments over the f feature dimensions, with each dimension e assigned a feature weight, together with a regularization coefficient and a scaling parameter;
performing density clustering on the similarity matrix to obtain a short-term interest group;
and reclustering the central points of the short-term interest groups, combining the short-term interest groups into long-term interest groups, and obtaining a dynamic interest group data set.
In the embodiment of the invention, the interest state segments of the user are divided into a short-term interest group and a long-term interest group through multistage cluster analysis. The cluster analysis first performs density clustering on the interest states of the user in a short period of time so as to capture the instant demands of the user. Then, the long-term interest group is further formed through reclustering the center points of the short-term interest group, and the long-term interest group is used for describing the long-term interest trend of the user.
The multi-level clustering ensures that the authoring model can capture both short-term behavior changes and long-term interest directions of the user. This hierarchical approach is particularly useful in authoring models. For example, when a user shows an interest in a certain type of commodity in a short period, it can be quickly identified and responded to, while at the same time, tracking of the field of long-term interest of the user is not lost due to short-term variations.
Through hierarchical clustering, short-term and long-term interest capturing is more comprehensive, dependence on a single interest group in the prior art is avoided, and accuracy and durability of recommendation are improved.
The density clustering of the similarity matrix is carried out to obtain a short-term interest group, which comprises the following steps:
The similarity matrix represents the similarity values between interest state segments, with each matrix element corresponding to the similarity between two segments. For example, one element of the matrix may express the similarity of the c-th and d-th segments.
Density clustering is a clustering method based on the density of data points for finding areas of high density in a similarity matrix (i.e., clusters where segments of interest are more similar). For example, common density clustering algorithms are DBSCAN and OPTICS. Similar fragments are clustered together by looking for regions of higher density.
After the clustering is completed, the fragments with higher similarity are distributed into the same short-term interest group, which means that the interest states of the users have higher similarity in a shorter time range. The division of short-term interest groups helps to further analyze the short-term interest trend of the user.
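A possible realization of this two-level clustering uses scikit-learn's DBSCAN on a precomputed distance matrix for the short-term groups and, as one plausible choice since the patent does not name the second algorithm, KMeans to re-cluster the group centroids into long-term groups; `eps`, `min_samples` and `n_long_term` are illustrative:

```python
import numpy as np
from sklearn.cluster import DBSCAN, KMeans

def interest_groups(features: np.ndarray, sim: np.ndarray,
                    eps: float = 0.3, min_samples: int = 3,
                    n_long_term: int = 4):
    """Two-level clustering of interest state segments.

    features: per-segment feature matrix (one row per segment, assumed here).
    sim:      precomputed segment-by-segment similarity matrix.
    """
    # DBSCAN with a precomputed metric expects distances, not similarities.
    dist = 1.0 - sim
    short_labels = DBSCAN(eps=eps, min_samples=min_samples,
                          metric="precomputed").fit_predict(dist)

    # Centroid of each short-term group in the original feature space.
    groups = sorted(set(short_labels) - {-1})          # -1 marks noise points
    centroids = np.array([features[short_labels == g].mean(axis=0) for g in groups])

    # Re-cluster the centroids to form long-term interest groups.
    k = min(n_long_term, len(centroids)) if len(centroids) else 0
    long_labels = KMeans(n_clusters=k, n_init=10).fit_predict(centroids) if k else np.array([])
    return short_labels, centroids, long_labels
```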
In a preferred embodiment of the present invention, constructing a dynamic interest model of a user from a dynamic interest group dataset includes:
extracting features of the dynamic interest group data set to obtain a short-term interest feature data set and a long-term interest feature data set;
the short-term interest feature dataset is obtained from the input feature matrix through a multi-head self-attention computation: the short-term interest feature vector is produced from the input feature matrix using a weight matrix of the input features, an offset vector and an adjusting coefficient, where s relates to the input feature matrix, q is the index of the attention head and h is the number of attention heads; each head computes attention from a query matrix, a key matrix and a value matrix scaled by the dimension of the key, with a softmax function converting the dot-product result into a probability distribution and a ReLU or Swish activation function applied to the output;
The long-term interest feature dataset is obtained by concatenating the outputs of a multi-head self-attention mechanism, where r is the number of attention heads, and projecting the concatenation with an output transformation matrix and a weight parameter to produce the long-term interest feature vector;
and carrying out weighted fusion on the short-term interest characteristic data set and the long-term interest characteristic data set to obtain the dynamic interest model.
In the embodiment of the invention, the dynamic interest model of the user is constructed through extracting the characteristics of the dynamic interest group data set. Short-term interest feature vectors are extracted through calculation of a weight matrix, and long-term interest features are generated through splicing processing of a multi-head self-attention mechanism. And the two interest feature vectors are subjected to weighted fusion to generate a final dynamic interest model.
The model enables authoring models to respond to the user's most recent needs in a short period of time by balancing short-term and long-term interests, while maintaining a long-term grasp of the user's overall interests. Particularly, due to the introduction of a multi-head self-attention mechanism, the extraction of the long-term interest features is more flexible, and the method can adapt to the complex long-term behavior mode of a user. For example, in a news recommendation system, a user may show short-term interests for a certain hot event while maintaining long-term interest in the science and technology field, and accurate content recommendation can be achieved through balanced processing of short-term and long-term interests.
The query, key and value matrices are obtained from the input feature matrix by linear transformation with their respective weight matrices, whose dimensions depend on the dimension of the input features and on the target dimensions of the query, key and value. For example, if the input feature matrix has a given feature dimension and the query, key and value share a target dimension, each weight matrix maps from the input feature dimension to that target dimension.
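The sketch below shows a single attention head built this way: the query, key and value matrices are linear transforms of the input feature matrix, and the attention weights come from a scaled softmax over their dot products; all dimensions and weight values are hypothetical:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(X, Wq, Wk, Wv):
    """One attention head: Q, K, V are linear transforms of the input
    feature matrix X; weights come from softmax(Q K^T / sqrt(d_k))."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = K.shape[-1]
    attn = softmax(Q @ K.T / np.sqrt(d_k))
    return attn @ V

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    s, d_in, d_k = 6, 16, 8            # 6 behavior rows, hypothetical dims
    X = rng.normal(size=(s, d_in))     # input feature matrix
    Wq, Wk, Wv = (rng.normal(size=(d_in, d_k)) for _ in range(3))
    out = scaled_dot_product_attention(X, Wq, Wk, Wv)
    print(out.shape)                   # (6, 8)
```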
The method for obtaining the dynamic interest model comprises the following steps of:
The short-term interest feature and the long-term interest feature may exhibit different effects on different time scales. By giving different weights to the short-term and long-term interest features, their degree of contribution to the final dynamic interest model can be flexibly adjusted. For example, the weight of the short-term interest feature may be increased when the user's recent behavior changes significantly, and the weight of the long-term interest feature may be increased when the user's long-term interest is stable.
The weighted fusion can be expressed as F = α·F_short + (1 − α)·F_long, where:
F_short is the short-term interest feature vector.
F_long is the long-term interest feature vector.
α is a weighting factor ranging from 0 to 1 that controls the importance ratio of the short-term and long-term features.
When α is larger, the short-term interest feature is more influential, and the dynamic interest model is more likely to reflect the user's current interests.
When α is smaller, the long-term interest feature is more influential, and the dynamic interest model tends to describe the user's long-term preferences.
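A minimal sketch of this weighted fusion, with the weighting factor alpha playing the role of α above:

```python
import numpy as np

def fuse_interest_features(f_short: np.ndarray, f_long: np.ndarray,
                           alpha: float = 0.5) -> np.ndarray:
    """Convex combination of short- and long-term interest feature vectors.
    alpha close to 1 emphasises recent behavior; alpha close to 0 emphasises
    stable long-term preferences."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * f_short + (1.0 - alpha) * f_long
```

In practice alpha could be raised temporarily when recent behavior diverges strongly from the historical profile, which is one simple way to realize the adaptive weighting described above.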
According to the invention, through the construction of the dynamic interest model, the adaptability and the accuracy of the created model are obviously improved, and compared with the prior art, the dynamic interest change of a user can be better processed.
In a preferred embodiment of the present invention, the method for monitoring the newly added behavior data of the user in real time, and performing local updating and global optimization on the dynamic interest model includes:
The newly added behavior data of the user is monitored in real time, and the difference between the newly added behavior data and the historical behavior data is calculated to obtain a differential vector;
The short-term interest group is locally updated and the long-term interest group is globally optimized, where both updates are driven by the differential vector scaled by a first learning rate, with an attenuation coefficient applied over the time interval t.
In the embodiment of the invention, when the newly-added behavior data of the user is monitored in real time, the dynamic interest model can be adjusted according to the newly-added behavior of the user. Specifically, the short-term interest group is locally updated by computing a differential vector of the user behavior data, while the long-term interest group is adjusted by global optimization. The combination of local updating and global optimization enables the authoring model to rapidly respond to the instant behavior change of the user and to maintain the prediction of long-term interests of the user.
This hierarchical updating approach enables reduced resource consumption for retraining authoring models in the face of large-scale user behavior data. Compared with the existing global updating method, the method can dynamically adjust the short-term interest group on the premise of not influencing long-term prediction. For example, when a user frequently browses a certain type of commodity in a short period, the recommendation strategy can be updated immediately, and the long-term interest group is kept unchanged, so that the accuracy of long-term recommendation is ensured.
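One plausible reading of this update scheme (the patent's exact update formulas are not reproduced here) is sketched below: the short-term profile follows the differential vector directly, while the long-term profile absorbs the same signal through a time-decayed learning rate; all rates are illustrative:

```python
import numpy as np

def local_update(short_profile: np.ndarray, diff_vec: np.ndarray,
                 lr: float = 0.3) -> np.ndarray:
    """Local update: move the short-term interest profile toward the
    differential vector computed from newly observed behavior."""
    return short_profile + lr * diff_vec

def global_optimize(long_profile: np.ndarray, diff_vec: np.ndarray,
                    dt_days: float, lr: float = 0.05,
                    decay: float = 0.1) -> np.ndarray:
    """Global optimization: the long-term profile absorbs the same signal,
    damped by an exponential decay over the elapsed time interval so that
    short-lived spikes barely move it."""
    return long_profile + lr * np.exp(-decay * dt_days) * diff_vec
```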
In a preferred embodiment of the present invention, generating a training data set from a dynamic interest model includes:
fusing the feature vectors in the short-term interest feature data set and the long-term interest feature data set to obtain a multidimensional feature data set;
Performing dimension reduction on the multidimensional feature data set to obtain the feature data set, where the output feature matrix is computed from the input feature matrix using an identity matrix, a regularization parameter and a graph Laplacian matrix;
And labeling the characteristic data set to generate a user behavior label, thereby obtaining a training data set.
In the embodiment of the invention, the latest training data set is generated, the short-term interest feature vector and the long-term interest feature vector are combined, the multidimensional feature data set is generated, and the data is subjected to dimension reduction processing. Through the dimension reduction processing step, the data dimension can be reduced, the data redundancy is reduced, and the training efficiency is improved.
Compared with the prior art, the dimension reduction processing ensures that the created model can keep key information in the training process, and meanwhile reduces the influence of redundant data on the model training time. For example, in a scenario where the user behavior data is large in scale, the dimension reduction process can accelerate model training and ensure accuracy and stability of training results.
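The variables named above (identity matrix, regularization parameter, graph Laplacian, input and output feature matrices) are consistent with a Laplacian-regularized transform of the form Y = (I + λL)⁻¹X; the sketch below implements that reading as an assumption, with the adjacency matrix and λ supplied by the caller:

```python
import numpy as np

def laplacian_smooth(X: np.ndarray, adjacency: np.ndarray, lam: float = 0.5) -> np.ndarray:
    """Graph-Laplacian-regularized transform Y = (I + lam * L)^{-1} X.

    adjacency is a symmetric sample-by-sample affinity matrix; L is the
    unnormalized graph Laplacian built from it. A subsequent projection onto
    leading components would yield an actual dimensionality reduction, but
    that extra step is not shown here.
    """
    deg = np.diag(adjacency.sum(axis=1))
    L = deg - adjacency                      # unnormalized graph Laplacian
    n = X.shape[0]
    return np.linalg.solve(np.eye(n) + lam * L, X)
```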
The method for labeling the feature data set to generate a user behavior label and obtain a training data set comprises the following steps:
And allocating labels for the characteristic data sets according to the historical behaviors or business rules of the users. These labels are typically used for model training for supervised learning. For example, if the user's interest level in a certain content is to be predicted, the actual click behavior of the user may be regarded as a label.
The user behavior tags may be in various forms, such as a two-class tag (of interest or not), a multi-class tag (of different interest classes), or a continuous value (user score or frequency of behavior). The labeling process can be automatically generated through an algorithm, and can also be enhanced by combining manually labeled data.
By combining the feature data set with the corresponding user behavior label, a complete training data set is formed. This training dataset is used to train the model so that it can predict or classify future user behavior.
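A minimal sketch of such a labeling rule, assuming a simple binary "interested / not interested" label derived from click counts; the threshold and field names are illustrative:

```python
from typing import List, Tuple

def label_examples(feature_rows: List[List[float]],
                   click_counts: List[int],
                   click_threshold: int = 1) -> List[Tuple[List[float], int]]:
    """Attach a binary label to each feature row based on observed clicks.
    Multi-class labels or continuous scores would be assigned the same way,
    just with a different rule on the right-hand side."""
    return [(x, int(c >= click_threshold))
            for x, c in zip(feature_rows, click_counts)]
```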
In a preferred embodiment of the present invention, incremental training and transfer learning are performed on an authoring model through a training data set, parameters of the model are optimized, and an authoring model is obtained, including:
acquiring a preliminary creation model, and adjusting the weight of an input layer of the preliminary creation model through a training data set to obtain an input layer optimization model;
optimizing hidden layer parameters of the input layer optimization model to obtain the hidden layer optimization model, where the updated parameters are obtained from the original parameters by a gradient step on a first objective function with a fourth learning rate;
freezing the feature extraction layer of the hidden layer optimization model to obtain a frozen feature model;
thawing the feature extraction layers of the frozen feature model that meet preset conditions and fine-tuning their parameters to obtain a fine-tuned feature model, where the new parameters of each thawed feature extraction layer are obtained from its current parameters by a gradient step on a second objective function with a second learning rate;
carrying out fusion treatment on the unfreezing layer and the freezing layer of the fine tuning feature model to obtain a fusion model;
and performing global adjustment by weighting the tasks of the fusion model to obtain the creation model.
In the embodiment of the invention, a plurality of sub-modules of the preliminary authoring model are optimized through incremental training and transfer learning. First, by adjusting the input layer weights and hidden layer parameters, the model can be optimized step by step. The partial freezing and thawing of the feature extraction layer further ensures that global adaptation of the model can be done efficiently with limited computational resources.
Compared with the traditional global retraining method, the method can adapt to new data and tasks while retaining the original creation model capability through incremental training. The application of the transfer learning further improves the adaptability of the authoring model, so that the authoring model can rapidly cope with new content generation tasks.
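A hedged PyTorch sketch of the freeze–thaw incremental training flow described above; the layer sizes, learning rates, and loop lengths are illustrative assumptions, not values taken from the disclosure.

```python
import torch
import torch.nn as nn

# Hypothetical preliminary authoring model: input layer -> feature extractor -> task head
model = nn.Sequential(
    nn.Linear(64, 128),   # input layer
    nn.ReLU(),
    nn.Linear(128, 128),  # feature extraction layer (hidden)
    nn.ReLU(),
    nn.Linear(128, 10),   # task head
)

def set_trainable(layer: nn.Module, trainable: bool) -> None:
    for p in layer.parameters():
        p.requires_grad = trainable

loss_fn = nn.CrossEntropyLoss()
x, y = torch.randn(32, 64), torch.randint(0, 10, (32,))

# 1) freeze the feature extraction layer; incrementally train input layer and head on new data
set_trainable(model[2], False)
opt = torch.optim.SGD((p for p in model.parameters() if p.requires_grad), lr=1e-3)
for _ in range(5):
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()

# 2) thaw the feature extraction layer and fine-tune it with a smaller (second) learning rate
set_trainable(model[2], True)
fine_opt = torch.optim.SGD(model[2].parameters(), lr=1e-4)
for _ in range(5):
    fine_opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    fine_opt.step()
```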
The first objective function L₁ and the second objective function L₂ depend on the type of task and the actual application requirements of the model.
The first objective function L₁ measures the error between the model output and the target value when optimizing the input layer weights and hidden layer parameters. It is typically used to adjust the hidden layer parameters so that the middle-layer behavior of the model becomes more predictable.
If the model task is a regression problem, the first objective function may use the mean square error to calculate the error between the predicted value and the true value.
If the model task is a classification task, cross-entropy loss may be used as the objective function to calculate the difference between the predicted probability distribution and the true labels.
The second objective function L₂ is used to optimize parameters when fine-tuning the thawed layers, so as to improve the model's adaptation to specific features. When fine-tuning the feature extraction layer, the second objective function measures the error between the output of the thawed layers and the desired output.
The second objective function L₂ may employ a matching loss in feature space, such as a Laplacian loss, to measure the difference between the distribution of the fine-tuned layer's output features and the target distribution.
During fine-tuning, regularization terms, such as L2 regularization, may also be added to the second objective function to avoid overfitting.
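The following snippet shows, under stated assumptions, how the two objective functions might be instantiated: mean squared error or cross-entropy for the first objective, and a feature-matching term plus L2 regularization standing in for the second; all tensors are toy placeholders and the exact losses used in the disclosure are not specified.

```python
import torch
import torch.nn as nn

# First objective (regression case): mean squared error between prediction and target
pred_reg = torch.tensor([2.3, 0.7])
target   = torch.tensor([2.0, 1.0])
first_objective_mse = nn.MSELoss()(pred_reg, target)

# First objective (classification case): cross-entropy between logits and true label
logits = torch.tensor([[1.2, -0.3, 0.1]])
label  = torch.tensor([0])
first_objective_ce = nn.CrossEntropyLoss()(logits, label)

# Second objective during fine-tuning: feature-matching loss on the thawed layer's
# output plus an L2 regularization term to avoid overfitting
thaw_features   = torch.randn(8, 16, requires_grad=True)   # output of the thawed layer
target_features = torch.randn(8, 16)                        # desired feature distribution
l2_weight = 1e-4
second_objective = nn.MSELoss()(thaw_features, target_features) \
    + l2_weight * thaw_features.pow(2).sum()
```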
The preset condition used when thawing the feature extraction layer of the frozen feature model may be determined according to a specific policy or condition. For example:
Layer-by-layer thawing strategy:
The feature extraction layers are progressively thawed from high to low (or low to high) in the hierarchy. For example, higher-level feature extraction layers near the output layer may be thawed first, as these layers tend to learn more task-specific features and offer greater flexibility; the lower feature extraction layers closer to the input layer are then progressively thawed.
The thawing order may be based on the position of the layer or its distance from the output layer.
Gradient change strategy:
The layers to be thawed are determined by monitoring the gradient change of each layer. If the gradients of certain layers are large, indicating that these layers are poorly adapted to the target task, those layers may be thawed preferentially for fine-tuning optimization.
This strategy is applicable in situations where the thawed layers need to be determined dynamically (a code sketch illustrating it follows this list).
Specific threshold strategy:
the range of the thawing layer is determined according to the performance index of the model (such as the value of the loss function) and a specific threshold. If the optimizing effect of the model's loss function on a particular level reaches a set threshold, then that level or levels are thawed.
The threshold can be dynamically adjusted according to the optimized progress to improve training efficiency.
Importance measurement strategy:
The thawing order is determined by calculating the impact of each layer on the overall model, such as weight importance or feature contribution. Layers with a larger influence on the model output are thawed preferentially, so as to accelerate the model's adaptation to the new task.
Such policies may measure the importance of the layers through regularization techniques or feature-based methods.
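As a rough sketch of the gradient change strategy only: the snippet below probes one batch, scores each layer by its mean gradient magnitude, and keeps trainable only the layers above a threshold; the model, the threshold value, and the batch are hypothetical.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(),
                      nn.Linear(64, 64), nn.ReLU(),
                      nn.Linear(64, 5))
loss_fn = nn.CrossEntropyLoss()
x, y = torch.randn(16, 32), torch.randint(0, 5, (16,))

# probe pass: measure each layer's mean gradient magnitude on the new task
loss = loss_fn(model(x), y)
loss.backward()

grad_norm = {}
for name, module in model.named_children():
    grads = [p.grad.abs().mean().item() for p in module.parameters() if p.grad is not None]
    if grads:
        grad_norm[name] = sum(grads) / len(grads)

# thaw (keep trainable) only layers whose gradient magnitude exceeds the threshold;
# freeze the rest -- a simple instance of the gradient change strategy
threshold = 0.01
for name, module in model.named_children():
    trainable = grad_norm.get(name, 0.0) > threshold
    for p in module.parameters():
        p.requires_grad = trainable
```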
In a preferred embodiment of the present invention, the training data set is used to adjust the weight of the input layer of the preliminary creation model, so as to obtain an input layer optimization model, which includes:
Acquiring the input layer weight matrix W of the preliminary creation model, and calculating the loss function E on the training data set;
Updating the weight of the input layer according to the gradient information through a back propagation algorithm to obtain an input layer adjustment weight matrix, wherein the calculation formula of the input layer adjustment weight matrix is as follows:
W′ = W − η₃·∇_W E,
Wherein W′ is the input layer adjustment weight matrix, η₃ is the third learning rate, and E is the loss function;
Re-evaluating the training data set using the input layer adjustment weight matrix W′; if the loss function does not reach the preset convergence condition, the step of updating the input layer weights is repeated until the convergence condition is met, so as to obtain the input layer optimization model.
In the embodiment of the invention, the input layer weights of the preliminary authoring model are optimized through a back propagation algorithm. The preliminary input layer weight matrix is acquired, the input layer weights are updated step by step according to the gradient information of the loss function, and an optimized input layer weight matrix is finally generated.
The optimized input layer can effectively improve the feature extraction capability of the model, so that the subsequent model creation training is more efficient. Compared with the existing input layer optimization method, the method can reduce the influence of invalid data on the creation model and improve the overall prediction capability by accurately adjusting the weight of the input layer when processing large-scale data.
The loss function E is chosen according to the task type and the actual application requirements of the model to measure the effect of the input layer adjustment and improve the model's overall performance; it may adopt mean squared error or cross-entropy loss, and a combined loss function may also be used to optimize multiple tasks simultaneously.
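A minimal NumPy sketch of the input-layer adjustment loop described above, assuming a mean-squared-error loss E and a linear input layer; the name optimize_input_layer, the learning rate, and the convergence tolerance are illustrative assumptions.

```python
import numpy as np

def optimize_input_layer(W, X, y, lr=0.05, tol=1e-4, max_iter=500):
    """Iteratively adjust the input-layer weight matrix W by gradient descent
    on a mean-squared-error loss E, stopping once the loss change falls below tol.

    W : (n_features, 1) input-layer weight matrix
    X : (n_samples, n_features) training features
    y : (n_samples, 1) training targets
    """
    prev_loss = np.inf
    for _ in range(max_iter):
        pred = X @ W                        # forward pass through the input layer
        err = pred - y
        loss = float(np.mean(err ** 2))     # loss function E
        if abs(prev_loss - loss) < tol:     # preset convergence condition
            break
        grad = 2 * X.T @ err / len(X)       # dE/dW obtained by backpropagation
        W = W - lr * grad                   # W' = W - eta * dE/dW
        prev_loss = loss
    return W

X = np.random.rand(100, 8)
true_W = np.random.rand(8, 1)
y = X @ true_W
W_opt = optimize_input_layer(np.zeros((8, 1)), X, y)
```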
While the foregoing is directed to the preferred embodiments of the present invention, it will be appreciated by those skilled in the art that various modifications and adaptations can be made without departing from the principles of the present invention, and such modifications and adaptations are intended to be comprehended within the scope of the present invention.

Claims (9)

1. An automated training method for an authoring model based on user behavior preference data, the method comprising:
Acquiring behavior data of a user, and combining context information when the data occur to obtain a multidimensional initial behavior data set;
carrying out multi-level preprocessing on the multi-dimensional initial behavior data set to obtain a behavior data set;
Performing time sequence analysis and context correlation analysis on the behavior data set, and extracting time sequence features and context features to obtain time sequence context feature data;
the time sequence situation characteristic data are segmented according to the time sequence, and each segment of data is mapped into multiple interest states of a user in a corresponding time period respectively, so that multiple interest state segments are obtained;
Performing multistage cluster analysis on the interest state fragments, and dividing the interest state fragments into a short-term interest group and a long-term interest group to obtain a dynamic interest group data set of a user;
Constructing a dynamic interest model of the user according to the dynamic interest group data set;
monitoring newly-added behavior data of a user in real time, and carrying out local updating and global optimization on a dynamic interest model, wherein the local updating is used for adjusting a short-term interest group, and the global optimization is used for adjusting a long-term interest group;
generating a training data set according to the dynamic interest model, wherein the training data set comprises short-term interest feature vectors, long-term interest feature vectors and situation weight vectors of the user;
Performing incremental training and transfer learning on the creation model through the training data set, and optimizing parameters of the model to obtain the creation model;
the method for monitoring the newly added behavior data of the user in real time carries out local updating and global optimization on the dynamic interest model, and comprises the following steps:
monitoring newly added behavior data of the user in real time, and calculating the difference between the newly added behavior data and the historical behavior data to obtain a differential vector Δ;
The method comprises the steps of carrying out local updating on a short-term interest group and carrying out global optimization on a long-term interest group, wherein the calculation formula of the local updating is as follows:
S_new = S_old + η₁·Δ·e^(−β·t),
Wherein S_new is the updated short-term interest feature vector, S_old is the short-term interest feature vector before updating, η₁ is the first learning rate, Δ is the differential vector, t is the time interval, and β is the attenuation coefficient.
2. The automated training method of authoring models based on user behavior preference data of claim 1, wherein performing multi-level preprocessing on a multi-dimensional initial behavior data set to obtain the behavior data set comprises:
Screening click, browse and collection data in the multidimensional initial behavior data set according to a preset behavior frequency or preset duration to obtain a first preprocessed data set, and assigning a time attenuation factor λ_i to data of different behavior types in the first preprocessed data set according to the context information;
Performing multistage noise removal on the first preprocessed data to obtain second preprocessed data;
And weighting each type of behavior data in the second preprocessed data to obtain a weight coefficient, wherein the calculation formula of the weight coefficient is as follows:
w_i = (f_i·p_i·α_i·e^(−λ_i·Δt)) / (Σ_{j=1}^{n} f_j·p_j·α_j·e^(−λ_j·Δt) + ε) + b_i,
Wherein w_i is the weight coefficient of the i-th behavior type, f_i is the behavior frequency or duration of the i-th behavior type, p_i is the priority coefficient of the i-th behavior type, α_i is the adjustment factor of the i-th behavior type, Δt is the time interval, j is an index of behavior types, n is the total number of behavior types, λ_i is the time attenuation factor of the i-th behavior type, ε is a smoothing parameter, b_i is the bias term of the i-th behavior type, and e is the base of the natural logarithm;
And carrying out normalization processing on the second preprocessing data according to the weight coefficient to obtain a behavior data set.
3. The automated training method of authoring models based on user behavior preference data of claim 2, wherein performing time series analysis and context correlation analysis on the behavior data set, extracting time series features and context features, and obtaining time series context feature data, comprises:
Slicing the behavior data set according to the time window to obtain a plurality of time window segments;
the calculation formula of the time window segment length is as follows:
,
Wherein the parameters of the formula denote, respectively: the time difference between adjacent behavior data; the amount of behavior data within the window; the adjustment factor of the time window; the index of the behavior data; an adjustment parameter; v is the index of the time difference; and y is the number of time differences;
Counting the behavior data in each time window segment, and extracting time sequence characteristic values of click times, browsing duration and collection frequency;
And carrying out association analysis on the context information of each time window, extracting corresponding situation characteristics, and splicing the time sequence characteristic values and the situation characteristics to obtain time sequence situation characteristic vectors.
4. The automated training method of authoring model based on user behavior preference data according to claim 3, wherein the step of processing the time sequence contextual characteristic data in a segmented manner according to time sequence, and mapping each segment of data into multiple interest states of the user in a corresponding time period to obtain multiple interest state segments, comprises:
Calculating the similarity Sim between the time sequence context feature vectors V_p and V_{p+1} of adjacent time windows, wherein the calculation formula of the similarity Sim is as follows:
Sim = Σ_{b=1}^{m} w_b·V_{p,b}·V_{p+1,b} / ( √(Σ_{b=1}^{m} w_b·V_{p,b}²) · √(Σ_{b=1}^{m} w_b·V_{p+1,b}²) ),
Wherein w_b is the weight of the b-th dimension feature, m is the dimension of the feature vector, b is the dimension index of the feature vector, V_{p,b} is the feature value of the p-th time window on the b-th dimension feature, and V_{p+1,b} is the feature value of the (p+1)-th time window on the b-th dimension feature;
when the similarity Sim is greater than the preset threshold, the time sequence context feature vectors V_p and V_{p+1} are merged into a merged interest state segment; otherwise, they are marked as different interest state segments;
And carrying out smoothing treatment on the combined interest state fragments to obtain the interest state fragments.
5. The automated training method of authoring models based on user behavior preference data of claim 4 wherein the multi-level clustering of interest state segments into short-term interest groups and long-term interest groups to obtain a user's dynamic interest group dataset comprises:
Short-term interest similarity calculation is carried out on each interest state segment to obtain a similarity matrix M, wherein the calculation formula of the similarity matrix M is as follows:
M_{c,d} = exp( −Σ_{e=1}^{f} w_e·(x_{c,e} − x_{d,e})² / σ² ) + μ,
Wherein x_{c,e} and x_{d,e} are the feature values of the c-th interest state segment and the d-th interest state segment in the e-th dimension, respectively, f is the feature dimension, w_e is the weight of the feature of dimension e, μ is the regularization coefficient, and σ is a scaling parameter;
performing density clustering on the similarity matrix to obtain a short-term interest group;
and reclustering the central points of the short-term interest groups, combining the short-term interest groups into long-term interest groups, and obtaining a dynamic interest group data set.
6. The automated training method of authoring models based on user behavior preference data of claim 5 wherein constructing a dynamic interest model for a user from a dynamic interest group dataset comprises:
extracting features of the dynamic interest group data set to obtain a short-term interest feature data set and a long-term interest feature data set;
the short-term interest feature dataset is obtained by the following formula:
,
Wherein the parameters of the formula denote, respectively: the short-term interest feature vector; the weight matrix of the input feature; the input feature matrix; the offset vector; the adjustment coefficient; s is the index over the input feature matrix; q is the index of the attention head; h is the number of attention heads; each attention head is computed as softmax(QKᵀ/√d_k)·V, wherein Q is the query matrix, K is the key matrix, V is the value matrix, d_k is the dimension of the key, the softmax function converts the dot-product result into a probability distribution, and the activation function is a ReLU or Swish function;
The long-term interest feature dataset is obtained by the following formula:
,
Wherein the parameters of the formula denote, respectively: the long-term interest feature vector; an index over the attention heads; r is the number of attention heads; the outputs of the multi-head self-attention mechanism are concatenated; the output transformation matrix; and a weight parameter;
and carrying out weighted fusion on the short-term interest characteristic data set and the long-term interest characteristic data set to obtain the dynamic interest model.
7. The automated training method of authoring models based on user behavior preference data of claim 6 wherein generating a training data set based on a dynamic interest model comprises:
fusing the feature vectors in the short-term interest feature data set and the long-term interest feature data set to obtain a multidimensional feature data set;
The method comprises the steps of performing dimension reduction on a multidimensional feature data set to obtain the feature data set, wherein the dimension reduction calculation formula of the feature data set is as follows:
Y = (I + λL)⁻¹X,
Wherein Y is the output feature matrix, I is an identity matrix, λ is the regularization parameter, L is the graph Laplacian matrix, and X is the input feature matrix;
And labeling the characteristic data set to generate a user behavior label, thereby obtaining a training data set.
8. The automated training method of the authoring model based on the user behavior preference data of claim 7, wherein the incremental training and the transfer learning are performed on the authoring model through the training data set, and parameters of the model are optimized to obtain the authoring model, and the method comprises the following steps:
acquiring a preliminary creation model, and adjusting the weight of an input layer of the preliminary creation model through a training data set to obtain an input layer optimization model;
optimizing hidden layer parameters of the input layer optimization model to obtain the hidden layer optimization model, wherein a calculation formula of hidden layer parameter optimization is as follows:
θ_new = θ_old − η₄·∇L₁(θ_old),
Wherein θ_new is the updated parameter, θ_old is the original parameter, η₄ is the fourth learning rate, and L₁ is the first objective function;
freezing the feature extraction layer of the hidden layer optimization model to obtain a frozen feature model;
thawing a feature extraction layer of the frozen feature model that meets a preset condition, and fine-tuning the parameters of the feature extraction layer to obtain a fine-tuned feature model, wherein the calculation formula for fine-tuning the feature extraction layer parameters is as follows:
θ′ = θ − η₂·∇L₂(θ),
Wherein θ′ is the new parameter of the thawed feature extraction layer, θ is the parameter of the thawed feature extraction layer before fine-tuning, η₂ is the second learning rate, and L₂ is the second objective function;
carrying out fusion treatment on the unfreezing layer and the freezing layer of the fine tuning feature model to obtain a fusion model;
and performing global adjustment by weighting the tasks of the fusion model to obtain the creation model.
9. The automated training method of authoring models based on user behavior preference data of claim 8 wherein adjusting weights of input layers of a preliminary authoring model by training data sets to obtain an input layer optimization model comprises:
Acquiring the input layer weight matrix W of the preliminary authoring model, and calculating the loss function E on the training data set;
Updating the weight of the input layer according to the gradient information through a back propagation algorithm to obtain an input layer adjustment weight matrix, wherein the calculation formula of the input layer adjustment weight matrix is as follows:
W′ = W − η₃·∇_W E,
Wherein W′ is the input layer adjustment weight matrix, η₃ is the third learning rate, and E is the loss function;
Re-evaluating the training data set using the input layer adjustment weight matrix W′, and if the loss function does not reach the preset convergence condition, repeating the step of updating the input layer weights until the convergence condition is met, so as to obtain the input layer optimization model.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202411450328.3A CN118965037B (en) 2024-10-17 2024-10-17 Automatic training method for authoring model based on user behavior preference data

Publications (2)

Publication Number Publication Date
CN118965037A CN118965037A (en) 2024-11-15
CN118965037B true CN118965037B (en) 2025-01-07

Family

ID=93388239


Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119494079B (en) * 2025-01-17 2025-06-13 广州技客信息科技有限公司 A data decision method and system based on multimodal large model analysis
CN119862327B (en) * 2025-03-24 2025-06-20 北京同方凌讯科技有限公司 Personalized content recommendation and behavior analysis method and system for fused media users
CN120179913B (en) * 2025-05-19 2025-09-12 杭州耀尚智汇网络科技有限公司 A personalized content generation method for a content sharing platform

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117272370A (en) * 2023-09-14 2023-12-22 北京交通大学 Next point of interest: Recommended methods, systems, electronic devices and media for privacy protection
CN118279671A (en) * 2024-05-08 2024-07-02 北京弘象科技有限公司 Satellite inversion cloud classification method, device, electronic equipment and computer storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7813835B2 (en) * 2002-03-15 2010-10-12 Sony Corporation Robot behavior control system, behavior control method, and robot device
US12236451B2 (en) * 2018-12-11 2025-02-25 Hiwave Technologies Inc. Method and system of engaging a transitory sentiment community
WO2021021714A1 (en) * 2019-07-29 2021-02-04 The Regents Of The University Of California Method of contextual speech decoding from the brain
CN115408603A (en) * 2022-07-27 2022-11-29 闽江学院 Online question-answer community expert recommendation method based on multi-head self-attention mechanism




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant