Disclosure of Invention
Embodiments of the present application mainly aim to provide an information prediction method and apparatus, an electronic device, and a storage medium based on a multi-task model, which increase the flexibility of information prediction and improve the relevance between users and items.
To achieve the above object, a first aspect of the embodiments of the present application provides an information prediction method based on a multi-task model, where the multi-task model includes a sharing layer, and the method includes:
acquiring first historical behavior data of users in a first group and second historical behavior data of users in a second group, wherein the types of the users in the first group and the second group are different;
inputting the first historical behavior data and the second historical behavior data into the multi-task model for weight calculation to obtain a first attention vector and a second attention vector;
inputting the first attention vector, the second attention vector, the first historical behavior data and the second historical behavior data into the sharing layer for vector sharing, and outputting a shared vector and feature sharing information;
performing differential learning on the shared vector and the feature sharing information based on a preset gate mechanism to obtain an embedded vector;
performing feature learning on the multi-task model according to the embedded vector, the first historical behavior data and the second historical behavior data to obtain a trained multi-task model;
acquiring first click information of users in the first group and second click information of users in the second group;
and inputting the first click information and the second click information into the trained multi-task model to perform click prediction, so as to obtain first prediction information corresponding to the first click information and second prediction information corresponding to the second click information.
In some embodiments, the multi-task model includes an attention structure, and the inputting the first historical behavior data and the second historical behavior data into the multi-task model for weight calculation to obtain a first attention vector and a second attention vector includes:
acquiring first preference characteristics of the first historical behavior data and second preference characteristics of the second historical behavior data;
performing feature conversion on the first preference feature and the second preference feature to obtain a first feature vector and a second feature vector;
and inputting the first feature vector and the second feature vector into the attention structure to perform weight calculation, so as to obtain a first attention vector and a second attention vector.
In some embodiments, the inputting the first attention vector, the second attention vector, the first historical behavior data and the second historical behavior data into the sharing layer for vector sharing and outputting a shared vector and feature sharing information includes:
inputting the first attention vector and the second attention vector into the sharing layer for vector splicing to obtain the shared vector;
and inputting the first historical behavior data and the second historical behavior data into the sharing layer for feature sharing to obtain the shared information.
In some embodiments, the sharing layer includes a position bias network, and the inputting the first historical behavior data and the second historical behavior data into the sharing layer for feature sharing to obtain the shared information includes:
determining first bias information corresponding to the first historical behavior data and second bias information corresponding to the second historical behavior data;
inputting the first bias information and the second bias information into the position bias network for position learning to obtain a first position vector and a second position vector;
performing feature stitching on the first position vector and the first historical behavior data to obtain first shared information;
performing feature stitching on the second position vector and the second historical behavior data to obtain second shared information;
and generating shared information according to the first shared information and the second shared information.
In some embodiments, the multi-task model includes a ranking model, and the feature learning is performed on the multi-task model according to the embedded vector, the first historical behavior data and the second historical behavior data to obtain a trained multi-task model, including:
inputting the embedded vector, the first historical behavior data and the second historical behavior data into the ranking model for back propagation to obtain a first error value and a second error value;
and reversely optimizing the embedded vector according to the first error value and the second error value to obtain a trained multi-task model.
In some embodiments, the inputting the embedded vector, the first historical behavior data and the second historical behavior data into a preset ranking model for back propagation to obtain a first error value and a second error value includes:
inputting the embedded vector, the first historical behavior data and the second historical behavior data into the preset ranking model, so that the ranking model performs nonlinear learning on the first historical behavior data and the embedded vector to obtain the first error value, and performs nonlinear learning on the second historical behavior data and the embedded vector to obtain the second error value.
In some embodiments, the inputting the first click information and the second click information into the trained multi-task model to perform click prediction to obtain first prediction information corresponding to the first click information and second prediction information corresponding to the second click information includes:
inputting the first click information and the second click information into the trained multi-task model, so that the multi-task model scores the first click information and the second click information to obtain a first prediction probability and a second prediction probability;
and determining first prediction information corresponding to the first click information according to the first prediction probability, and determining second prediction information corresponding to the second click information according to the second prediction probability.
To achieve the above object, a second aspect of the embodiments of the present application provides an information prediction apparatus based on a multi-task model, the multi-task model including a sharing layer, the apparatus including:
the historical behavior acquisition module is used for acquiring first historical behavior data of users in a first group and second historical behavior data of users in a second group, where the user types of the first group and the second group are different;
the weight calculation module is used for inputting the first historical behavior data and the second historical behavior data into the multi-task model to perform weight calculation so as to obtain a first attention vector and a second attention vector;
the vector sharing module is used for inputting the first attention vector, the second attention vector, the first historical behavior data and the second historical behavior data into the sharing layer for vector sharing and outputting a shared vector and feature sharing information;
the difference learning module is used for performing differential learning on the shared vector and the feature sharing information based on a preset gate mechanism to obtain an embedded vector;
the feature learning module is used for carrying out feature learning on the multi-task model according to the embedded vector, the first historical behavior data and the second historical behavior data to obtain a trained multi-task model;
the click information module is used for acquiring first click information of the users in the first group and second click information of the users in the second group;
and the probability prediction module is used for inputting the first click information and the second click information into the trained multi-task model to perform click prediction, so as to obtain first prediction information corresponding to the first click information and second prediction information corresponding to the second click information.
To achieve the above object, a third aspect of the embodiments of the present application provides an electronic device, which includes a memory and a processor, where the memory stores a computer program, and the processor implements the information prediction method based on the multi-task model according to the first aspect when executing the computer program.
To achieve the above object, a fourth aspect of the embodiments of the present application provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the information prediction method based on the multi-task model according to the first aspect.
According to the information prediction method and apparatus, the electronic device and the storage medium based on the multi-task model, first historical behavior data of users in the first group and second historical behavior data of users in the second group are first acquired, so that data information of different user types can be obtained; the first historical behavior data and the second historical behavior data are input into the sharing layer for vector sharing, and the shared vector and the feature sharing information are output, so that user interest is focused and large differences in information sparsity between groups are avoided; differential learning is then performed on the shared vector and the feature sharing information based on the preset gate mechanism to obtain the embedded vector, so that the differences between different groups can be learned and the effect for each group is improved; feature learning is then performed on the multi-task model according to the embedded vector, the first historical behavior data and the second historical behavior data to obtain the trained multi-task model, which improves the generalization and association performance of the multi-task model; finally, the first click information of the users in the first group and the second click information of the users in the second group are acquired and input into the trained multi-task model for click prediction to obtain the first prediction information corresponding to the first click information and the second prediction information corresponding to the second click information, thereby realizing prediction of user click behavior, increasing the flexibility of information prediction, and improving the relevance between users and items.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
It should be noted that although functional block division is performed in a device diagram and a logic sequence is shown in a flowchart, in some cases, the steps shown or described may be performed in a different order than the block division in the device, or in the flowchart. The terms first, second and the like in the description and in the claims and in the above-described figures, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the application only and is not intended to be limiting of the application.
First, several terms involved in the present application are explained:
Natural language processing (Natural Language Processing, NLP): NLP is a branch of artificial intelligence and an interdisciplinary field of computer science and linguistics, often referred to as computational linguistics; NLP processes, understands and applies human languages (e.g., Chinese, English, etc.). Natural language processing includes syntactic parsing, semantic analysis, discourse understanding, and the like. It is commonly used in technical fields such as machine translation, handwritten and printed character recognition, speech recognition and text-to-speech conversion, information intent recognition, information extraction and filtering, text classification and clustering, public opinion analysis and opinion mining, and involves data mining, machine learning, knowledge acquisition, knowledge engineering, artificial intelligence research, linguistic research related to language computation, and the like.
Multi-task model (Multi-gate Mixture-of-Experts, MMOE): MMOE models the relationships between tasks and learns the trade-off between task-specific and shared representations. It automatically allocates parameters to capture information shared between tasks or information unique to each task, and is easy to train, converging to a lower loss within a few training rounds.
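For ease of understanding only, the following is a minimal Python sketch of the MMOE forward pass described above; the number of experts, the layer sizes, the random inputs and the variable names are assumptions for illustration and do not limit the embodiments of the present application.

```python
# Minimal MMOE forward pass: shared experts, one softmax gate per task,
# and one task tower per task (illustrative sketch only).
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
input_dim, expert_dim, num_experts, num_tasks = 16, 8, 3, 2

expert_w = [rng.normal(size=(input_dim, expert_dim)) for _ in range(num_experts)]
gate_w = [rng.normal(size=(input_dim, num_experts)) for _ in range(num_tasks)]
tower_w = [rng.normal(size=(expert_dim,)) for _ in range(num_tasks)]

x = rng.normal(size=(input_dim,))                        # shared input features
expert_out = np.stack([relu(x @ w) for w in expert_w])   # (num_experts, expert_dim)

for task in range(num_tasks):
    gate = softmax(x @ gate_w[task])        # per-task mixture weights over experts
    mixed = gate @ expert_out               # task-specific combination of experts
    score = sigmoid(mixed @ tower_w[task])  # task output
    print(f"task {task}: gate={np.round(gate, 3)}, score={score:.3f}")
```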
Multilayer perceptron (Multilayer Perceptron, MLP): the multilayer perceptron is a feed-forward artificial neural network (Artificial Neural Network, ANN) model that maps a set of input vectors to a set of output vectors. Besides the input layer and the output layer, it may contain multiple hidden layers in between; the simplest MLP contains only one hidden layer, i.e., a three-layer structure.
Sigmoid function: the sigmoid function is a common S-shaped function in biology, also known as the S-shaped growth curve. In information science, the sigmoid function is often used as an activation function for neural networks because it is monotonically increasing and has a monotonically increasing inverse; it maps a variable into the interval (0, 1).
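As a simple illustration, the sigmoid function can be written as sigma(x) = 1 / (1 + e^(-x)); the short Python sketch below shows how it maps real-valued inputs into the interval (0, 1).

```python
# Illustrative sketch of the sigmoid activation described above.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

print(sigmoid(np.array([-5.0, 0.0, 5.0])))  # approximately [0.0067, 0.5, 0.9933]
```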
Based on the above, embodiments of the present application provide an information prediction method and apparatus based on a multi-task model, an electronic device, and a storage medium, which can increase the flexibility of information prediction and improve the relevance between users and items.
The information prediction method and device based on the multi-task model, the electronic device and the storage medium provided by the embodiment of the application are specifically described through the following embodiments, and the information prediction method based on the multi-task model in the embodiment of the application is described first.
The embodiments of the present application can acquire and process related data based on artificial intelligence technology. Artificial intelligence (Artificial Intelligence, AI) is the theory, method, technique, and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain optimal results.
Artificial intelligence infrastructure technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technologies mainly include computer vision, robotics, biometric recognition, module management for online customer service systems, natural language processing, machine learning/deep learning, and other directions.
The embodiments of the present application provide an information prediction method based on a multi-task model, and relate to the technical field of data processing. The information prediction method based on the multi-task model provided by the embodiments of the present application can be applied to a terminal, a server, or software running in the terminal or the server. In some embodiments, the terminal may be a smartphone, a tablet computer, a notebook computer, a desktop computer, etc.; the server may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDNs, and basic cloud computing services such as big data and artificial intelligence platforms; the software may be an application that implements the information prediction method based on the multi-task model, but the above forms are not limiting.
The application is operational with numerous general purpose or special purpose computer system environments or configurations. Such as a personal computer, a server computer, a hand-held or portable device, a tablet device, a multiprocessor system, a microprocessor-based system, a set top box, a programmable consumer electronics, a network PC, a minicomputer, a mainframe computer, a distributed computing environment that includes any of the above systems or devices, and the like. The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
In the embodiments of the present application, when related processing is performed according to user information, user behavior data, user history data, user location information, and other data related to user identity or characteristics, permission or consent of the user is obtained first, and the collection, use, processing, and the like of the data comply with related laws and regulations and standards of related countries and regions. In addition, when the embodiment of the application needs to acquire the sensitive personal information of the user, the independent permission or independent consent of the user is acquired through popup or jump to a confirmation page and the like, and after the independent permission or independent consent of the user is definitely acquired, the necessary relevant data of the user for enabling the embodiment of the application to normally operate is acquired.
The financial market is complex, and many different factors influence the price and yield of financial products. Financial institutions therefore need to manage risk effectively so that investors can quickly make decisions according to their needs; to this end, financial institutions usually adopt information recommendation to prompt risks and recommend products.
In information recommendation scenarios, there is a need to recommend information to users in a targeted manner in order to improve the benefits of the recommendation platform and the information providers as well as the user experience. For example, when recommending financial products to a user, a financial recommendation platform first needs to know the user's click preferences and then recommend advertisements or links related to those preferences. Multi-group user modeling methods commonly used in the industry generally include group recommendation based on users' basic attributes, user group recommendation based on clustering algorithms, and group recommendation based on business rules.
However, group recommendation based on users' basic attributes is tied to attribute-based grouping, which makes the recommendations monotonous and produces an obvious Matthew effect; for example, the platform keeps recommending the financial products a user frequently browses and ignores the financial products the user has not clicked. User grouping based on a clustering algorithm produces untrustworthy results to some extent; for example, a user browses car insurance but the platform recommends care insurance to the user. Group recommendation based on business rules consumes a large amount of manual operation cost and is inflexible, time-consuming and labor-intensive. Therefore, most user grouping commonly used in the industry only achieves group recommendation based on data statistics or unsupervised algorithms; in practice, these methods cannot measure the similarity between users and items, lack personalization, and perform poorly online.
In view of the above technical problems, the present application provides an information prediction method based on a multi-task model, which includes the following steps.
Fig. 1 is an optional flowchart of a method for predicting information based on a multitasking model according to an embodiment of the present application, where the method in fig. 1 may include, but is not limited to, steps S101 to S107.
It should be noted that the multi-task model includes a sharing layer, and the multi-task model is an MMOE model.
Step S101, acquiring first historical behavior data of users in a first group and second historical behavior data of users in a second group;
the user types of the first group and the second group are different.
In step S101 of some embodiments, first historical behavior data of users in a first group and second historical behavior data of users in a second group are obtained, so that data information of different user types can be obtained.
It should be noted that the first group may be a group of financial agents, engineers, teachers, doctors, financial analysts, etc., and the second group may be a group of non-agents, trial users, students, patients, etc., which is not specifically limited in this embodiment.
It is noted that the first historical behavior data and the second historical behavior data include preference features of the first group and the second group, such as profile features and click-and-forward behavior, for example, the financial products and popular financial science content frequently browsed by the first group and the second group, as well as information such as the content dimension of the click behavior, where the content dimension includes features such as consulting topic information and labels, for example, the name, amount and coverage term of the related financial product.
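By way of a purely hypothetical example, a single historical behavior record might be organized as follows; all field names and values below are assumptions for illustration and are not mandated by this embodiment.

```python
# Hypothetical structure of one historical behavior record (illustration only;
# the field names and values are not part of the claimed method).
first_history_record = {
    "group": "financial_agent",                              # user of the first group
    "profile": {"age": 35, "gender": "F"},                   # profile features
    "click_forward_behavior": {"clicks": 4, "forwards": 2, "bookmarks": 3},
    "content_dimension": {                                   # content dimension of the click behavior
        "consulting_topic": "endowment insurance",
        "labels": ["insurance", "retirement"],
        "product_name": "Endowment Plan A",                  # hypothetical product name
        "amount": 100000,
        "coverage_term_years": 20,
    },
}

second_history_record = {
    "group": "trial_user",                                   # user of the second group
    "profile": {"age": 22, "gender": "M"},
    "click_forward_behavior": {"clicks": 1, "forwards": 0, "bookmarks": 0},
    "content_dimension": {
        "consulting_topic": "funds",
        "labels": ["fund"],
        "product_name": "Money Fund B",
        "amount": 500,
        "coverage_term_years": 0,
    },
}
```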
Step S102, inputting the first historical behavior data and the second historical behavior data into a multi-task model for weight calculation to obtain a first attention vector and a second attention vector;
In step S102 of some embodiments, the first historical behavior data and the second historical behavior data are input into a multitasking model to perform weight calculation, so as to obtain a first attention vector and a second attention vector, thereby avoiding the influence caused by user interest drift and focusing the user interest.
Step S103, inputting the first attention vector, the second attention vector, the first historical behavior data and the second historical behavior data into the sharing layer for vector sharing, and outputting a shared vector and feature sharing information;
In step S103 of some embodiments, the first attention vector, the second attention vector, the first historical behavior data and the second historical behavior data are input into the sharing layer for vector sharing, and the shared vector and the feature sharing information are output, so that the problem of large differences in feature sparsity between different groups is avoided and the problem of model learning bias is solved.
Through sharing, the features of the group with rich behaviors can be indirectly used as a supplement, so that the influence caused by the sparsity of group features is reduced.
Step S104, differential learning is carried out on the shared vector and the characteristic shared information based on a preset gate mechanism, so as to obtain an embedded vector;
It should be noted that the multi-task model includes multiple expert networks, where the number of expert networks corresponds to the number of groups, and a gate mechanism is preset in each expert network.
In step S104 of some embodiments, the shared vector and the feature sharing information are input into the multiple expert networks, so that the multiple expert networks perform differential learning on the shared vector and the feature sharing information based on the preset gate mechanism to obtain the embedded vector, thereby enabling the differences between different groups to be learned and improving the effect for each group.
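For illustration only, the following non-limiting sketch shows how the shared vector and the feature sharing information may be combined and passed through the expert networks, with a softmax gate per group producing a group-specific embedded vector; the dimensions, the tanh expert activation and the random values are assumptions.

```python
# Illustrative sketch of step S104: per-group gates mix the expert outputs
# computed from the shared vector and the feature sharing information.
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

rng = np.random.default_rng(1)
shared_vector = rng.normal(size=(8,))          # from vector splicing (step S103)
feature_sharing_info = rng.normal(size=(6,))   # from feature sharing (step S103)
gate_input = np.concatenate([shared_vector, feature_sharing_info])

num_groups, num_experts, embed_dim = 2, 3, 4
expert_w = [rng.normal(size=(gate_input.size, embed_dim)) for _ in range(num_experts)]
gate_w = [rng.normal(size=(gate_input.size, num_experts)) for _ in range(num_groups)]

expert_out = np.stack([np.tanh(gate_input @ w) for w in expert_w])
for g in range(num_groups):
    gate = softmax(gate_input @ gate_w[g])     # group-specific mixing weights
    embedded_vector = gate @ expert_out        # embedded vector for this group
    print(f"group {g}: embedded vector {np.round(embedded_vector, 3)}")
```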
Step S105, performing feature learning on the multi-task model according to the embedded vector, the first historical behavior data and the second historical behavior data to obtain a trained multi-task model;
In step S105 of some embodiments, feature learning is performed on the multi-task model according to the embedded vector, the first historical behavior data and the second historical behavior data, so as to obtain a trained multi-task model, and improve generalization performance of the multi-task model and accuracy of prediction probability.
Step S106, acquiring first click information of the users in the first group and second click information of the users in the second group;
In step S106 of some embodiments, first click information of the users in the first group and second click information of the users in the second group are obtained, which facilitates the subsequent prediction of the click information of users in different groups by the multi-task model and further mines the association between users and items.
Step S107, inputting the first click information and the second click information into the trained multi-task model to perform click prediction, and obtaining first prediction information corresponding to the first click information and second prediction information corresponding to the second click information.
In step S107 of some embodiments, the first click information and the second click information are input into the trained multi-task model to perform click prediction, so as to obtain first prediction information corresponding to the first click information and second prediction information corresponding to the second click information, thereby realizing prediction of user click behavior, increasing the flexibility of information prediction, and improving the relevance between users and items.
Through steps S101 to S107 of the embodiments of the present application, first historical behavior data of users in the first group and second historical behavior data of users in the second group are acquired, so that data information of different user types can be obtained; the first historical behavior data and the second historical behavior data are input into the sharing layer for vector sharing, and the shared vector and the feature sharing information are output, so that user interest is focused and large differences in information sparsity between groups are avoided; differential learning is then performed on the shared vector and the feature sharing information based on the preset gate mechanism to obtain the embedded vector, so that the differences between different groups can be learned; feature learning is then performed on the multi-task model according to the embedded vector, the first historical behavior data and the second historical behavior data to obtain the trained multi-task model, which improves the generalization and association performance of the multi-task model; finally, the first click information of the users in the first group and the second click information of the users in the second group are acquired and input into the trained multi-task model for click prediction to obtain the first prediction information corresponding to the first click information and the second prediction information corresponding to the second click information, thereby increasing the flexibility of information prediction and improving the relevance between users and items.
Referring to fig. 2, in some embodiments, step S102 may include, but is not limited to, steps S201 to S203:
it should be noted that the multitasking model includes an attention structure.
Step S201, acquiring first preference characteristics of first historical behavior data and second preference characteristics of second historical behavior data;
in step S201 of some embodiments, the first preference feature of the first historical behavior data and the second preference feature of the second historical behavior data are acquired, so that interest features of users in different groups are obtained, and subsequent operations such as feature sharing are facilitated.
It should be noted that the first preference feature includes features such as the profile features and click-and-forward behavior of the users in the first group, and the second preference feature includes features such as the profile features and click-and-forward behavior of the users in the second group, which is not limited in this embodiment.
It can be understood that the profile features of a user include, but are not limited to, basic profile information such as the user's age and gender, and the click-and-forward behavior includes, but is not limited to, the user's number of clicks, number of forwards, number of bookmarks, etc., for example, clicking on an insurance product four times, forwarding child care insurance twice, and bookmarking car insurance three times.
Step S202, performing feature conversion on the first preference feature and the second preference feature to obtain a first feature vector and a second feature vector;
In step S202 of some embodiments, feature conversion is performed on the first preference feature and the second preference feature. In the process of feature conversion, the first preference feature and the second preference feature are first scored to obtain a first feature value corresponding to the first preference feature and a second feature value corresponding to the second preference feature, and then the first feature value and the second feature value are represented as vectors; for example, the first feature value 1 is converted into the vector [0.1, 0.3, 0.4] and the second feature value 0 is converted into the vector [0.3, 0.4, 0.5], so as to obtain the first feature vector and the second feature vector. In this way, a large number of feature values are expressed as vectors, which improves the convergence of the multi-task model.
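As a purely illustrative sketch of the feature conversion described above, preference features may first be scored to feature values, and the values then looked up in an embedding table to obtain dense feature vectors; the scoring rule, the table entries and the feature names below are assumptions.

```python
# Illustrative sketch of step S202: score preference features, then map the
# feature values to vectors (values mirror the example in the text).
import numpy as np

def score_preference(feature: str) -> int:
    # hypothetical scoring rule: clicked features score 1, merely viewed ones 0
    return 1 if feature.endswith("_clicked") else 0

embedding_table = {                      # hypothetical value-to-vector mapping
    1: np.array([0.1, 0.3, 0.4]),
    0: np.array([0.3, 0.4, 0.5]),
}

first_preference = "endowment_insurance_clicked"
second_preference = "fund_article_viewed"

first_feature_vector = embedding_table[score_preference(first_preference)]
second_feature_vector = embedding_table[score_preference(second_preference)]
print(first_feature_vector, second_feature_vector)
```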
In step S203, the first feature vector and the second feature vector are input into the attention structure to perform weight calculation, so as to obtain a first attention vector and a second attention vector.
In step S203 of some embodiments, the first feature vector and the second feature vector are input into the attention structure to perform weight calculation, so as to obtain the first attention vector and the second attention vector, thereby avoiding the influence of user interest drift and focusing on the user interest.
It will be appreciated that, for users whose historical behavior data has been acquired, the learned weights are applicable to new users of the same type. For example, if the historical behavior data of agent A has been acquired, the weights learned for agent A are applicable to agent B of the same type; for instance, if agent A is a financial analyst whose related historical behavior data is "how to gain insight into the financial market", then for agent B, who is also a financial analyst, the weights for "how to gain insight into the financial market" can be directly applied to agent B.
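For illustration only, the following minimal sketch shows one possible attention structure: a dot-product attention that assigns a weight to each behavior feature vector and outputs their weighted sum as the attention vector; the mean-pooled query and the dimensions are assumptions and do not limit the embodiments.

```python
# Illustrative dot-product attention over behavior feature vectors (step S203).
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention(feature_vectors: np.ndarray) -> np.ndarray:
    # feature_vectors: (num_behaviors, dim) for the users of one group
    query = feature_vectors.mean(axis=0)        # hypothetical query construction
    weights = softmax(feature_vectors @ query)  # one attention weight per behavior
    return weights @ feature_vectors            # attention vector

rng = np.random.default_rng(2)
first_feature_vectors = rng.normal(size=(5, 3))   # first group behaviors
second_feature_vectors = rng.normal(size=(4, 3))  # second group behaviors
first_attention_vector = attention(first_feature_vectors)
second_attention_vector = attention(second_feature_vectors)
print(first_attention_vector, second_attention_vector)
```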
Referring to fig. 3, in some embodiments, step S103 may include, but is not limited to, steps S301 to S302:
Step S301, inputting the first attention vector and the second attention vector into the sharing layer for vector splicing to obtain the shared vector;
In step S301 of some embodiments, the first attention vector and the second attention vector are input into the sharing layer for vector splicing to obtain the shared vector, thereby avoiding the problem of large differences in feature sparsity between different groups and solving the problem of model learning bias.
It should be noted that vector splicing means that the first attention vector and the second attention vector are concatenated end to end, so that a larger shared vector is obtained, which facilitates the training of the subsequent model.
Step S302, the first historical behavior data and the second historical behavior data are input into a sharing layer to perform feature sharing, and sharing information is obtained.
In step S302 of some embodiments, the first historical behavior data and the second historical behavior data are input into the sharing layer to perform feature sharing, so as to obtain sharing information, and features of the group with abundant behaviors can be indirectly used as a certain supplement through sharing, so that influence caused by sparsity of group features is reduced.
Referring to fig. 4, in some embodiments, step S302 may include, but is not limited to, steps S401 to S405:
It should be noted that the sharing layer includes a position bias network.
Step S401, determining first bias information corresponding to the first historical behavior data and second bias information corresponding to the second historical behavior data;
In step S401 of some embodiments, first bias information corresponding to the first historical behavior data and second bias information corresponding to the second historical behavior data are determined, avoiding the influence of bias in the first historical behavior data and the second historical behavior data.
Step S402, inputting the first bias information and the second bias information into the position bias network for position learning to obtain a first position vector and a second position vector;
In step S402 of some embodiments, the first bias information and the second bias information are input into the position bias network for position learning to obtain the first position vector and the second position vector, thereby avoiding the influence of bias, improving the overall recommendation effect, and giving more other products exposure opportunities.
It can be understood that a user may frequently browse recommended links related to "endowment insurance", "car insurance" and "currency", and rarely browse recommended links related to "child care insurance", "income tax" and "funds". To avoid the Matthew effect, the first bias information corresponding to the first historical behavior data and the second bias information corresponding to the second historical behavior data need to be determined, thereby improving the accuracy and fairness of the information and enhancing its credibility.
It should be noted that certain bias exists in the first historical behavior data and the second historical behavior data, and this bias is mostly concentrated in position bias, that is, different display positions have different influences on the click-through rate; for example, products displayed at the front have a higher click-through rate, while products displayed at the back have a lower click-through rate. Therefore, the influence of bias needs to be avoided by determining the first bias information and the second bias information.
Step S403, performing feature stitching on the first position vector and the first historical behavior data to obtain first shared information;
step S404, performing feature stitching on the second position vector and the second historical behavior data to obtain second shared information;
step S405, generating shared information according to the first shared information and the second shared information.
In steps S403 to S405 of some embodiments, the first position vector and the second position vector output by the position bias network are feature-spliced with the first historical behavior data and the second historical behavior data respectively to obtain the first shared information and the second shared information, and then the first shared information and the second shared information are integrated to obtain the shared information. This improves the overall recommendation effect, avoids the influence of position bias, and avoids the Matthew effect of recommending only the financial products or financial articles that a user frequently browses or clicks.
In the process of feature stitching, the first position vector is stitched with a plurality of features in the first historical behavior data, and the second position vector is stitched with a plurality of features in the second historical behavior data, so that the accuracy of prediction is improved.
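The following non-limiting sketch illustrates steps S401 to S405: a small position bias network maps each display position to a position vector, which is then spliced with the behavior features to form the shared information; the one-hot position encoding, the tanh activation and all dimensions are assumptions for illustration.

```python
# Illustrative sketch of the position bias network and feature splicing.
import numpy as np

rng = np.random.default_rng(3)
num_positions, pos_dim, feat_dim = 10, 4, 6
position_weights = rng.normal(size=(num_positions, pos_dim))  # position bias network parameters

def position_vector(display_position: int) -> np.ndarray:
    one_hot = np.zeros(num_positions)
    one_hot[display_position] = 1.0
    return np.tanh(one_hot @ position_weights)   # learned position vector

first_position = 0        # e.g. an item displayed at the top of the page
second_position = 7       # e.g. an item displayed far down the page
first_behavior_features = rng.normal(size=(feat_dim,))
second_behavior_features = rng.normal(size=(feat_dim,))

# feature splicing: position vector concatenated with the behavior features
first_shared_info = np.concatenate([position_vector(first_position), first_behavior_features])
second_shared_info = np.concatenate([position_vector(second_position), second_behavior_features])
shared_info = np.stack([first_shared_info, second_shared_info])
print(shared_info.shape)  # (2, pos_dim + feat_dim)
```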
Referring to fig. 5, in some embodiments, step S105 may further include, but is not limited to, steps S501 to S502:
It should be noted that the multi-task model includes a ranking model, where the ranking model includes a multilayer perceptron.
Step S501, inputting the embedded vector, the first historical behavior data and the second historical behavior data into the ranking model for back propagation to obtain a first error value and a second error value;
It should be noted that back propagation is a learning mechanism of neural networks, and the ranking model includes an input layer, a hidden layer, an output layer, and the like.
In step S501 of some embodiments, the embedded vector, the first historical behavior data and the second historical behavior data are input into the ranking model for back propagation to obtain the first error value and the second error value, so that the errors between the estimated values based on the embedded vector and the actual values of the first historical behavior data and the second historical behavior data can be calculated, which facilitates the subsequent reverse optimization.
And step S502, reversely optimizing the embedded vector according to the first error value and the second error value to obtain a trained multi-task model.
In step S502 of some embodiments, the first error value and the second error value are propagated back from the output layer to the hidden layer and then to the input layer, so that the reverse optimization process is completed and the trained multi-task model is obtained, which improves the generalization performance of the multi-task model and the accuracy of the prediction probability.
Referring to fig. 6, in some embodiments, step S501 includes, but is not limited to, step S601:
Step S601, inputting the embedded vector, the first historical behavior data and the second historical behavior data into a preset ranking model, so that the ranking model performs nonlinear learning on the first historical behavior data and the embedded vector to obtain a first error value, and performs nonlinear learning on the second historical behavior data and the embedded vector to obtain a second error value.
In step S601 of some embodiments, after the embedded vector, the first historical behavior data and the second historical behavior data are input into the preset ranking model, the ranking model performs nonlinear learning on the first historical behavior data and the embedded vector, so as to learn the relationship between the embedded vector and the feature information in the first historical behavior data and obtain the first error value; similarly, the ranking model performs nonlinear learning on the second historical behavior data and the embedded vector, so as to learn the relationship between the embedded vector and the feature information in the second historical behavior data and obtain the second error value, which facilitates the subsequent reverse optimization and improves the generalization performance of the multi-task model.
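For illustration only, the following sketch reduces each task tower of the ranking model to a single logistic layer so that the two error values and the reverse optimization of the embedded vector can be shown explicitly; the labels, the learning rate and the single-layer towers are simplifying assumptions (an actual ranking model may use a multilayer perceptron).

```python
# Illustrative sketch of steps S501-S502: compute the two task errors and
# back-propagate them to reverse-optimize the shared embedded vector.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce(p, y):  # binary cross-entropy error
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

rng = np.random.default_rng(4)
embed_dim, feat_dim, lr = 4, 6, 0.1
embedded = rng.normal(size=(embed_dim,))        # embedded vector from step S104
first_features = rng.normal(size=(feat_dim,))   # from first historical behavior data
second_features = rng.normal(size=(feat_dim,))  # from second historical behavior data
first_label, second_label = 1.0, 0.0            # hypothetical click labels
w1 = rng.normal(size=(embed_dim + feat_dim,))   # tower for the first-group task
w2 = rng.normal(size=(embed_dim + feat_dim,))   # tower for the second-group task

for step in range(5):
    p1 = sigmoid(w1 @ np.concatenate([embedded, first_features]))
    p2 = sigmoid(w2 @ np.concatenate([embedded, second_features]))
    first_error, second_error = bce(p1, first_label), bce(p2, second_label)
    # gradients of the two errors with respect to the embedded vector
    grad = (p1 - first_label) * w1[:embed_dim] + (p2 - second_label) * w2[:embed_dim]
    embedded -= lr * grad                       # reverse optimization of the embedded vector
    print(f"step {step}: first error {first_error:.4f}, second error {second_error:.4f}")
```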
Referring to fig. 7, in some embodiments, step S107 includes, but is not limited to, steps S701 to S702:
It should be noted that the multi-task model includes a sigmoid function.
Step S701, inputting the first click information and the second click information into the trained multi-task model, so that the multi-task model scores the first click information and the second click information to obtain a first prediction probability and a second prediction probability;
step S702, determining first prediction information corresponding to the first click information according to the first prediction probability, and determining second prediction information corresponding to the second click information according to the second prediction probability.
In steps S701 to S702 of some embodiments, the first click information and the second click information are input into the trained multi-task model, so that the sigmoid function in the multi-task model scores the first click information and the second click information to obtain the first prediction probability and the second prediction probability, which improves the accuracy of predicting the user's click information; then the first prediction information corresponding to the first click information is determined according to the first prediction probability, and the second prediction information corresponding to the second click information is determined according to the second prediction probability, thereby realizing prediction of user click behavior, increasing the flexibility of information prediction, and improving the relevance between users and items.
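As a final non-limiting illustration, the sketch below scores encoded click information with a sigmoid function and converts the resulting probabilities into prediction information using a hypothetical 0.5 threshold; the weight vector standing in for the trained model and the threshold are assumptions.

```python
# Illustrative sketch of steps S701-S702: sigmoid scoring and deriving the
# prediction information from the prediction probabilities.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(5)
dim = 8
w = rng.normal(size=(dim,))                   # stands in for the trained multi-task model
first_click_info = rng.normal(size=(dim,))    # encoded click information, first group
second_click_info = rng.normal(size=(dim,))   # encoded click information, second group

first_prob = sigmoid(w @ first_click_info)    # first prediction probability
second_prob = sigmoid(w @ second_click_info)  # second prediction probability

def to_prediction(prob: float) -> str:
    return "likely to click - recommend" if prob > 0.5 else "unlikely to click"

print(f"first group:  p={first_prob:.3f} -> {to_prediction(first_prob)}")
print(f"second group: p={second_prob:.3f} -> {to_prediction(second_prob)}")
```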
Referring to fig. 8, the embodiment of the application further provides an information prediction apparatus based on a multi-task model, where the multi-task model includes a sharing layer, the apparatus includes:
A historical behavior acquisition module 801, configured to acquire first historical behavior data of a user in a first group and second historical behavior data of a user in a second group, where user types of the first group and the second group are different;
The weight calculation module 802 is configured to input the first historical behavior data and the second historical behavior data into the multitask model for weight calculation, so as to obtain a first attention vector and a second attention vector;
The vector sharing module 803 is configured to input the first attention vector, the second attention vector, the first historical behavior data, and the second historical behavior data into the sharing layer to perform vector sharing, and output a sharing vector and feature sharing information;
The difference learning module 804 is configured to perform difference learning on the shared vector and the feature shared information based on a preset gate mechanism to obtain an embedded vector;
the feature learning module 805 is configured to perform feature learning on the multi-task model according to the embedded vector, the first historical behavior data, and the second historical behavior data, to obtain a trained multi-task model;
The click information module 806 is configured to obtain first click information of the user in the first group and second click information of the user in the second group;
The probability prediction module 807 is configured to input the first click information and the second click information into the trained multi-task model to perform click prediction, so as to obtain first prediction information corresponding to the first click information and second prediction information corresponding to the second click information.
The specific implementation manner of the information prediction device based on the multi-task model is basically the same as the specific embodiment of the information prediction method based on the multi-task model, and is not described herein.
The embodiment of the application also provides electronic equipment, which comprises a memory, a processor, a program stored on the memory and capable of running on the processor and a data bus for realizing connection communication between the processor and the memory, wherein the program is executed by the processor to realize the information prediction method based on the multi-task model. The electronic equipment can be any intelligent terminal including a tablet personal computer, a vehicle-mounted computer and the like.
Referring to fig. 9, fig. 9 illustrates a hardware structure of an electronic device according to another embodiment, the electronic device includes:
the processor 901 may be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), or one or more integrated circuits, and is configured to execute related programs to implement the technical solutions provided by the embodiments of the present application;
the memory 902 may be implemented in the form of a Read-Only Memory (ROM), a static storage device, a dynamic storage device, or a Random Access Memory (RAM). The memory 902 may store an operating system and other application programs. When the technical solutions provided in the embodiments of the present disclosure are implemented by software or firmware, the relevant program codes are stored in the memory 902 and invoked by the processor 901 to execute the information prediction method based on the multi-task model of the embodiments of the present disclosure;
An input/output interface 903 for inputting and outputting information;
The communication interface 904 is configured to implement communication interaction between the device and other devices, and may implement communication in a wired manner (e.g. USB, network cable, etc.), or may implement communication in a wireless manner (e.g. mobile network, WIFI, bluetooth, etc.);
A bus 905 that transfers information between the various components of the device (e.g., the processor 901, the memory 902, the input/output interface 903, and the communication interface 904);
wherein the processor 901, the memory 902, the input/output interface 903 and the communication interface 904 are communicatively coupled to each other within the device via a bus 905.
The embodiment of the application also provides a computer readable storage medium, wherein the computer readable storage medium stores a computer program, and the computer program realizes the information prediction method based on the multi-task model when being executed by a processor.
The memory, as a non-transitory computer readable storage medium, may be used to store non-transitory software programs as well as non-transitory computer executable programs. In addition, the memory may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory optionally includes memory remotely located relative to the processor, the remote memory being connectable to the processor through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
According to the information prediction method and apparatus, the electronic device and the storage medium based on the multi-task model provided by the embodiments of the present application, first historical behavior data of users in the first group and second historical behavior data of users in the second group are first acquired, so that data information of different user types can be obtained; the first historical behavior data and the second historical behavior data are input into the sharing layer for vector sharing, and the shared vector and the feature sharing information are output, so that user interest is focused and large differences in information sparsity between groups are avoided; differential learning is then performed on the shared vector and the feature sharing information based on the preset gate mechanism to obtain the embedded vector, so that the differences between different groups can be learned and the effect for each group is improved; feature learning is then performed on the multi-task model according to the embedded vector, the first historical behavior data and the second historical behavior data to obtain the trained multi-task model, which improves the generalization and association performance of the multi-task model; finally, the first click information of the users in the first group and the second click information of the users in the second group are acquired and input into the trained multi-task model for click prediction to obtain the first prediction information corresponding to the first click information and the second prediction information corresponding to the second click information, thereby increasing the flexibility of information prediction and improving the relevance between users and items.
The embodiments described in the embodiments of the present application are for more clearly describing the technical solutions of the embodiments of the present application, and do not constitute a limitation on the technical solutions provided by the embodiments of the present application, and those skilled in the art can know that, with the evolution of technology and the appearance of new application scenarios, the technical solutions provided by the embodiments of the present application are equally applicable to similar technical problems.
It will be appreciated by those skilled in the art that the solutions shown in fig. 1-7 are not limiting on the embodiments of the application and may include more or fewer steps than shown, or certain steps may be combined, or different steps.
The above described apparatus embodiments are merely illustrative, wherein the units illustrated as separate components may or may not be physically separate, i.e. may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
Those of ordinary skill in the art will appreciate that all or some of the steps of the methods, systems, functional modules/units in the devices disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof.
The terms "first," "second," "third," "fourth," and the like in the description of the application and in the above figures, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the application described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be understood that in the present application, "at least one (item)" means one or more, and "a plurality" means two or more. "and/or" is used to describe an association relationship of an associated object, and indicates that three relationships may exist, for example, "a and/or B" may indicate that only a exists, only B exists, and three cases of a and B exist simultaneously, where a and B may be singular or plural. The character "/" generally indicates that the context-dependent object is an "or" relationship. "at least one of" or the like means any combination of these items, including any combination of single item(s) or plural items(s). For example, at least one of a, b or c may represent a, b, c, "a and b", "a and c", "b and c", or "a and b and c", wherein a, b, c may be single or plural.
In the several embodiments provided by the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the above-described division of units is merely a logical function division, and there may be another division manner in actual implementation, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described above as separate components may or may not be physically separate, and components shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application may be embodied, in essence or in the part contributing to the prior art, or in all or part of the technical solution, in the form of a software product stored in a storage medium, including multiple instructions to cause a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the methods of the various embodiments of the present application. The storage medium includes various media capable of storing programs, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The preferred embodiments of the present application have been described above with reference to the accompanying drawings, and are not thereby limiting the scope of the claims of the embodiments of the present application. Any modifications, equivalent substitutions and improvements made by those skilled in the art without departing from the scope and spirit of the embodiments of the present application shall fall within the scope of the claims of the embodiments of the present application.