CN115640398A - Comment generation model training method, comment generation method and device, and storage medium - Google Patents
- Publication number
- CN115640398A (application CN202211348439.4A)
- Authority
- CN
- China
- Prior art keywords
- comment
- text
- sample
- target
- trained
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The embodiments of the application provide a comment generation model training method, a comment generation method, a device, and a storage medium. The training method comprises the following steps: acquiring a first sample comment text set in a music scene; adding sample comment prompt information at the sample position of each first sample comment text in the first sample comment text set; inputting each first sample comment text with the added sample comment prompt information into a pre-trained comment generation model for category prediction, so as to obtain the prediction category, output by the pre-trained comment generation model, corresponding to each first sample comment text; and performing loss calculation according to the prediction category and the category label corresponding to each first sample comment text, and adjusting the pre-trained comment generation model accordingly to obtain a trained comment generation model. The trained comment generation model is used for generating comment texts under given comment categories; in this way, a comment generation model that automatically generates comment texts can be obtained through training, and comment texts can be generated intelligently and quickly.
Description
Technical Field
The application relates to the technical field of the internet, and in particular to a comment generation model training method, a comment generation method, a comment generation device, and a storage medium.
Background
With the rapid development of internet technology, music websites and mobile music applications offer countless songs, so users can enjoy all kinds of music anytime and anywhere, which greatly meets people's needs. In the prior art, after enjoying a song, a user can write a comment about the song or its singer in the music comment area, but this approach is not intelligent enough and the efficiency of generating comment text is low.
Disclosure of Invention
The embodiments of the application provide a comment generation model training method, a comment generation method, a device, and a storage medium, so that comment texts under given comment categories can be generated intelligently and quickly.
In one aspect, an embodiment of the present application provides a comment generation model training method, where the method includes:
the method comprises the steps of obtaining a first sample comment text set in a music scene, wherein the first sample comment text set comprises a plurality of first sample comment texts and a category label corresponding to each first sample comment text;
adding sample comment prompt information at the sample position of each first sample comment text, wherein the sample comment prompt information is used for indicating the comment category of each first sample comment text;
inputting each first sample comment text added with sample comment prompt information into a pre-trained comment generation model for class prediction to obtain a prediction class corresponding to each first sample comment text output by the pre-trained comment generation model;
performing loss calculation according to the prediction category and the category label corresponding to each first sample comment text to obtain a total loss value;
and adjusting the pre-trained comment generation model based on the total loss value to obtain a trained comment generation model, wherein the trained comment generation model is used for generating comment texts under comment categories.
In one aspect, an embodiment of the present application provides a comment generating method, where the method includes:
acquiring a target music identifier and comment prompt information corresponding to the target music identifier, wherein the comment prompt information is used for indicating a target comment category of a generated comment text;
inputting the comment prompt information and the target music identification into the trained comment generating model in the comment generating model training method, and obtaining candidate comment texts output by the trained comment generating model and under the target comment category;
and obtaining a target comment text according to the candidate comment text output by the trained comment generation model, wherein the target comment text comprises music information corresponding to the target music identifier.
In one aspect, an embodiment of the present application provides a comment generation model training apparatus, where the apparatus includes:
the obtaining unit is used for obtaining a first sample comment text set in a music scene, wherein the first sample comment text set comprises a plurality of first sample comment texts and a category label corresponding to each first sample comment text;
the processing unit is used for adding sample comment prompt information at the sample position of each first sample comment text, and the sample comment prompt information is used for indicating the comment category of each first sample comment text;
the processing unit is further configured to input each first sample comment text to which the sample comment prompt information is added into a pre-trained comment generating model for category prediction, so as to obtain a prediction category corresponding to each first sample comment text output by the pre-trained comment generating model;
the processing unit is further configured to perform loss calculation according to the prediction category and the category label corresponding to each first sample comment text to obtain a total loss value;
the processing unit is further configured to adjust the pre-trained comment generation model based on the total loss value to obtain a trained comment generation model, and the trained comment generation model is used for generating comment texts in comment categories.
In one aspect, an embodiment of the present application provides a comment generating apparatus, where the apparatus includes:
the obtaining unit is used for acquiring a target music identifier and comment prompt information corresponding to the target music identifier, wherein the comment prompt information is used for indicating a target comment category of a generated comment text;
the processing unit is used for inputting the comment prompt information and the target music identification into the comment generating model trained by the comment generating model training device, and obtaining candidate comment texts under the target comment category output by the trained comment generating model;
and the processing unit is further configured to obtain a target comment text according to the candidate comment text output by the trained comment generation model, wherein the target comment text comprises music information corresponding to the target music identifier.
In one aspect, an embodiment of the present application provides a computer device, where the computer device includes a memory and a processor, and the memory stores a computer program, and when the computer program is executed by the processor, the processor executes a training method of the comment generation model or executes the comment generation method.
In one aspect, an embodiment of the present application provides a computer-readable storage medium, where a computer program is stored, and when the computer program is read and executed by a processor of a computer device, the computer device is caused to execute the training method of the comment generation model or execute the comment generation method.
In one aspect, embodiments of the present application provide a computer program product, where the computer program product or computer program includes computer instructions, and the computer instructions are stored in a computer-readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to execute the training method of the comment generation model or execute the comment generation method.
In the embodiment of the application, a first sample comment text set in a music scene is obtained, wherein the first sample comment text set comprises a plurality of first sample comment texts and a category label corresponding to each first sample comment text; adding sample comment prompt information to the sample position of each first sample comment text, wherein the sample comment prompt information is used for indicating the comment category of each first sample comment text; inputting each first sample comment text added with sample comment prompt information into a pre-trained comment generation model for category prediction to obtain a prediction category corresponding to each first sample comment text output by the pre-trained comment generation model; performing loss calculation according to the prediction category and the category label corresponding to each first sample comment text to obtain a total loss value; and adjusting the pre-trained comment generation model based on the total loss value to obtain a trained comment generation model, wherein the trained comment generation model is used for generating comment texts under comment categories, so that a comment generation model capable of automatically generating the comment texts can be obtained through training, and the comment texts can be intelligently and quickly generated.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the embodiments or the prior art descriptions will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present application, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1a is a schematic flow chart of a comment generation model training scheme provided by an embodiment of the present application;
fig. 1b is a schematic structural diagram of a comment generating system provided in an embodiment of the present application;
FIG. 2 is a schematic flow chart diagram illustrating a comment generation model training method provided by an embodiment of the present application;
fig. 3 is a schematic flowchart of a comment generation method provided in an embodiment of the present application;
FIG. 4 is a schematic diagram of comment text generation provided by an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a comment generation model training apparatus provided in an embodiment of the present application;
fig. 6 is a schematic structural diagram of a comment text generation apparatus provided in an embodiment of the present application;
fig. 7 is a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The embodiment of the application provides a comment generation model training scheme and a comment generation scheme, as shown in fig. 1a, the comment generation model training scheme may include (1) a pre-training phase and (2) a model fine-tuning phase, and the comment generation scheme may include (3) an inference or application phase:
(1) A pre-training stage: a large number of music comment texts (for example, one million music comment texts) are obtained, and an initial comment generation model (a language model) is pre-trained with them, so that the pre-trained comment generation model absorbs the information contained in the large number of music comment texts.
(2) A fine-tuning stage: the pre-trained comment generation model is fine-tuned. Specifically, some music comment texts can be selected as a carefully chosen music comment text data set (hereinafter referred to as the first sample comment text set), where being carefully chosen means selecting, from the large number of music comment texts, those that satisfy any one or more of conciseness, generality and gracefulness. Conciseness refers to a constraint on the text length of the music comment text, for example that the text length is less than or equal to a length threshold. Generality refers to a requirement on the comment content: the music comment text should be strongly related to the commented music and able to summarize its characteristics. Gracefulness refers to a requirement on expression: the music comment text should describe the music in a distinctive way.
Then, different sample comment prompt messages are added to the selected music comment texts to divide them into positive examples and negative examples, so that the selected music comment texts can be distinguished from one another through the sample comment prompt messages. Next, category labelling is performed on the music comment texts to which the different sample comment prompt messages were added, so as to obtain a category label for each music comment text (for example, category label 1 indicates that the carefully chosen music comment text is a positive comment, and category label 0 indicates that it is a negative comment); through category labelling, the carefully chosen music comment texts are turned into binary classification data.
Further, the pre-trained comment generation model is adjusted (i.e., fine-tuned) according to the music comment text added with different sample comment prompt information and the corresponding category label, so that the trained comment generation model is obtained. The pre-trained comment generation model is trained through the selected music comment text, so that the trained comment generation model generates the comment text with the characteristics of simplicity, generality, elegance and the like.
(3) An inference or application stage: when a positive-example comment text is to be generated, positive-example comment prompt information can be given; in this case the comment prompt information indicates that the comment category of the generated comment text is the positive example (that is, a positive or favourable comment text is generated). When a negative-example comment text is to be generated, negative-example comment prompt information can be given; in this case the comment prompt information indicates that the comment category of the generated comment text is the negative example (that is, a negative or unfavourable comment text is generated). Specifically, the comment prompt information is input into the trained comment generation model, candidate comment texts under the target comment category output by the trained comment generation model are obtained, and the target comment text is determined according to the candidate comment texts under the target comment category.
The scheme provided by the embodiments of the application can generate comment texts about a certain song or a certain singer under a certain comment category. Specifically, a target music identifier such as a song title or a singer name can be appended as a suffix after the comment prompt information, where the comment prompt information indicates the target comment category of the target comment text to be generated. Then, the comment prompt information and the corresponding target music identifier are input into the trained comment generation model, and candidate comment texts under the target comment category output by the trained comment generation model are obtained; each candidate comment text may include music information corresponding to the target music identifier. As one implementation, a candidate comment text under the target comment category may be directly determined as the target comment text. As another implementation, when there are multiple candidate comment texts, the candidate comment texts output by the trained comment generation model may be screened to obtain the target comment text under the target comment category. Specifically, a trained evaluation model is introduced: after the trained comment generation model outputs the candidate comment texts, the trained evaluation model scores the gracefulness of each candidate comment text, and the candidate comment text with the highest evaluation score is taken as the target comment text, which improves the quality of the generated comment text. The trained evaluation model can be a classification model; when training the evaluation model, sample data and the labelling information corresponding to the sample data (for example, graceful / not graceful) are acquired and used to train the evaluation model, so as to obtain the trained evaluation model. It should be understood that, in the following description of the embodiments of the application, comment texts related to music may simply be referred to as comment texts.
The target comment text of a song or a singer under the target comment category can serve as reference information when users choose songs. For example, the trained comment generation model can generate comment texts of a song under the negative category, such as "the rhythm is too strong" or "the volume is too loud", so that objects who prefer quiet music can filter the song out. As another example, the trained comment generation model can generate comment texts under the positive category, such as "the rhythm is gentle" or "the lyrics are beautiful", so that some objects may select the song to enjoy.
Through the comment generation scheme, the embodiment of the application has the following beneficial effects: (1) A trained comment generation model can be obtained through the pre-training stage and the fine-tuning stage in a trainable mode, and the trained comment generation model can automatically and conveniently generate a comment text in a certain comment category, so that the comment text generation efficiency is improved; (2) Text comments (such as comment texts about XX) which meet requirements and are associated with the target music identification can be flexibly controlled and generated through the target music identification and the comment prompt information, and the problems that topics involved in the music comment texts are relatively complicated and many comment texts are irrelevant to the current music are solved; (3) The generated candidate comment texts are screened according to the grace degree through the evaluation model, so that the quality of the generated comment texts can be improved to a certain extent, and more objects are attracted to pay attention to the music corresponding to the comment texts.
The comment generation system provided in the embodiments of the application is explained in detail next. Please refer to fig. 1b, which is a schematic structural diagram of a comment generation system provided in an embodiment of the present application. The comment generation system may include a terminal device 101 and a server 102. The number of terminal devices is not limited in the present application, and the number of servers may likewise be more than one and is also not limited. The terminal device 101 and the server 102 in the comment generation system may be connected directly or indirectly by wired or wireless communication. Wherein:
various music clients, such as music APPs (applications) that run independently or music subroutines (commonly called applets) in various APPs, run in the terminal device 101; the object can log in a music client, and can listen to songs, comment and share a certain song in the music client, and the like. The terminal device 101 may include, but is not limited to, a mobile terminal (e.g., a smart phone), a tablet computer, a laptop computer, a desktop computer, a smart speaker, a smart watch, a vehicle-mounted terminal, a smart wearable device, an aircraft, a smart appliance, and so on.
The server 102, which may correspond to a client, provides technical support for a service provided by the client. The server 102 may store song names of various songs, names of corresponding singers, comment texts for the songs or the singers, and the like, which is not limited in this application. The server 102 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a Network service, cloud communication, a middleware service, a domain name service, a security service, a CDN (Content Delivery Network), a big data and artificial intelligence platform, and the like.
In one embodiment, taking the terminal device 101 and the server 102 as an example to describe the comment generation flow, the comment generation flow may be:
(1) When an object (such as a listener) logs in to the music client in the terminal device 101, the music client may send a comment acquisition request to the server 102, and the server 102 may acquire the comment prompt information and the target music identifier according to the comment acquisition request, where the comment prompt information is used to indicate the target comment category of the target comment text to be generated; for example, the target music identifier is the song "XX", and the comment prompt information indicates that the target comment category of the target comment text to be generated is "positive comment".
(2) The server 102 inputs the comment prompt information and the target music identifier into the trained comment generation model to obtain one or more candidate comment texts under the target comment category output by the trained comment generation model, and determines the target comment text from the one or more candidate comment texts under the target comment category. For example, the song "XX" and the comment prompt information are input into the trained comment generation model to obtain one or more candidate comment texts under the positive comment category, such as "the melody of song XX is distinctive" and "the lyrics of song XX are graceful"; then, a target comment text is determined from these candidate comment texts, for example "the lyrics of song XX are graceful".
(3) The server 102 sends the target comment text to the music client in the terminal device 101, and after receiving it, the music client can display the target comment text at a target position of the music client so that it can be viewed by the object. For example, the target comment text "the lyrics of song XX are graceful" is displayed at the target position (the middle or right of the client's home page, or a floating window), and the object can view the comment text at the target position and decide whether to listen to song "XX".
It should be understood that the server 102 may input the comment prompt information and the target music identifier into the trained comment generating model in advance, generate and store the comment text through the trained comment generating model, and when receiving a comment acquisition request sent by the music client, may directly acquire the corresponding comment text and return the comment text to the music client, which may facilitate the music client to quickly acquire the comment text.
In one embodiment, the comment generation scheme provided by the embodiment of the present application can be applied to the following scenarios:
(1) A target comment text under a target comment category is generated for a song or a singer according to the comment generation scheme; the target comment text comprises music information corresponding to the song or the singer and can be used to describe their characteristics. The comment text is then displayed in the music client to recommend the song or singer to the object and help the object quickly understand those characteristics, so that the object can choose, according to the target comment text, whether to listen to the song or to the singer's songs. For example, if an object has listened to the song "XXX" in the music client, a comment text under the positive category (i.e., the target comment category), such as "the lyrics of 'XXX' are graceful", may be generated for the song "XXX" and displayed in the music client to recommend it; the object can quickly learn the characteristic of the song "XXX" (graceful lyrics) and then choose whether to listen to it.
(2) Songs, or songs of singers, that different objects frequently listen to in the music client are obtained, and target comment texts under the target comment category are then generated according to the comment generation scheme; each target comment text comprises music information corresponding to the song or the singer. The target comment texts are displayed in the music client, so that any object can view them in the music client and directly search for the song or the singer through the target comment text. For example, if different objects all listen to songs of the singer "Xiao Hong" in the music client, a target comment text under the target comment category may be generated for the singer "Xiao Hong", such as "singer Xiao Hong sings expressively and the melodies are uplifting"; this target comment text is displayed in the music client, and any object can trigger a search for "Xiao Hong" through it and listen to the corresponding songs.
Through the above comment generation scheme, the embodiment of the application has the following beneficial effects: (1) A trained comment generation model can be obtained through the pre-training stage and the fine-tuning stage in a trainable mode, the trained comment generation model can automatically and conveniently generate comment texts under a certain comment category, and the efficiency of generating music comment texts is improved; (2) Text comments (such as comment texts about XX) which meet requirements and are associated with the target music identification can be flexibly controlled and generated through the target music identification and the comment prompt information, and the problems that topics involved in the music comment texts are relatively complicated and many comment texts are irrelevant to the current music are solved; (3) The generated candidate music comment texts are screened according to the grace degree through the evaluation model, so that the quality of the generated comment texts can be improved to a certain extent, and more objects are attracted to pay attention to the music corresponding to the comment texts.
Referring to fig. 2, fig. 2 is a schematic flowchart of a comment generation model training method according to an embodiment of the present application. The training method of the comment generating model may be executed by a computer device, and may include the following steps S201 to S205:
s201, a first sample comment text set in a music scene is obtained, wherein the first sample comment text set comprises one or more first sample comment texts and a category label corresponding to each first sample comment text. Wherein the first sample comment text may be a music comment text for a certain song or singer. The category label may be used to indicate whether the first sample comment text is a positive example (i.e., positive comment) or a negative example (i.e., negative comment). When the first sample comment text is a positive example, the category tag may be set to 1; when the first sample comment text is a counterexample, the category tag may be set to 0. It should be understood that the category labels may be set as desired, and the present application is not limited thereto.
In one embodiment, a computer device may obtain a first sample comment text set related to music from a device dedicated to storing music comment data; alternatively, the computer device obtains the first sample comment text set related to music directly from the internet.
S202, adding sample comment prompt information at the sample position of each first sample comment text, wherein the sample comment prompt information is used for indicating the comment category of each first sample comment text.
The sample position may be the start position of each first sample comment text. For example, if the first sample comment text is "this song is good", the sample comment prompt information is added in front of it, giving "good [SEP] this song is good". As another example, if the first sample comment text is "the lyrics are good", the prompt information is added in front of it, giving "good [SEP] the lyrics are good". It should be understood that the sample position may be set as required, and the present application is not limited thereto.
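As a minimal illustrative sketch (not part of the claimed method), the following Python snippet shows one way the sample comment prompt information could be prepended at the sample position; the prompt words "good"/"bad", the "[SEP]" separator and the function name follow the examples above but are otherwise assumptions.

```python
def add_prompt(comment_text: str, category_label: int) -> str:
    """Prepend sample comment prompt information to a first sample comment text.

    category_label: 1 = positive example, 0 = negative example,
    mirroring the category labels described above.
    """
    prompt = "good" if category_label == 1 else "bad"
    # Sample position = start of the text; "[SEP]" separates prompt and comment.
    return f"{prompt} [SEP] {comment_text}"

print(add_prompt("this song is good", 1))          # good [SEP] this song is good
print(add_prompt("the rhythm is too strong", 0))   # bad [SEP] the rhythm is too strong
```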
S203, inputting each first sample comment text added with the sample comment prompt information into a pre-trained comment generating model for class prediction to obtain a prediction class corresponding to each first sample comment text output by the pre-trained comment generating model.
The pre-trained comment generation model is an autoregressive model, for example GPT (Generative Pre-Training model), GPT-2 (the 2nd version of the Generative Pre-Training model, a Transformer-based decoder), GPT-3 (the 3rd version of the Generative Pre-Training model), LaMDA (Language Model for Dialogue Applications), and the like. The pre-trained comment generation model comprises a decoding module (i.e., a Transformer-based decoder) and a linear layer. The decoding module processes the sample comment words in each first sample comment text to which the sample comment prompt information has been added to obtain a decoding result for each first sample comment text, and the linear layer then performs category prediction on the decoding result to obtain the prediction category corresponding to each first sample comment text.
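A minimal PyTorch sketch of such a decoder-plus-linear-layer architecture is given below for illustration only; the hyperparameters, class name and choice of torch.nn building blocks are assumptions, and the position encoding is omitted here (see the descending position coding discussed later).

```python
import torch
import torch.nn as nn

class CommentGenerationModel(nn.Module):
    """GPT-style sketch: a Transformer-based decoding module plus a linear layer.

    The decoder produces hidden states for every position; one linear head
    predicts the next word, another maps the final hidden state to the two
    comment categories (positive example / negative example)."""

    def __init__(self, vocab_size=30000, d_model=256, n_heads=4, n_layers=4, n_classes=2):
        super().__init__()
        self.token_emb = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        # A decoder-only (GPT-style) stack is an encoder stack run with a causal mask.
        self.decoder = nn.TransformerEncoder(layer, n_layers)
        self.lm_head = nn.Linear(d_model, vocab_size)   # next-word prediction
        self.cls_head = nn.Linear(d_model, n_classes)   # category prediction

    def forward(self, token_ids):                       # token_ids: (batch, seq_len)
        seq_len = token_ids.size(1)
        causal_mask = nn.Transformer.generate_square_subsequent_mask(seq_len)
        h = self.decoder(self.token_emb(token_ids), mask=causal_mask)
        word_logits = self.lm_head(h)                   # (batch, seq_len, vocab)
        class_logits = self.cls_head(h[:, -1, :])       # (batch, n_classes)
        return word_logits, class_logits
```

The linear classification head reads the hidden state of the last position, which under the causal mask summarizes the whole prompt-plus-comment sequence, and maps it to the two comment categories.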
In one embodiment, the initial comment generating model may be pre-trained to obtain a pre-trained comment generating model, and at this time, the initial comment generating model is trained by unsupervised learning to obtain a pre-trained comment generating model, and the pre-trained comment generating model has certain music knowledge and the ability to generate a general comment text. Specifically, a second sample comment text set in a music scene is obtained, wherein the second sample comment text set comprises one or more second sample comment texts; inputting each second sample comment text into an initial comment generation model, and predicting sample comment words at each position in each second sample comment text by the initial comment generation model to obtain a probability value of the sample comment words corresponding to each position in each second sample comment text output by the initial comment generation model; performing model loss calculation according to the probability value of the sample comment words corresponding to each position to obtain model loss; and optimizing the comment generation model based on the model loss to obtain a pre-trained comment generation model.
The number of second sample comment texts in the second sample comment text set may be, for example, one million or two hundred thousand. Training the initial comment generation model with a large number of second sample comment texts enables the model to learn large-scale music comment features, so that the pre-trained comment generation model can subsequently generate comment texts in a music style.
The probability values of the sample comment words corresponding to the positions can be subjected to model loss calculation by using a first model loss function, so that model loss values are obtained. The first model loss function is as follows:

$$\mathrm{Loss1}(W)=\sum_{i}\log P\left(w_{i}\mid w_{i-k},\ldots,w_{i-1};\theta\right)$$

wherein Loss1(W) represents the model loss value, W represents a second sample comment text, k represents the window size of the initial comment generation model, θ represents the model parameters of the initial comment generation model, w_i represents the sample comment word at the i-th position in the second sample comment text, w_{i-k}, ..., w_{i-1} represent the sample comment words at the k positions preceding the current (i-th) position, and P(w_i | w_{i-k}, ..., w_{i-1}; θ) represents the probability value of predicting the sample comment word w_i at the current position based on the sample comment words at the (i−k)-th to (i−1)-th positions.
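For illustration, a PyTorch sketch of this language-modelling loss is given below; it assumes word_logits are produced by a causally masked decoder such as the sketch above, and minimizes the negative of the summed log-probabilities, which is equivalent to maximizing Loss1(W).

```python
import torch
import torch.nn.functional as F

def pretraining_loss(word_logits, token_ids):
    """Next-word prediction loss corresponding to Loss1(W).

    word_logits: (batch, seq_len, vocab) scores for the word at each position,
                 predicted from the words before it (causal mask).
    token_ids:   (batch, seq_len) the actual sample comment words.
    """
    # Position i is predicted from positions < i, so targets are shifted by one.
    pred = word_logits[:, :-1, :].reshape(-1, word_logits.size(-1))
    target = token_ids[:, 1:].reshape(-1)
    # Cross-entropy = -log P(w_i | w_{i-k}, ..., w_{i-1}; theta), summed over positions.
    return F.cross_entropy(pred, target, reduction="sum")
```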
In one embodiment, a computer device may obtain a second sample set of comment text in a music scene from a device dedicated to storing music comment data; or the computer device may obtain the second sample set of comment text in the music scene directly from the internet. It should be understood that the second sample comment text in the second sample comment text set may be any comment regarding music.
The first sample comment text set may be a sample comment text set that meets requirements (e.g., meets conciseness and generality) obtained from the second sample comment text set. Of course, the first sample comment text set and the second sample comment text set may also be obtained from different locations, which is not limited in this application.
And S204, performing loss calculation according to the prediction category and the category label corresponding to each first sample comment text to obtain a total loss value.
In specific implementation, a second model loss function can be used to perform loss calculation on the prediction category and the category label corresponding to each first sample comment text, so as to obtain the loss value. The second model loss function is as follows:

$$\mathrm{Loss2}(C)=\sum_{(x,y)}\log P\left(y\mid x_{1},\ldots,x_{m}\right)$$

wherein Loss2(C) represents the total loss value, C represents the set of first sample comment texts, y represents the category label, and x_1, ..., x_m represent the m sample comment words of a first sample comment text (i.e., the input sequence).
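A corresponding sketch of this category loss, again assuming class_logits come from the linear layer of the model sketch above; minimizing the cross-entropy is equivalent to maximizing Loss2(C).

```python
import torch.nn.functional as F

def finetuning_loss(class_logits, category_labels):
    """Loss corresponding to Loss2(C): log-likelihood of the category label y
    given the first sample comment text (with prompt) x_1, ..., x_m.

    class_logits:    (batch, 2) output of the linear layer.
    category_labels: (batch,)   1 = positive example, 0 = negative example.
    """
    return F.cross_entropy(class_logits, category_labels, reduction="sum")
```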
In some embodiments, performing loss calculation according to the prediction category and the category label corresponding to each first sample comment text, and obtaining the loss value may also be implemented by: the output of the pre-trained comment generation model can also comprise probability values of comment words corresponding to all positions in each first sample comment text besides the category label of each first sample comment text, loss calculation is carried out according to the probability values of the sample comment words corresponding to all the positions to obtain a first loss value, loss calculation is carried out according to the prediction category and the category label corresponding to each first sample comment text to obtain a second loss value, and a total loss value is determined according to the first loss value and the second loss value.
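Reusing the two helper functions sketched above, the combined objective could look as follows; the weighting factor is an assumption, since the embodiment only states that the total loss value is determined from the first and the second loss value.

```python
def total_loss(word_logits, token_ids, class_logits, category_labels, weight=0.5):
    """Total loss = first (language-modelling) loss + weighted second (category) loss."""
    first_loss = pretraining_loss(word_logits, token_ids)          # word probabilities per position
    second_loss = finetuning_loss(class_logits, category_labels)   # prediction category vs. label
    return first_loss + weight * second_loss
```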
S205, adjusting the pre-trained comment generation model based on the total loss value to obtain a trained comment generation model, wherein the trained comment generation model is used for generating comment texts under comment categories.
In specific implementation, based on the total loss value, the pre-trained comment generation model can be subjected to fine tuning processing to obtain the trained comment generation model. Wherein the fine tuning may be: and adjusting part of network parameters in the pre-trained comment generation model, or adjusting all network parameters in the pre-trained comment generation model.
The text length of the generated comment text can be controlled from the perspective of the model aiming at the conciseness of the comment text. Therefore, the sample comment texts can be subjected to descending position coding during model training so that the text length of the output comment texts is smaller than or equal to the length threshold. The embodiment of the application can adopt descending position coding in a pre-training stage (the process of obtaining a pre-trained comment generation model) or in the process of obtaining a trained comment generation model; since the structures of the pre-trained comment generation model and the trained comment generation model are the same, the training of the descending position codes of the sample comment texts is similar, and therefore the explanation is carried out by the descending position codes in the pre-training stage.
In an embodiment, the specific implementation manner of predicting the sample comment words at each position in each sample comment text by the initial comment generating model to obtain the probability value of the sample comment words corresponding to each position in each second sample comment text output by the initial comment generating model may be: extracting embedded vectors of sample comment words corresponding to all positions in each second sample comment text by the initial comment generation model, and sequentially performing descending position coding processing on all positions in each second sample comment text to obtain position vectors corresponding to all positions in each second sample comment text; and then, respectively performing mask attention analysis on each position in each second sample comment text and the embedded vector of the sample comment word corresponding to each position by a decoding module in the initial comment generation model to obtain the probability value of the sample comment word corresponding to each position in each second sample comment text.
The descending position coding is used to control the text length of the generated comment text to be less than or equal to the length threshold. The specific implementation of applying descending position coding to each position in each second sample comment text is as follows: the positions in each second sample comment text are coded in descending order from left to right, yielding the position vector corresponding to each position. For example, if the second sample comment text is "Xiao Hong sings really well", the positions are coded in descending order from left to right: the first position is coded as 1.0, the second as 0.9, the third as 0.8, and so on, with the last position coded as 0.2. As another example, for a longer second sample comment text such as "Xiao Hong sings really well and is simply great", the positions are again coded in descending order from left to right (1.0, 0.9, 0.8, ...), so that the positions towards the end of the text receive codes close to 0; in this way, the longer the text, the smaller the position codes at its tail, which constrains the model to generate shorter comment texts.
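A small Python sketch of descending position coding, following the 1.0, 0.9, 0.8, ... example above; the start value, step size and the flooring at 0 are assumptions.

```python
def descending_position_codes(num_positions: int, start: float = 1.0, step: float = 0.1):
    """Descending position coding: codes decrease from left to right, so the
    positions at the tail of a long text receive codes close to 0, which
    discourages the model from generating overly long comment texts."""
    return [round(max(start - i * step, 0.0), 2) for i in range(num_positions)]

print(descending_position_codes(8))
# [1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3]
```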
The mask attention analysis means that, when predicting the comment word at a certain position, only the comment words at the positions before that position are attended to. Taking a target position in each second sample comment text as an example, the decoding module in the initial comment generation model performs mask attention analysis on each position and on the embedded vector of the sample comment word corresponding to each position, so as to obtain the probability value of the sample comment word corresponding to each position in the target second sample comment text. This may be implemented as follows: the decoding module in the initial comment generation model performs mask attention analysis on the embedded vectors of the sample comment words at the positions before the target position to obtain the weight value of the sample comment word corresponding to the target position, and performs mask attention analysis on the position vector of the target position to obtain the position weight value of the target position; finally, the linear layer in the initial comment generation model processes the weight value of the sample comment word corresponding to the target position and the position weight value of the target position to obtain the probability value of the sample comment word corresponding to the target position.
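For illustration, the mask used by such mask attention analysis can be built as below (a standard causal mask, equivalent to the generate_square_subsequent_mask call in the model sketch above); positions after the current one receive -inf so that their attention weights become 0 after the softmax.

```python
import torch

def causal_attention_mask(seq_len: int) -> torch.Tensor:
    """Each position may attend only to itself and to the positions before it."""
    mask = torch.full((seq_len, seq_len), float("-inf"))
    return torch.triu(mask, diagonal=1)   # 0 on and below the diagonal, -inf above

print(causal_attention_mask(4))
```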
In one embodiment, adjusting the pre-trained comment generating model based on the total loss value, and obtaining the trained comment generating model may include: and adjusting the pre-trained comment generation model based on the total loss value to obtain an intermediate comment generation model, calling the intermediate comment generation model to generate a comment text according to the sample comment prompt information, and adding the generated comment text into the first sample comment text set. The method comprises the steps of generating multiple comment texts under the same data distribution through an intermediate comment generation model to achieve data set enhancement, training the intermediate comment generation model through a first sample comment set added with the comment texts to obtain a trained comment generation model, and training the model through the data enhancement to obtain the model with better effect. The training process of the intermediate comment generation model may refer to the training process of the pre-trained comment generation model, and is not described herein again.
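A sketch of this data-set enhancement step; the generate interface of the intermediate comment generation model is an assumed placeholder, since the embodiment does not prescribe an API.

```python
def augment_dataset(intermediate_model, prompts=("good [SEP]", "bad [SEP]"), per_prompt=100):
    """Generate extra comment texts with the intermediate comment generation model
    under each sample comment prompt and return them with the matching category
    label, so they can be added to the first sample comment text set."""
    augmented = []
    for prompt in prompts:
        label = 1 if prompt.startswith("good") else 0
        for text in intermediate_model.generate(prompt, num_return=per_prompt):  # assumed interface
            augmented.append((text, label))
    return augmented
```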
The initial comment generation model, the pre-trained comment generation model, and the trained comment generation model have the same model architecture.
In the embodiment of the application, a first sample comment text set in a music scene is obtained, wherein the first sample comment text set comprises one or more first sample comment texts and a category label corresponding to each first sample comment text; adding sample comment prompt information to the sample position of each first sample comment text, wherein the sample comment prompt information is used for indicating the comment category of each first sample comment text; inputting each first sample comment text added with sample comment prompt information into a pre-trained comment generation model for category prediction to obtain a prediction category corresponding to each first sample comment text output by the pre-trained comment generation model; performing loss calculation according to the prediction category and the category label corresponding to each first sample comment text to obtain a total loss value; and adjusting the pre-trained comment generation model based on the total loss value to obtain a trained comment generation model, wherein the trained comment generation model is used for generating a comment text, so that the comment text can be generated, and the comment text generation efficiency is improved.
Based on the comment generation scheme, please refer to fig. 3, and fig. 3 is a schematic flow diagram of a comment generation method provided in the embodiment of the present application. The comment generating method may be executed by a computer device, which may be the above-described server or terminal device. The comment generating method described in this embodiment may include the following steps S301 to S303:
s301, obtaining a target music identification and comment prompt information corresponding to the target music identification, wherein the comment prompt information is used for indicating the target comment category of the generated comment text. Wherein, the target music identification can be singer name, song title, lyrics, etc.; the target comment category may include a forward (positive) comment category and a reverse (negative) comment category, where a positive comment refers to a comment that is forward of a target music identification, for example, the target music identification is a song name "XX", and the comment text corresponding to the forward comment category may be: "XX" lyrics good, etc.; the reverse comment is a comment on the reverse side of the target music identifier, for example, the target music identifier is song name "XX", and the comment text corresponding to the reverse comment category may be: too strong of an "XX" rhythm, too high of an "XX" tone, etc. In some alternative embodiments, the comment categories may also include music style categories, such as balladry, rock, and the like.
In one embodiment, the comment prompt message may be a combination of a target comment category and a special symbol, and the special symbol may be set as needed. For example, the special symbol is a separation prompt SEP, and the comment prompt information may be "good [ SEP ]", "bad [ SEP ]"; at this time, the comment prompt information "good [ SEP ]" indicates that the target comment category of the generated comment text is the forward comment category, and "bad [ SEP ]" indicates that the target comment category of the generated comment text is the reverse comment category.
In another embodiment, the comment prompt message in the embodiment of the present application may be a special symbol, and the special symbol may be used to indicate a target comment category of the generated comment text, for example, a special symbol "a" is used to indicate that the target comment category of the generated comment text is a forward comment category, and a special symbol "B" is used to indicate that the target comment category of the generated comment text is a reverse comment category.
The method for acquiring the target music identifier may be as follows: (1) historical song listening information of one or more objects (namely, the song listeners) is acquired, and a target music identification is determined from the historical song listening information. Specifically, the historical song listening information includes the frequency of listening to different music by the object, and the music identifier of the music corresponding to the frequency greater than the frequency threshold value can be used as the target music identifier. (2) And providing a text comment generating interface, wherein the text comment generating interface comprises a music identification input box, and a target music identification corresponding to a comment text to be generated can be directly input into the music identification input box.
S302, inputting the comment prompt information and the target music identification into the trained comment generation model, and obtaining candidate comment texts output by the trained comment generation model and under the target comment category. Wherein, the number of candidate comment texts may be one or more. Each candidate comment text may include music information corresponding to the target music identification. The music information may include information related to music corresponding to the target music identifier, such as the target music identifier being a song name, and the music information may include an introduction of a song corresponding to the song name. The trained comment generating model may be a comment generating model trained by the above-described comment generating model training method.
In an embodiment, a specific implementation of inputting the comment prompt information and the target music identifier into the trained comment generation model and obtaining the candidate comment text under the target comment category output by the trained comment generation model may be as follows: the target comment category included in the comment prompt information and the target music identifier are input into the trained comment generation model, and the comment word at the first position is predicted to obtain the comment word corresponding to the first position under the target comment category; then, the comment word at the second position is predicted based on the comment word corresponding to the first position and the target music identifier to obtain the comment word corresponding to the second position under the target comment category, where the second position is the next position adjacent to the first position; and the comment words corresponding to the first position and the second position are combined to obtain the candidate comment text under the target comment category output by the trained comment generation model. For example, as shown in fig. 4, the comment prompt information is "good [SEP]", indicating that the comment category of the generated comment text is the positive comment category, and the target music identifier is the song "Small"; the song name "Small" is appended after the comment prompt information "good [SEP]", giving "good [SEP] Small". Then, "good [SEP] Small" is input into the trained comment generation model to predict the comment word at the first position; as shown in fig. 4, the comment word obtained for the first position is "really". The comment word "really" is appended after the target music identifier "Small", giving "good [SEP] Small [SEP] really", which is then input into the trained comment generation model to predict the comment word at the second position, obtaining the comment word "good"; a candidate comment text under the positive comment category, ""Small" is really good", is then generated from the comment words corresponding to the first and second positions. When the trained comment generation model is applied to generate a comment text for a given song or singer, an additional separator symbol needs to be added when the comment prompt information and the target music identifier are input into the trained comment generation model; this reduces the influence of the song or singer name itself on the trained comment generation model, so that the model can generate candidate comment texts for the song or singer more fluently.
It should be understood that when a candidate comment text is generated, when the text length of the candidate comment text is long, the comment word corresponding to the second position can be continuously added to the position next to the comment word corresponding to the first position, the comment word corresponding to the third position is continuously input to the trained comment generation model to predict the comment word corresponding to the third position, and by analogy, comment words corresponding to various positions can be finally generated, and the comment words corresponding to various positions are combined to obtain the candidate comment text in the target comment category.
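The word-by-word generation described above could be sketched as follows; the model and tokenizer interfaces, the "[EOS]" end marker and the maximum length are assumptions, and sample_top_k is the sampling helper sketched after the next paragraph.

```python
def generate_candidate_comment(model, tokenizer, prompt="good [SEP]",
                               music_id="XX song", max_len=20):
    """Generate one candidate comment text word by word under the target comment
    category indicated by the comment prompt information."""
    context = f"{prompt} {music_id} [SEP]"          # e.g. "good [SEP] XX song [SEP]"
    words = []
    for _ in range(max_len):                        # stop at an assumed length limit
        word_probs = model.predict_next_word_probs(tokenizer(context))  # assumed interface
        next_word = sample_top_k(word_probs, k=5)
        if next_word == "[EOS]":                    # assumed end-of-text marker
            break
        words.append(next_word)
        context += " " + next_word                  # feed the predicted word back in
    return " ".join(words)                          # the candidate comment text
```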
A specific implementation of inputting the target comment category included in the comment prompt information and the target music identifier into the trained comment generation model and predicting the comment word at the first position to obtain the comment word corresponding to the first position under the target comment category may be as follows: the comment prompt information and the target music identifier are input into the trained comment generation model, the comment word at the first position is predicted to obtain one or more candidate comment words corresponding to the first position together with the probability of each candidate comment word, and one candidate comment word is selected as the comment word corresponding to the first position according to the probability of each candidate comment word. As one implementation, a topK sampling algorithm may be used to determine the comment word corresponding to the first position from the candidate comment words: the candidate comment words are ranked from high to low probability to obtain a ranking result, the first K candidate comment words are selected from the ranking result, and one comment word is randomly sampled from these K candidates as the comment word corresponding to the first position, where K is an integer greater than or equal to 1 and is set as required. In some optional embodiments, the comment word with the highest probability among the first K candidate comment words may also be selected as the comment word corresponding to the first position. As another implementation, the comment word with the highest probability among the one or more candidate comment words may simply be selected as the comment word corresponding to the first position.
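A minimal sketch of the topK sampling step; here the draw from the first K candidate comment words is weighted by probability, whereas the embodiment leaves the exact random-sampling rule open, so the weighting is an assumption.

```python
import random

def sample_top_k(word_probs: dict, k: int = 5) -> str:
    """Rank candidate comment words by probability, keep the first K, and
    randomly sample one of them as the comment word for the current position."""
    top_k = sorted(word_probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    words, probs = zip(*top_k)
    return random.choices(words, weights=probs, k=1)[0]

print(sample_top_k({"good": 0.5, "nice": 0.3, "loud": 0.1, "odd": 0.05, "flat": 0.05}, k=3))
```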
It should be understood that the comment word corresponding to the second position may be determined in the same manner as the comment word corresponding to the first position, which is not described again here. When there are multiple comment words corresponding to the first position and multiple comment words corresponding to the second position, the trained comment generation model outputs multiple candidate comment texts.
And S303, obtaining a target comment text according to the candidate comment text output by the trained comment generation model, wherein the target comment text comprises music information corresponding to the target music identifier. The target comment text in the target comment category can be understood as a comment text having any one or more of the characteristics of conciseness, generality, beauty and uniqueness.

The computer device may determine the target comment text in the target comment category from the one or more candidate comment texts output by the trained comment generation model. The target comment text in the target comment category may be determined from the one or more candidate comment texts output by the trained comment generation model in any of the following ways:
(1) One or more candidate comment texts output by the trained comment generation model can be used as target comment texts.
(2) One candidate comment text can be randomly selected from one or more candidate comment texts output by the trained comment generation model as a target comment text.
(3) The one or more candidate comment texts output by the trained comment generation model can be screened according to the text length of the candidate comment texts, the evaluation scores of the candidate comment texts, and so on, so as to determine the most appropriate target comment text. In one implementation, the text length of each candidate comment text is determined, the text length of each candidate comment text is compared with a length threshold, and a candidate comment text whose text length is less than or equal to the length threshold is determined as the target comment text in the target comment category. The length threshold may be set to, for example, 6, 7, 15 or 20 characters; it may be set according to requirements or according to the screen size of the computer device. For example, if the screen can display 15 characters, the length threshold may be set to 15 characters, which is not limited in the embodiments of the present application. By setting the length threshold, the displayable area of the screen is fully utilized, the conciseness of the comment text is improved while the comment text is displayed well on the screen, and the length of the comment text can be controlled. This avoids the problems caused by comment texts that contain only one or two words or that run to two or three hundred characters: overly short comments carry little useful information, while overly long comments make it hard to extract the useful text.
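The length screening described above can be sketched as follows; the helper name, threshold value and example candidates are illustrative assumptions.

```python
from typing import List

def filter_by_length(candidates: List[str], length_threshold: int = 15) -> List[str]:
    # Keep only candidate comment texts whose length does not exceed the threshold.
    return [text for text in candidates if len(text) <= length_threshold]

candidates = ["a short catchy comment",
              "a very long rambling comment that would overflow the displayable screen area"]
print(filter_by_length(candidates, length_threshold=30))
```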
In another implementation manner, a trained evaluation model can be introduced to evaluate the candidate comment texts, and the target comment text in the target comment category is screened from the one or more candidate comment texts output by the trained comment generation model according to the evaluation result. The trained evaluation model can be used to evaluate dimensions such as the beauty and uniqueness of the candidate comment texts; the trained evaluation model may be a classification model, for example a convolutional neural network (CNN) or a recurrent neural network. Specifically, the computer device may input each candidate comment text into the trained evaluation model and score each candidate comment text to obtain an evaluation score of each candidate comment text output by the trained evaluation model, and then take the candidate comment text with the highest evaluation score as the target comment text in the target comment category; or take the candidate comment texts whose evaluation scores are greater than an evaluation threshold as the target comment texts in the target comment category. The evaluation threshold can be set according to requirements. Determining the comment text by the trained evaluation model alleviates the problems that music comments contain a large number of low-quality words and sentences, are not very attractive, and are difficult to draw users in with, and improves the quality of the generated comment text.
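A hedged sketch of the scoring-and-selection step, assuming the trained evaluation model is a PyTorch module that outputs one score per text and that an `encode` helper (not specified by the patent) turns a text into a model input.

```python
import torch

def pick_best_candidate(eval_model: torch.nn.Module, encode, candidates, score_threshold=None):
    eval_model.eval()
    with torch.no_grad():
        scores = [eval_model(encode(text)).item() for text in candidates]
    if score_threshold is not None:
        # Keep every candidate whose evaluation score exceeds the threshold.
        return [t for t, s in zip(candidates, scores) if s > score_threshold]
    # Otherwise return the single candidate with the highest evaluation score.
    best, _ = max(zip(candidates, scores), key=lambda cs: cs[1])
    return best
```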
The training data used to train the evaluation model differ according to the evaluation dimension. If the trained evaluation model is used to evaluate the beauty of candidate comment texts, the annotation information corresponding to each sample comment text is used to indicate whether that sample comment text is beautiful. The computer device can then use the evaluation model to score the beauty of the sample comment text to obtain an evaluation score of the sample comment text, and adjust the model parameters of the evaluation model according to the evaluation score of the sample comment text and the corresponding annotation information to obtain the trained evaluation model. There are a plurality of sample comment texts.
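As an illustrative sketch only, training such an evaluator could look like the following, assuming a binary "beautiful / not beautiful" label per sample comment text; the architecture, loss and optimiser are assumptions, not the patent's prescription, and `encode` is the same assumed helper as above.

```python
import torch
import torch.nn as nn

def train_eval_model(eval_model: nn.Module, encode, sample_texts, beauty_labels,
                     epochs: int = 3, lr: float = 1e-4) -> nn.Module:
    optimiser = torch.optim.Adam(eval_model.parameters(), lr=lr)
    criterion = nn.BCEWithLogitsLoss()
    eval_model.train()
    for _ in range(epochs):
        for text, label in zip(sample_texts, beauty_labels):
            logit = eval_model(encode(text)).squeeze()  # evaluation score (logit) for the sample text
            loss = criterion(logit, torch.tensor(float(label)))
            optimiser.zero_grad()
            loss.backward()
            optimiser.step()
    return eval_model
```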
In another implementation manner, in order to obtain a comment text of higher quality, in the embodiment of the application the one or more candidate comment texts may be screened according to both the text length and the evaluation score, so as to obtain the target comment text in the target comment category. Specifically, the computer device may determine the text length of each candidate comment text, determine the candidate comment texts whose text length is less than or equal to the length threshold as reference comment texts, then call the trained evaluation model to score the reference comment texts to obtain the evaluation score of each reference comment text, and take the reference comment text with the highest evaluation score as the target comment text in the target comment category.
It should be understood that, when determining a target comment text in a target comment category from one or more candidate comment texts output by a trained comment generation model, the embodiment of the present application may determine the target comment text in the target comment category according to one or more of a text length and an evaluation score. For example, one or more candidate comment texts may be first screened according to the evaluation scores; and then, secondary screening is carried out on the screened candidate comment texts according to the text length to obtain target comment texts in the target comment categories, which is not limited by the application.
In the embodiment of the application, comment prompt information and a target music identifier are obtained, where the comment prompt information is used for indicating the target comment category of the comment text to be generated; the comment prompt information and the target music identifier are input into the trained comment generation model to obtain candidate comment texts output by the trained comment generation model under the target comment category; and a target comment text is obtained according to the candidate comment texts output by the trained comment generation model, where the target comment text includes music information corresponding to the target music identifier. In this way, music comments can be generated conveniently by the trained comment generation model, the efficiency of comment text generation is improved, and comment texts that meet requirements and are associated with the target music identifier (such as comment texts about XX) can be generated in a flexibly controlled manner through the target music identifier and the comment prompt information, which alleviates the problems that the topics involved in music comment texts are scattered and that many comment texts are unrelated to the current music.
Referring to fig. 5, fig. 5 is a schematic structural diagram of a training device for the comment generation model provided in an embodiment of the present application. The training device of the comment generation model may be a computer program (including program code) running in a computer device; for example, the training device of the comment generation model may be application software in the computer device. The training device of the comment generation model may be used to perform some or all of the steps in the method embodiment shown in fig. 2. The training device of the comment generation model comprises the following units:
an obtaining unit 501, configured to obtain a first sample comment text set in a music scene, where the first sample comment text set includes one or more first sample comment texts and a category label corresponding to each first sample comment text;
a processing unit 502, configured to add sample comment prompting information to a sample position of each first sample comment text, where the sample comment prompting information is used to indicate a comment category of each first sample comment text;
the processing unit 502 is further configured to input each first sample comment text to which the sample comment prompt information is added into a pre-trained comment generating model for category prediction, so as to obtain a prediction category corresponding to each first sample comment text output by the pre-trained comment generating model;
the processing unit 502 is further configured to perform loss calculation according to the prediction category and the category label corresponding to each first sample comment text, so as to obtain a total loss value;
the processing unit 502 is further configured to adjust the pre-trained comment generation model based on the total loss value to obtain a trained comment generation model, where the trained comment generation model is used to generate a comment text in a comment category.
In one embodiment, the obtaining unit 501 is further configured to obtain a second sample comment text set in a music scene, where the second sample comment text set includes one or more second sample comment texts;
the processing unit 502 is further configured to input each second sample comment text into an initial comment generation model, and perform sample comment word prediction on each position in each sample comment text by using the initial comment generation model to obtain a probability value of a sample comment word corresponding to each position in each second sample comment text output by the initial comment generation model; performing model loss calculation according to the probability value of the sample comment word corresponding to each position in each second sample comment text to obtain model loss; and optimizing the initial comment generation model based on the model loss to obtain a pre-trained comment generation model.
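The pre-training loss can be sketched as a standard per-position language-model loss; the patent only says "model loss", so the cross-entropy form below is an assumption.

```python
import torch
import torch.nn.functional as F

def pretraining_loss(logits: torch.Tensor, target_ids: torch.Tensor) -> torch.Tensor:
    # logits: (seq_len, vocab_size) predictions for the sample comment word at every position.
    # target_ids: (seq_len,) the actual sample comment words.
    # The loss is the mean negative log probability assigned to the true word at each position.
    return F.cross_entropy(logits, target_ids)
```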
In an embodiment, when the initial comment generating model predicts sample comment words at respective positions in each sample comment text to obtain probability values of the sample comment words corresponding to the respective positions in each second sample comment text output by the initial comment generating model, the processing unit 502 may be specifically configured to:
extracting embedded vectors of sample comment words corresponding to all positions in each second sample comment text by the initial comment generation model, and sequentially performing descending position coding processing on all positions in each second sample comment text to obtain position vectors corresponding to all positions in each second sample comment text; the descending position code is used for controlling the text length of the generated comment text to be less than or equal to a length threshold value;
and respectively performing mask attention analysis on each position in each second sample comment text and the embedded vector of the sample comment word corresponding to each position by a decoding module in the initial comment generation model to obtain a probability value of the sample comment word corresponding to each position in each second sample comment text.
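One way to read "descending position coding" is that positions are numbered from the length budget downward, so the model always sees how much room remains before the length threshold; the sketch below is an assumed realisation of that idea, not the patent's exact scheme.

```python
import torch

def descending_position_ids(seq_len: int, length_threshold: int = 15) -> torch.Tensor:
    start = min(seq_len, length_threshold)
    # First word gets the largest id, later words get smaller ids; ids never go below 0.
    return torch.arange(start, start - seq_len, -1).clamp(min=0)

print(descending_position_ids(6, length_threshold=15))  # tensor([6, 5, 4, 3, 2, 1])
```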
In the embodiment of the application, a first sample comment text set in a music scene is obtained, wherein the first sample comment text set comprises one or more first sample comment texts and a category label corresponding to each first sample comment text; adding sample comment prompt information to the sample position of each first sample comment text, wherein the sample comment prompt information is used for indicating the comment category of each first sample comment text; inputting each first sample comment text added with the sample comment prompt information into a pre-trained comment generation model for class prediction to obtain a prediction class corresponding to each first sample comment text output by the pre-trained comment generation model; performing loss calculation according to the prediction category and the category label corresponding to each first sample comment text to obtain a total loss value; and adjusting the pre-trained comment generation model based on the total loss value to obtain a trained comment generation model, wherein the trained comment generation model is used for generating a comment text, so that a comment generation model capable of automatically generating the comment text can be obtained through training, and the comment text can be generated intelligently and quickly.
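For the fine-tuning stage summarised above, a hedged sketch of the loss calculation over prediction categories and category labels follows; cross-entropy is an assumption, as the patent does not name the loss function.

```python
import torch
import torch.nn.functional as F

def finetuning_loss(predicted_logits: torch.Tensor, category_labels: torch.Tensor) -> torch.Tensor:
    # predicted_logits: (batch, num_categories) prediction categories for the first
    # sample comment texts carrying the sample comment prompt information.
    # category_labels: (batch,) annotated category labels.
    return F.cross_entropy(predicted_logits, category_labels)

print(finetuning_loss(torch.randn(3, 2), torch.tensor([0, 1, 1])))  # total loss over 3 texts, 2 categories
```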
Referring to fig. 6, fig. 6 is a schematic structural diagram of a comment generating apparatus provided in an embodiment of the present application. The comment generating apparatus may be a computer program (including program code) running in a computer device; for example, the comment generating apparatus may be application software in the computer device. The comment generating apparatus may be used to perform some or all of the steps in the method embodiment shown in fig. 3. Referring to fig. 6, the comment generating apparatus includes the following units:
an obtaining unit 601, configured to obtain a target music identifier and comment prompt information corresponding to the target music identifier, where the comment prompt information is used to indicate a target comment category of a generated comment text;
the processing unit 602 is configured to input the comment prompt information and the target music identifier into the trained comment generating model obtained by the comment generating model training device, and obtain a candidate comment text output by the trained comment generating model and in the target comment category;
and obtain a target comment text according to the candidate comment text output by the trained comment generation model, wherein the target comment text comprises music information corresponding to the target music identifier.
In an embodiment, when obtaining the target comment text according to the candidate comment text output by the trained comment generation model, the processing unit 602 may specifically be configured to:
and determining the target comment text from one or more candidate comment texts output by the trained comment generation model.
In an embodiment, when determining the target comment text from the one or more candidate comment texts output by the trained comment generation model, the processing unit 602 may specifically be configured to:
inputting each candidate comment text into a trained evaluation model, and performing scoring processing on each candidate comment text to obtain an evaluation score of each candidate comment text output by the trained evaluation model;
taking the candidate comment text corresponding to the highest evaluation score as the target comment text in the target comment category; or, taking the candidate comment text with the evaluation score larger than the evaluation threshold value as the target comment text.
In an embodiment, when determining the target comment text from the one or more candidate comment texts output by the trained comment generation model, the processing unit 602 may specifically be configured to:
determining a text length of each candidate comment text;
and determining candidate comment texts with text lengths smaller than or equal to a length threshold value as the target comment text.
In one embodiment, the comment prompt message contains a target comment category; when the comment prompt information and the target music identifier are input into the trained comment generating model and the candidate comment text output by the trained comment generating model and under the target comment category is obtained, the processing unit 602 may be specifically configured to:
inputting a target comment category and the target music identification included in the comment prompt information into a trained comment generation model, and predicting a comment word at a first position to obtain a comment word corresponding to the first position in the target comment category;
predicting the comment words at a second position based on the comment words corresponding to the first position and the target music identification to obtain the comment words corresponding to the second position, wherein the second position is the next position adjacent to the first position;
and combining the comment word corresponding to the first position with the comment word corresponding to the second position to obtain a candidate comment text which is output by the trained comment generation model and is in the target comment category.
In the embodiment of the application, comment prompt information and a target music identifier are obtained, where the comment prompt information is used for indicating the comment category of the comment text to be generated for the target music identifier; the comment prompt information and the target music identifier are input into the trained comment generation model to obtain candidate comment texts output by the trained comment generation model under the target comment category; and a target comment text is obtained according to the candidate comment texts output by the trained comment generation model. In this way, comment texts that meet requirements and are associated with the target music identifier (such as comment texts about XX) can be generated in a flexibly controlled manner through the target music identifier and the comment prompt information, which alleviates the problems that the topics involved in music comment texts are scattered and that many comment texts are unrelated to the current music.
Further, an embodiment of the present application also provides a computer device; for its schematic structural diagram, reference may be made to fig. 7. A client runs in the computer device. The computer device may include: a processor 701, an input device 702, an output device 703, and a memory 704, which are connected by a bus. The memory 704 is used to store a computer program comprising program instructions, and the processor 701 is used to execute the program instructions stored in the memory 704.
In the embodiment of the present application, the processor 701 executes the executable program code in the memory 704 to perform the following operations:
the method comprises the steps of obtaining a first sample comment text set in a music scene, wherein the first sample comment text set comprises a plurality of first sample comment texts and a category label corresponding to each first sample comment text;
adding sample comment prompt information at the sample position of each first sample comment text, wherein the sample comment prompt information is used for indicating the comment category of each first sample comment text;
inputting each first sample comment text added with sample comment prompt information into a pre-trained comment generation model for category prediction to obtain a prediction category corresponding to each first sample comment text output by the pre-trained comment generation model;
performing loss calculation according to the prediction category and the category label corresponding to each first sample comment text to obtain a total loss value;
and adjusting the pre-trained comment generation model based on the total loss value to obtain a trained comment generation model, wherein the trained comment generation model is used for generating comment texts under comment categories.
In one embodiment, the processor 701 is further configured to:
acquiring a second sample comment text set in a music scene, wherein the second sample comment text set comprises one or more second sample comment texts;
inputting each second sample comment text into an initial comment generation model, and predicting sample comment words at each position in each sample comment text by the initial comment generation model to obtain a probability value of the sample comment words corresponding to each position in each second sample comment text output by the initial comment generation model;
performing model loss calculation according to the probability value of the sample comment word corresponding to each position in each second sample comment text to obtain model loss;
and optimizing the initial comment generation model based on the model loss to obtain a pre-trained comment generation model.
In an embodiment, when the initial comment generating model predicts the sample comment words at the respective positions in each sample comment text to obtain probability values of the sample comment words corresponding to the respective positions in each second sample comment text output by the initial comment generating model, the processor 701 may be specifically configured to:
extracting embedded vectors of the sample comment words corresponding to all positions in each second sample comment text by the initial comment generation model, and sequentially performing descending position coding processing on all positions in each second sample comment text to obtain position vectors corresponding to all positions in each second sample comment text; the descending position code is used for controlling the text length of the generated comment text to be less than or equal to a length threshold value;
and respectively performing mask attention analysis on each position in each second sample comment text and the embedded vector of the sample comment word corresponding to each position by a decoding module in the initial comment generation model to obtain a probability value of the sample comment word corresponding to each position in each second sample comment text.
In the embodiment of the application, a first sample comment text set in a music scene is obtained, wherein the first sample comment text set comprises one or more first sample comment texts and a category label corresponding to each first sample comment text; adding sample comment prompt information to the sample position of each first sample comment text, wherein the sample comment prompt information is used for indicating the comment category of each first sample comment text; calling a pre-trained comment generation model, and performing category prediction on each first sample comment text added with sample comment prompt information to obtain a prediction category corresponding to each first sample comment text; performing loss calculation according to the prediction category and the category label corresponding to each first sample comment text to obtain a total loss value; and adjusting the pre-trained comment generation model based on the total loss value to obtain a trained comment generation model, wherein the trained comment generation model is used for generating a comment text, so that a comment generation model capable of automatically generating the comment text can be obtained through training, and the comment text can be generated intelligently and quickly.
Optionally, an embodiment of the present application further provides a schematic structural diagram of a computer device, and the schematic structural diagram of the computer device may refer to fig. 7, in the embodiment of the present application, the processor 701 executes the executable program code in the memory 704, and performs the following operations:
obtaining a target music identification and comment prompt information corresponding to the target music identification, wherein the comment prompt information is used for indicating a target comment category of the generated comment text;
inputting the comment prompt information and the target music identification into a trained comment generation model, and acquiring candidate comment texts output by the trained comment generation model and under the target comment category;
and obtaining a target comment text according to the candidate comment text output by the trained comment generation model, wherein the target comment text comprises music information corresponding to the target music identification.
In an embodiment, when obtaining a target comment text according to the candidate comment text output by the trained comment generation model, the processor 701 may specifically be configured to:

and determining the target comment text from the one or more candidate comment texts output by the trained comment generation model.
In an embodiment, when determining the target comment text from the one or more candidate comment texts output by the trained comment generation model, the processor 701 may be specifically configured to:
inputting each candidate comment text into a trained evaluation model, and performing scoring processing on each candidate comment text to obtain an evaluation score of each candidate comment text output by the trained evaluation model;
taking the candidate comment text corresponding to the highest evaluation score as the target comment text in the target comment category; or, taking the candidate comment text with the evaluation score larger than the evaluation threshold value as the target comment text.
In one embodiment, the processor 701, when determining the target comment text from the one or more candidate comment texts output from the trained comment generation model, may specifically be configured to:
determining a text length of each candidate comment text;
and determining candidate comment texts with text lengths smaller than or equal to a length threshold value as the target comment text.
In one embodiment, the comment prompt message contains a target comment category; the processor 701 may specifically be configured to, when the comment prompt information and the target music identifier are input into the trained comment generating model, and a candidate comment text in the target comment category output by the trained comment generating model is obtained:
inputting a target comment category and the target music identification included in the comment prompt information into a trained comment generation model, and predicting a comment word at a first position to obtain a comment word corresponding to the first position in the target comment category;
predicting the comment words at a second position based on the comment words corresponding to the first position and the target music identification to obtain the comment words corresponding to the second position, wherein the second position is the next position adjacent to the first position;
and combining the comment words corresponding to the first position and the comment words corresponding to the second position to obtain a candidate comment text which is output by the trained comment generation model and is in the target comment category.
In the embodiment of the application, comment prompt information and a target music identification are obtained, where the comment prompt information is used for indicating the comment category of the comment text to be generated for the target music identification; the comment prompt information and the target music identification are then input into the trained comment generation model, and a comment text under the target comment category is output by the trained comment generation model. Comment texts that meet requirements and are associated with the target music identification (such as a comment text about XX) can thus be generated in a flexibly controlled manner through the target music identification and the comment prompt information, which alleviates the problems that the topics involved in music comment texts are scattered and that many comment texts are unrelated to the current music.
It should be further noted that an embodiment of the present application also provides a computer-readable storage medium. The computer-readable storage medium stores a computer program, and the computer program includes program instructions; when a processor executes the program instructions, the methods in the embodiments corresponding to fig. 2 and fig. 3 can be performed, so details are not repeated here. For technical details not disclosed in the embodiments of the computer-readable storage medium of the present application, reference is made to the description of the method embodiments of the present application. By way of example, the program instructions may be deployed to be executed on one computer device, or on multiple computer devices located at one site, or on multiple computer devices distributed across multiple sites and interconnected by a communication network.
According to an aspect of the present application, there is provided a computer program product comprising a computer program stored in a computer readable storage medium. The processor of the computer device reads the computer program from the computer-readable storage medium, and executes the computer program, so that the computer device can perform the method in the embodiment corresponding to fig. 2 and fig. 3, and therefore, the detailed description thereof will not be repeated here.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
While the invention has been described with reference to a preferred embodiment, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.
Claims (10)
1. A comment generation model training method is characterized by comprising the following steps:
the method comprises the steps of obtaining a first sample comment text set in a music scene, wherein the first sample comment text set comprises a plurality of first sample comment texts and a category label corresponding to each first sample comment text;
adding sample comment prompt information at the sample position of each first sample comment text, wherein the sample comment prompt information is used for indicating the comment category of each first sample comment text;
inputting each first sample comment text added with sample comment prompt information into a pre-trained comment generation model for category prediction to obtain a prediction category corresponding to each first sample comment text output by the pre-trained comment generation model;
performing loss calculation according to the prediction category and the category label corresponding to each first sample comment text to obtain a total loss value;
and adjusting the pre-trained comment generation model based on the total loss value to obtain a trained comment generation model, wherein the trained comment generation model is used for generating comment texts under comment categories.
2. The method of claim 1, wherein the method further comprises:
acquiring a second sample comment text set in a music scene, wherein the second sample comment text set comprises one or more second sample comment texts;
inputting each second sample comment text into an initial comment generation model, and predicting sample comment words at each position in each sample comment text by the initial comment generation model to obtain a probability value of the sample comment words corresponding to each position in each second sample comment text output by the initial comment generation model;
performing model loss calculation according to the probability value of the sample comment word corresponding to each position in each second sample comment text to obtain model loss;
and optimizing the initial comment generation model based on the model loss to obtain a pre-trained comment generation model.
3. The method of claim 2, wherein the performing, by the initial comment generation model, sample comment word prediction on each position in each sample comment text to obtain a probability value of a sample comment word corresponding to each position in each second sample comment text output by the initial comment generation model comprises:
extracting embedded vectors of the sample comment words corresponding to all positions in each second sample comment text by the initial comment generation model, and sequentially performing descending position coding processing on all positions in each second sample comment text to obtain position vectors corresponding to all positions in each second sample comment text; the descending position code is used for controlling the text length of the generated comment text to be less than or equal to a length threshold value;
and respectively performing mask attention analysis on each position in each second sample comment text and the embedded vector of the sample comment word corresponding to each position by a decoding module in the initial comment generation model to obtain a probability value of the sample comment word corresponding to each position in each second sample comment text.
4. A comment generation method characterized by comprising:
obtaining a target music identification and comment prompt information corresponding to the target music identification, wherein the comment prompt information is used for indicating a target comment category of the generated comment text;
inputting the comment prompt information and the target music identification into the trained comment generation model obtained by the comment generation model training method of any one of claims 1-3, to obtain candidate comment texts output by the trained comment generation model under the target comment category;
and obtaining a target comment text according to the candidate comment text output by the trained comment generation model, wherein the target comment text comprises music information corresponding to the target music identification.
5. The method of claim 4, wherein the obtaining a target comment text according to the candidate comment text output by the trained comment generation model comprises:
and determining the target comment text from one or more candidate comment texts output by the trained comment generation model.
6. The method of claim 5, wherein determining the target comment text from the one or more candidate comment texts output by the trained comment generation model comprises:
inputting each candidate comment text into a trained evaluation model, and performing scoring processing on each candidate comment text to obtain an evaluation score of each candidate comment text output by the trained evaluation model;
taking the candidate comment text corresponding to the highest evaluation score as the target comment text in the target comment category; or, taking the candidate comment text with the evaluation score larger than the evaluation threshold value as the target comment text.
7. The method of claim 5, wherein determining the target comment text from the one or more candidate comment texts output by the trained comment generation model comprises:
determining a text length of each candidate comment text;
and determining candidate comment texts with text lengths smaller than or equal to a length threshold value as the target comment text.
8. The method of claim 4, wherein the comment prompt information contains a target comment category; the inputting the comment prompt information and the target music identification into the trained comment generation model obtained by the comment generation model training method of any one of claims 1 to 3 to obtain candidate comment texts output by the trained comment generation model under the target comment category includes:
inputting a target comment category and the target music identification included in the comment prompt information into a trained comment generation model, and predicting a comment word at a first position to obtain a comment word corresponding to the first position in the target comment category;
predicting the comment words at a second position based on the comment words corresponding to the first position and the target music identification to obtain the comment words corresponding to the second position, wherein the second position is the next position adjacent to the first position;
and combining the comment words corresponding to the first position and the comment words corresponding to the second position to obtain candidate comment texts under the target comment category output by the comment generation model.
9. A computer device, comprising:
a processor adapted to execute a computer program;
a computer-readable storage medium, in which a computer program is stored which, when executed by the processor, performs the method according to any one of claims 1-8.
10. A computer-readable storage medium, characterized in that it stores one or more computer programs adapted to be loaded by a processor and to perform the method according to any of claims 1-8.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202211348439.4A CN115640398A (en) | 2022-10-31 | 2022-10-31 | Comment generation model training method, comment generation device and storage medium |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202211348439.4A CN115640398A (en) | 2022-10-31 | 2022-10-31 | Comment generation model training method, comment generation device and storage medium |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN115640398A true CN115640398A (en) | 2023-01-24 |
Family
ID=84946995
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202211348439.4A Pending CN115640398A (en) | 2022-10-31 | 2022-10-31 | Comment generation model training method, comment generation device and storage medium |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN115640398A (en) |
Cited By (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN116402064A (en) * | 2023-06-09 | 2023-07-07 | 北京搜狐新媒体信息技术有限公司 | Method, system, storage medium and electronic device for comment generation |
| CN116402064B (en) * | 2023-06-09 | 2023-09-12 | 北京搜狐新媒体信息技术有限公司 | Comment generation method, comment generation system, storage medium and electronic equipment |
| CN116992024A (en) * | 2023-07-06 | 2023-11-03 | 平安科技(深圳)有限公司 | Review generation model training method, device, computer equipment and storage medium |
| CN117874239A (en) * | 2024-03-11 | 2024-04-12 | 腾讯科技(深圳)有限公司 | Content generation method, device, equipment and storage medium |
| CN117874239B (en) * | 2024-03-11 | 2024-06-11 | 腾讯科技(深圳)有限公司 | Content generation method, device, equipment and storage medium |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||