CN112070542A - Information conversion rate prediction method, apparatus, device and readable storage medium - Google Patents
- Publication number
- CN112070542A (application CN202010943346.0A)
- Authority
- CN
- China
- Prior art keywords
- conversion rate
- model
- information conversion
- rate prediction
- information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0241—Advertisements
- G06Q30/0251—Targeted advertisements
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0201—Market modelling; Market analysis; Collecting market data
- G06Q30/0202—Market predictions or forecasting for commercial activities
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0241—Advertisements
- G06Q30/0277—Online advertisement
Abstract
Description
Technical Field
The present invention relates to the field of artificial intelligence in financial technology (Fintech), and in particular to an information conversion rate prediction method, apparatus, device, and readable storage medium.
Background
With the continuous development of financial technology, especially Internet-based finance, more and more technologies are being applied in the financial sector. The financial industry in turn places higher demands on these technologies, including on online information promotion.
Pushed information includes advertisements. Online advertising has become a common delivery channel, and conversion rate estimation is a key module for targeting online advertisements precisely. Using machine learning algorithms, the estimation module analyzes users' basic attribute information, behavior data, and advertisement content data to find the most suitable advertisement to deliver to each user.
Existing conversion rate estimation for online advertisements generally trains the estimation model only on user information collected earlier by the system (for example, basic attribute information and interest information) together with users' conversion behavior data. Because this data is not rich enough, conversion rate predictions are not accurate, which impairs precise advertisement delivery.
Summary of the Invention
The main object of the present invention is to provide an information conversion rate prediction method, apparatus, device, and readable storage medium, aiming to solve the technical problem that existing conversion rate prediction for online information has low accuracy, which impairs precise information delivery.
To achieve the above object, the present invention provides an information conversion rate prediction method, comprising:
acquiring extended user data corresponding to the user device number in original training data, and reconstructing the original training data based on the extended user data to obtain enhanced training data;
performing joint model training using the original training data and the enhanced training data to obtain an information conversion rate prediction model and an auxiliary model, and optimizing the information conversion rate prediction model based on the auxiliary model;
upon receiving an information conversion rate prediction request, obtaining, from the optimized information conversion rate prediction model, a conversion rate prediction result for the information content corresponding to the request.
Further, the original training data includes original information feature data and an original conversion rate corresponding to the user device number;
the step of acquiring extended user data corresponding to the user device number in the original training data, and reconstructing the original training data based on the extended user data to obtain enhanced training data, includes:
looking up the extended user data in a preset user auxiliary information database according to the user device number, and looking up user profile data in a preset user profile database according to the user device number;
based on the user device number, associating and storing the extended user data, the user profile data, the original information feature data, and the original conversion rate as the enhanced training data.
Further, the step of performing joint model training using the original training data and the enhanced training data to obtain an information conversion rate prediction model and an auxiliary model, and optimizing the information conversion rate prediction model based on the auxiliary model, includes:
inputting the original training data into an initial prediction model for training while inputting the enhanced training data into the auxiliary model for training, and, based on a first predicted value of the initial prediction model and a second predicted value of the auxiliary model, constructing a first total loss function for the initial prediction model and a second total loss function for the auxiliary model;
iteratively training the models based on the first total loss function and the second total loss function, using the second predicted value of the auxiliary model to optimize the initial prediction model;
when the initial prediction model converges, taking the current initial prediction model as the optimized information conversion rate prediction model.
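The three steps above can be sketched as a small training loop. This is a minimal illustration under stated assumptions, not the patent's implementation: both models are taken to be logistic regressions, all three loss terms are squared errors, and the names `joint_train`, `link_weight`, `lr`, `threshold`, and `max_epochs` are hypothetical. The main model sees only the original features, the auxiliary model also sees the extended features, and a link term pulls their predictions together.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def predict(weights, features):
    # weights[0] is the bias; the rest pair up with the features.
    return sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], features)))

def joint_train(samples, link_weight=0.5, lr=0.5, threshold=0.05, max_epochs=5000):
    """samples: list of (x, z, y); x and z are feature lists, y is 0 or 1."""
    n = len(samples)
    main = [0.0] * (1 + len(samples[0][0]))                       # main model on X
    aux = [0.0] * (1 + len(samples[0][0]) + len(samples[0][1]))   # auxiliary model on (X, Z)
    total_main = float("inf")
    for _ in range(max_epochs):
        grad_main = [0.0] * len(main)
        grad_aux = [0.0] * len(aux)
        total_main = 0.0
        for x, z, y in samples:
            p = predict(main, x)        # first predicted value
            q = predict(aux, x + z)     # second predicted value
            # First total loss (assumed form): label term plus weighted link term.
            total_main += (y - p) ** 2 + link_weight * (p - q) ** 2
            # Gradient of each model's total loss w.r.t. its own logit,
            # treating the other model's prediction as a constant.
            d_main = (-2 * (y - p) + 2 * link_weight * (p - q)) * p * (1 - p)
            d_aux = (-2 * (y - q) - 2 * link_weight * (p - q)) * q * (1 - q)
            for j, v in enumerate([1.0] + x):
                grad_main[j] += d_main * v
            for j, v in enumerate([1.0] + x + z):
                grad_aux[j] += d_aux * v
        total_main /= n
        if total_main <= threshold:     # convergence test on the main model's total loss
            break
        main = [w - lr * g / n for w, g in zip(main, grad_main)]
        aux = [w - lr * g / n for w, g in zip(aux, grad_aux)]
    return main, aux, total_main

# Toy data: one original feature, one extended feature, binary label.
samples = [([0.0], [0.0], 0), ([0.0], [1.0], 0), ([1.0], [0.0], 1), ([1.0], [1.0], 1)]
main, aux, final_loss = joint_train(samples)
```

Only `main` is needed at prediction time; in this sketch the auxiliary model influences it solely through the link term during training.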
Further, the step of constructing the first total loss function for the initial prediction model based on the first predicted value of the initial prediction model and the second predicted value of the auxiliary model includes:
determining a first loss function based on the first predicted value of the initial prediction model and the original conversion rate of the original training data;
determining a third loss function based on the first predicted value of the initial prediction model and the second predicted value of the auxiliary model;
computing a weighted sum of the first loss function and the third loss function to obtain the first total loss function.
Further, the step of taking the current initial prediction model as the optimized information conversion rate prediction model when the initial prediction model converges includes:
determining whether the initial prediction model has converged based on the first total loss function and a preset threshold;
when the first total loss function is less than or equal to the preset threshold, determining that the initial prediction model has converged, thereby obtaining the optimized information conversion rate prediction model.
Further, the step of constructing the second total loss function for the auxiliary model based on the first predicted value of the initial prediction model and the second predicted value of the auxiliary model includes:
determining a second loss function based on the second predicted value of the auxiliary model and the original conversion rate of the original training data;
computing a weighted sum of the second loss function and the third loss function, the latter being determined from the first predicted value of the initial prediction model and the second predicted value of the auxiliary model, to obtain the second total loss function.
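As a compact restatement of the two weighted sums, one possible notation is given here. The symbols are assumptions introduced for illustration only: $y$ is the original conversion rate, $\hat{y}_1$ and $\hat{y}_2$ are the first and second predicted values, and $\alpha_1, \beta_1, \alpha_2, \beta_2 \ge 0$ are weights whose values the text does not specify.

```latex
\begin{aligned}
L_{\mathrm{total},1} &= \alpha_1 \, L_1(y, \hat{y}_1) + \beta_1 \, L_3(\hat{y}_2, \hat{y}_1) \\
L_{\mathrm{total},2} &= \alpha_2 \, L_2(y, \hat{y}_2) + \beta_2 \, L_3(\hat{y}_2, \hat{y}_1)
\end{aligned}
```

The shared $L_3$ term is what couples the two models: each is penalized for disagreeing with the other as well as for missing the label.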
Further, the information conversion rate prediction request includes a target user device number and target information feature data, and the step of obtaining, upon receiving the request, the conversion rate prediction result for the corresponding information content from the optimized information conversion rate prediction model includes:
looking up target user profile data in the preset user profile database according to the target user device number;
inputting the target user profile data and the target information feature data into the optimized information conversion rate prediction model, and taking the value output by the model as the predicted conversion rate.
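The request-handling steps above can be sketched as follows. This is a hedged illustration: the request layout, the feature names, `predict_conversion_rate`, `feature_order`, and `toy_model` are all hypothetical, and the fixed-weight logistic function merely stands in for a trained model. Missing features defaulting to 0.0 is also an assumption.

```python
import math

def predict_conversion_rate(request, profile_db, model, feature_order):
    """Look up the profile by device number, merge it with the ad features, and score."""
    profile = profile_db.get(request["device_id"], {})
    merged = {**profile, **request["ad_features"]}
    # Assumption: features absent from both sources are encoded as 0.0.
    vector = [float(merged.get(name, 0.0)) for name in feature_order]
    return model(vector)

def toy_model(vector):
    # Stand-in for the trained prediction model: logistic function over fixed weights.
    weights = [0.8, -0.3]
    score = sum(w * v for w, v in zip(weights, vector))
    return 1.0 / (1.0 + math.exp(-score))

profile_db = {"000000000000001": {"is_frequent_shopper": 1}}
request = {"device_id": "000000000000001", "ad_features": {"ad_is_video": 1}}
feature_order = ["is_frequent_shopper", "ad_is_video"]
rate = predict_conversion_rate(request, profile_db, toy_model, feature_order)
```

Note that only profile data and information features are consulted at prediction time, matching the text: the extended user data is used during training but is not required to serve a request.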
Further, the information conversion rate prediction apparatus includes:
a construction module, configured to acquire extended user data corresponding to the user device number in the original training data, and to reconstruct the original training data based on the extended user data to obtain enhanced training data;
a training module, configured to perform joint model training using the original training data and the enhanced training data to obtain an information conversion rate prediction model and an auxiliary model, and to optimize the information conversion rate prediction model based on the auxiliary model;
a prediction module, configured to obtain, upon receiving an information conversion rate prediction request, the conversion rate prediction result for the information content corresponding to the request from the optimized information conversion rate prediction model.
To achieve the above object, the present invention further provides an information conversion rate prediction device comprising a memory, a processor, and an information conversion rate prediction program stored in the memory and executable on the processor; when executed by the processor, the program implements the steps of the information conversion rate prediction method described above.
In addition, to achieve the above object, the present invention further provides a readable storage medium storing an information conversion rate prediction program which, when executed by a processor, implements the steps of any of the information conversion rate prediction methods described above.
The present invention acquires extended user data corresponding to the user device number in the original training data and reconstructs the original training data based on it to obtain enhanced training data. It then performs joint model training using the original and enhanced training data to obtain an information conversion rate prediction model and an auxiliary model, and optimizes the prediction model based on the auxiliary model. Finally, upon receiving an information conversion rate prediction request, it obtains the conversion rate prediction result for the corresponding information content from the optimized prediction model. By introducing into joint training an auxiliary model trained on the enhanced training data, the information conversion rate prediction model can absorb the predictive performance of the auxiliary model. This yields a high-quality prediction model, improves prediction accuracy for online information delivery, and in turn raises the success rate of precise information delivery.
Brief Description of the Drawings
FIG. 1 is a schematic structural diagram of a device in the hardware operating environment involved in embodiments of the present invention;
FIG. 2 is a schematic flowchart of a first embodiment of the information conversion rate prediction method of the present invention;
FIG. 3 is a schematic flowchart of a second embodiment of the information conversion rate prediction method of the present invention;
FIG. 4 is a schematic diagram of the functional modules of an embodiment of the information conversion rate prediction apparatus of the present invention.
The realization of the objects, the functional features, and the advantages of the present invention are further described below with reference to the embodiments and the accompanying drawings.
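Detailed Description
It should be understood that the specific embodiments described here serve only to explain the present invention and are not intended to limit it.
FIG. 1 is a schematic structural diagram of a device in the hardware operating environment involved in embodiments of the present invention.
As shown in FIG. 1, the device may include a processor 1001 (for example, a CPU), a network interface 1004, a user interface 1003, a memory 1005, and a communication bus 1002. The communication bus 1002 provides the connections and communication among these components. The user interface 1003 may include a display and an input unit such as a keyboard; optionally, it may also include standard wired and wireless interfaces. The network interface 1004 may optionally include standard wired and wireless interfaces (such as a WI-FI interface). The memory 1005 may be high-speed RAM or non-volatile memory such as disk storage; optionally, it may also be a storage apparatus separate from the processor 1001.
Those skilled in the art will understand that the device structure shown in FIG. 1 does not limit the device, which may include more or fewer components than illustrated, combine certain components, or arrange the components differently.
As shown in FIG. 1, the memory 1005, as a readable storage medium, may contain an operating system, a network communication module, a user interface module, and an information conversion rate prediction program.
In the device shown in FIG. 1, the network interface 1004 is mainly used to connect to a back-end server and exchange data with it; the user interface 1003 is mainly used to connect to a client and exchange data with it; and the processor 1001 may be used to call the information conversion rate prediction program stored in the memory 1005.
In this embodiment, the device includes a memory 1005, a processor 1001, and an information conversion rate prediction program stored in the memory 1005 and executable on the processor 1001. When the processor 1001 calls the information conversion rate prediction program stored in the memory 1005, it performs the steps of the information conversion rate prediction method provided by the embodiments of this application.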
The present invention further provides an information conversion rate prediction method. FIG. 2 is a schematic flowchart of a first embodiment of the method.
This embodiment of the present invention provides an embodiment of the information conversion rate prediction method. Note that although the flowchart shows a logical order, in some cases the steps shown or described may be performed in a different order.
The information conversion rate prediction method of the present invention applies to many kinds of online information, such as advertisements, electronic journals, and news. For ease of description, the embodiments of the present invention use advertisements as the example.
In this embodiment, the information conversion rate prediction method includes:
Step S10: acquiring extended user data corresponding to the user device number in the original training data, and reconstructing the original training data based on the extended user data to obtain enhanced training data;
In this embodiment, after clicking an advertisement a user may purchase the advertised product, download the advertised application, and so on. Such behavior is called a conversion, and predicting the probability that a user converts after an advertisement is delivered is called conversion rate (post-click conversion rate, CVR) estimation. Conversion rate estimation is a very important task in online advertising, with key applications in precise advertisement delivery and in evaluating the value of a placement. An accurate and practical conversion rate estimation model matters greatly for achieving a win-win outcome for the advertising platform and the advertiser. Here the advertiser is the party that pays to place advertisements, and the advertising platform is the party that uses its own platform technology to help the advertiser select the target users for an advertisement.
As noted above, existing conversion rate estimation for online advertisements generally trains the estimation model only on user information collected earlier by the system (for example, basic attribute information and interest information) together with users' conversion behavior data. Because this data is not rich enough, conversion rate predictions are not accurate, which impairs precise advertisement delivery. The advertisement conversion rate prediction method proposed by the present invention therefore builds enhanced training data from extended user data and performs joint model training on it, yielding a high-quality information conversion rate prediction model, improving prediction accuracy for online advertisement delivery, and raising the success rate of precise delivery. Extended user data refers to background information about the user. For example, once a user has provided their company details, a large amount of related information about the company can be obtained from them, such as its industry, place of registration, registration date, and size.
Further, the original training data includes the user device number. The extended user data corresponding to that device number is acquired and used to reconstruct the original training data, yielding the enhanced training data.
Specifically, step S10 includes:
Step S11: looking up the extended user data in a preset user auxiliary information database according to the user device number, and looking up user profile data in a preset user profile database according to the user device number;
Step S13: based on the user device number, associating and storing the extended user data, the user profile data, the original information feature data, and the original conversion rate as the enhanced training data.
In this embodiment, the original training data includes the user device number, original advertisement feature data, and the original conversion rate corresponding to the user device number. The user device number is the device code of a mobile phone or other smart terminal, abbreviated IMEI (International Mobile Equipment Identity), and consists of 15 digits. Put simply, the device number is the smart terminal's identity card: it is assigned at the factory and is unique among mobile devices worldwide. Normally a smart terminal belongs to a single user, so the present invention uses the user device number as the identifier, that is, one user device number represents one user, and the predicted conversion rate determines which user device numbers an advertisement should be delivered to. The original advertisement feature data describes attributes of the advertisement, including but not limited to features of the creative and the contextual keywords of the placement. The original conversion rate corresponding to the user device number indicates whether the user purchased the advertised product: a purchase means a conversion occurred, and no purchase means it did not. An original conversion rate of 1 can denote that a conversion occurred and 0 that it did not.
Specifically, the preset user profile database stores user profile information, including but not limited to the user's basic attribute information and interest features, such as gender, age, region, shopping preferences, and reading interests. In the user profile database, profile information is stored in one-to-one correspondence with user device numbers, so the profile information for a device number can be looked up by that number. Likewise, the user auxiliary information database stores user background information. For example, once a user has provided their company details, a large amount of related information about the company can be obtained from them, such as its industry, place of registration, registration date, and size. In the user auxiliary information database, background information is stored in one-to-one correspondence with user device numbers, so the extended user data for a device number can be looked up by that number.
After the extended user data and the user profile data have been acquired, the enhanced training data is constructed: for each user device number, the corresponding extended user data, user profile data, original advertisement feature data, and original conversion rate are associated and stored, yielding the enhanced training data.
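The lookup-and-associate procedure of step S10 amounts to a join keyed on the device number. The sketch below is illustrative only: the two databases are modeled as in-memory dictionaries, the field names and `build_enhanced_records` are hypothetical, the device numbers are made-up placeholders, and falling back to an empty dictionary when a device number is missing is an assumption the text does not address.

```python
from dataclasses import dataclass

@dataclass
class EnhancedRecord:
    device_id: str      # user device number (join key)
    ad_features: dict   # original advertisement feature data
    profile: dict       # user profile data
    extended: dict      # extended user data
    converted: int      # original conversion rate: 1 = converted, 0 = not

def build_enhanced_records(original_rows, profile_db, auxiliary_db):
    """Associate each original row with profile and extended data by device number."""
    records = []
    for row in original_rows:
        device_id = row["device_id"]
        records.append(EnhancedRecord(
            device_id=device_id,
            ad_features=row["ad_features"],
            profile=profile_db.get(device_id, {}),     # assumption: empty if unknown
            extended=auxiliary_db.get(device_id, {}),  # assumption: empty if unknown
            converted=row["converted"],
        ))
    return records

original_rows = [
    {"device_id": "000000000000001", "ad_features": {"category": "loan"}, "converted": 1},
    {"device_id": "000000000000002", "ad_features": {"category": "insurance"}, "converted": 0},
]
profile_db = {"000000000000001": {"age": 35, "region": "Shenzhen"}}
auxiliary_db = {"000000000000001": {"industry": "retail", "company_size": "small"}}
enhanced = build_enhanced_records(original_rows, profile_db, auxiliary_db)
```

In a production setting the same join would more likely run in a database or dataframe library; the dictionary form is used here only to keep the example self-contained.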
Step S20: performing joint model training using the original training data and the enhanced training data to obtain an information conversion rate prediction model and an auxiliary model, and optimizing the information conversion rate prediction model based on the auxiliary model;
In this embodiment, joint model training is performed with the original training data and the enhanced training data. During training, the initial prediction model and the auxiliary model are optimized continually against the loss functions using both data sets, and when the model converges the information conversion rate prediction model is obtained.
Specifically, step S20 includes:
Step S21: inputting the original training data into the initial prediction model for training while inputting the enhanced training data into the auxiliary model for training, and, based on the first predicted value of the initial prediction model and the second predicted value of the auxiliary model, constructing the first total loss function for the initial prediction model and the second total loss function for the auxiliary model;
Specifically, step S21 includes:
Step a: determining a first loss function based on the first predicted value of the initial prediction model and the original conversion rate of the original training data;
Step b: determining a third loss function based on the first predicted value of the initial prediction model and the second predicted value of the auxiliary model;
Step c: computing a weighted sum of the first loss function and the third loss function to obtain the first total loss function;
Step d: determining a second loss function based on the second predicted value of the auxiliary model and the original conversion rate of the original training data;
Step e: computing a weighted sum of the second loss function and the third loss function, the latter being determined from the first predicted value of the initial prediction model and the second predicted value of the auxiliary model, to obtain the second total loss function.
In this embodiment, a loss function measures how far the model's predicted values deviate from the true values. It is a non-negative real-valued function, and in general a smaller loss indicates that the model's predictions agree more closely with the labels. Training a model is the process of making the loss function smaller and smaller through repeated iterative computation until it is acceptably close to its minimum.
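Steps a through e can be made concrete with a small numeric sketch. The choices here are assumptions for illustration: squared error is used for all three loss terms, and the weights `w_label` and `w_link` are hypothetical, since the text specifies only that each total is a weighted sum.

```python
def squared_error(a, b):
    # Non-negative, and zero only when the two values agree.
    return (a - b) ** 2

def total_losses(label, p_main, p_aux, w_label=1.0, w_link=0.5):
    """Return (first total loss, second total loss) for one sample."""
    l1 = squared_error(label, p_main)   # step a: main prediction vs. label
    l3 = squared_error(p_aux, p_main)   # step b: main vs. auxiliary prediction
    l2 = squared_error(label, p_aux)    # step d: auxiliary prediction vs. label
    total_main = w_label * l1 + w_link * l3   # step c
    total_aux = w_label * l2 + w_link * l3    # step e
    return total_main, total_aux

# A converted user (label 1): the main model predicts 0.6, the auxiliary model 0.9.
total_main, total_aux = total_losses(label=1, p_main=0.6, p_aux=0.9)
```

In this example the main model's total is larger both because its own prediction is further from the label and because it disagrees with the better-informed auxiliary model, which is the pressure that pulls it toward the auxiliary prediction during training.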
For ease of description, the original training data is represented as a two-tuple:
$\langle X_i, Y_i \rangle,\ i = 1, 2, \ldots, n$
where $X_i = (x_1, x_2, \ldots, x_m)$ denotes the $m$ features comprising the user portrait data and the original advertisement feature data, and $Y_i$ indicates whether the user behavior resulted in a conversion (1 for yes, 0 for no).
The enhanced training data, which is used to improve the accuracy of predicting $Y$ from $X$, is represented as a triple:
$\langle (X_i, Z_i), Y_i \rangle,\ i = 1, 2, \ldots, n$
where $X_i = (x_1, x_2, \ldots, x_m)$ denotes the $m$ features comprising the user portrait data and the original advertisement feature data; $Z_i = (z_1, z_2, \ldots, z_q)$ denotes the $q$ features of the extended user data; and $Y_i$ indicates whether the user behavior resulted in a conversion (1 for yes, 0 for no).
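The two sample shapes can be sketched as typed records. This is only an illustration: the class names `OriginalSample` and `AugmentedSample`, the helper `augment`, and the feature values are hypothetical and do not come from the patent.

```python
from typing import NamedTuple, Tuple


class OriginalSample(NamedTuple):
    """Two-tuple <X_i, Y_i> (hypothetical name)."""
    x: Tuple[float, ...]  # m features: user portrait + original ad features
    y: int                # 1 if the user converted, 0 otherwise


class AugmentedSample(NamedTuple):
    """Triple <(X_i, Z_i), Y_i> (hypothetical name)."""
    x: Tuple[float, ...]  # same m features as the original sample
    z: Tuple[float, ...]  # q features from the extended user data
    y: int                # same label as the original sample


def augment(sample: OriginalSample, z: Tuple[float, ...]) -> AugmentedSample:
    # The label and the X features are carried over unchanged; only Z is added.
    return AugmentedSample(x=sample.x, z=z, y=sample.y)


original = OriginalSample(x=(0.2, 1.0, 0.0), y=1)
augmented = augment(original, z=(0.7, 0.1))
```

Both records share the same `x` and `y`, which is what lets the two models be trained on aligned samples.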
The original training data is input into the initial prediction model $f(X)$ for model training, while the enhanced training data is input into the auxiliary model $f'(X, Z)$ for model training. The loss functions are constructed from the first predicted value $f(X_{ij})$ of the initial prediction model and the second predicted value $f'(X_{ij}, Z_{ij})$ of the initial auxiliary model.
During model training, the following optimization function is used:
where $L_1$ is the loss function of the conversion rate estimation task and $L_2$ is the loss function of the auxiliary task. The two tasks are linked by the loss function $L_3$; solving this optimization function yields a conversion rate estimation model $f(X)$ that is more accurate than existing schemes.
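The optimization function itself is not reproduced in this text. One joint objective consistent with the roles of $L_1$, $L_2$ and $L_3$ described in this embodiment, offered purely as an assumption, sums the two task losses and the linking term over the training samples:

$$\min_{f,\,f'} \sum_{i,j} \Big[ L_1\big(Y_{ij} - f(X_{ij})\big) + L_2\big(Y_{ij} - f'(X_{ij}, Z_{ij})\big) + \lambda\, L_3\big(f'(X_{ij}, Z_{ij}),\, f(X_{ij})\big) \Big]$$

A single weight $\lambda$ is assumed here for brevity; the embodiment uses separate empirical weights $\lambda$ and $\lambda'$ when it writes out the total loss of each model individually.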
Specifically, the initial prediction model and the auxiliary model are trained simultaneously, and the first total loss function of the initial prediction model and the second total loss function of the auxiliary model are constructed separately. The first loss function $L_1(Y_{ij} - f(X_{ij}))$ of the initial prediction model is determined from the first predicted value of the initial prediction model and the original conversion rate of the original training data. The third loss function $L_3(f'(X_{ij}, Z_{ij}), f(X_{ij}))$ is determined from the first predicted value of the initial prediction model and the second predicted value of the initial auxiliary model. The weighted sum of the first and third loss functions gives the first total loss function of the initial prediction model:
$L_1(Y_{ij} - f(X_{ij})) + \lambda L_3(f'(X_{ij}, Z_{ij}), f(X_{ij}))$
where $\lambda$ is an empirical value between 0 and 1, chosen according to actual needs.
Similarly, the second loss function $L_2(Y_{ij} - f'(X_{ij}, Z_{ij}))$ is determined from the second predicted value of the auxiliary model and the original conversion rate of the original training data, and the third loss function $L_3(f'(X_{ij}, Z_{ij}), f(X_{ij}))$ is determined from the first predicted value of the initial prediction model and the second predicted value of the initial auxiliary model. The weighted sum of the second and third loss functions gives the second total loss function of the auxiliary model:
$L_2(Y_{ij} - f'(X_{ij}, Z_{ij})) + \lambda' L_3(f'(X_{ij}, Z_{ij}), f(X_{ij}))$
where $\lambda'$ is an empirical value between 0 and 1, chosen according to actual needs.
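The two total losses can be computed as in the following minimal sketch. The patent does not specify the form of $L_1$, $L_2$ or $L_3$; squared error is assumed here for all three, and the function names and sample values are hypothetical.

```python
def squared_error(a: float, b: float) -> float:
    # Assumed form for L1, L2 and L3; the patent does not fix a specific loss.
    return (a - b) ** 2


def total_losses(y, p_main, p_aux, lam=0.5, lam_aux=0.5):
    """Return (first_total, second_total), each averaged over the samples.

    y       -- true conversion labels (0 or 1)
    p_main  -- first predicted values f(X) from the initial prediction model
    p_aux   -- second predicted values f'(X, Z) from the auxiliary model
    lam     -- weight lambda on L3 in the first total loss
    lam_aux -- weight lambda' on L3 in the second total loss
    """
    n = len(y)
    l1 = sum(squared_error(yi, pm) for yi, pm in zip(y, p_main)) / n
    l2 = sum(squared_error(yi, pa) for yi, pa in zip(y, p_aux)) / n
    l3 = sum(squared_error(pa, pm) for pa, pm in zip(p_aux, p_main)) / n
    return l1 + lam * l3, l2 + lam_aux * l3


y = [1, 0]
p_main = [0.5, 0.5]   # an uninformed main model
p_aux = [1.0, 0.0]    # an auxiliary model that predicts perfectly
first_total, second_total = total_losses(y, p_main, p_aux)
```

With these values $L_1 = 0.25$, $L_2 = 0$ and $L_3 = 0.25$, so the first total loss is 0.375 and the second is 0.125: the linking term penalizes both models for disagreeing.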
Step S22: perform iterative model training based on the first total loss function and the second total loss function, using the second predicted value of the auxiliary model to optimize the initial prediction model;
In this embodiment, the initial prediction model is trained with the first total loss function and the auxiliary model is trained with the second total loss function. While a model has not converged, its gradient information is computed, the model parameters of the initial prediction model and of the auxiliary model are updated according to that gradient information, and training continues with the updated initial prediction model and auxiliary model.
It should be noted that, during model training, the third loss function $L_3(f'(X_{ij}, Z_{ij}), f(X_{ij}))$ is determined from the first predicted value of the initial prediction model and the second predicted value of the auxiliary model. Minimizing it drives the first predicted value of the initial prediction model toward the second predicted value of the auxiliary model, so that the information conversion rate prediction model can match the prediction effect of the auxiliary model. This is how the second predicted value of the auxiliary model is used to optimize the initial prediction model.
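The joint iteration can be sketched with two small logistic models trained by gradient descent. Everything below is an assumption made for illustration: the model form, the squared-error losses, the learning rate, the toy data, and the choice to treat the other model's prediction as a constant when computing each gradient. None of these details are specified in the patent.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def predict(weights, features):
    # weights[-1] is the bias; the other entries pair up with the features.
    return sigmoid(sum(w * v for w, v in zip(weights, features)) + weights[-1])


def step(weights, rows, targets, peers, lam, lr):
    """One gradient step on mean[(p - y)^2 + lam * (p - peer)^2].

    Updates `weights` in place and returns the loss measured before the update.
    """
    grad = [0.0] * len(weights)
    loss = 0.0
    for feats, y, peer in zip(rows, targets, peers):
        p = predict(weights, feats)
        loss += (p - y) ** 2 + lam * (p - peer) ** 2
        # Derivative with respect to the logit; the peer prediction is held constant.
        g = (2 * (p - y) + 2 * lam * (p - peer)) * p * (1 - p)
        for k, v in enumerate(feats):
            grad[k] += g * v
        grad[-1] += g
    n = len(rows)
    for k in range(len(weights)):
        weights[k] -= lr * grad[k] / n
    return loss / n


def train_jointly(xs, zs, ys, lam=0.5, lam_aux=0.5, lr=0.5, rounds=200):
    xz = [x + z for x, z in zip(xs, zs)]       # augmented inputs (X, Z)
    w_main = [0.0] * (len(xs[0]) + 1)          # initial prediction model f(X)
    w_aux = [0.0] * (len(xz[0]) + 1)           # auxiliary model f'(X, Z)
    history = []
    for _ in range(rounds):
        p_main = [predict(w_main, x) for x in xs]
        p_aux = [predict(w_aux, r) for r in xz]
        first_total = step(w_main, xs, ys, p_aux, lam, lr)
        second_total = step(w_aux, xz, ys, p_main, lam_aux, lr)
        history.append((first_total, second_total))
    return w_main, w_aux, history


# Toy data: conversion depends on the single X feature only.
xs = [[0.0], [1.0], [0.0], [1.0]]
zs = [[0.0], [0.0], [1.0], [1.0]]
ys = [0, 1, 0, 1]
w_main, w_aux, history = train_jointly(xs, zs, ys)
```

Only `w_main` is needed at prediction time, since the deployed model takes `X` alone; the auxiliary model exists to shape it during training.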
Step S23: when the initial prediction model converges, determine the current initial prediction model as the optimized information conversion rate prediction model.
Specifically, step S23 includes:
Step f: determine whether the initial prediction model has converged based on the first total loss function and a preset threshold;
Step g: when the first total loss function is less than or equal to the preset threshold, determine that the initial prediction model has converged, thereby obtaining the optimized information conversion rate prediction model.
In this embodiment, after each round of training in the iterative process, the value of the first total loss function is compared with the preset threshold, and the comparison result determines whether the model has converged. If the model has not converged, iterative training continues; once it converges, training stops and the initial prediction model obtained in that round is determined as the advertisement conversion rate prediction model. Iteration stops when the total loss value is less than or equal to the preset threshold. The preset threshold is set according to the actual situation and is generally a positive number close to 0, for example 0.001; over many iterations the value of the first total loss function keeps decreasing and approaches the preset threshold. Convergence can also be determined by the number of training rounds: for example, the preset number may be 1000 or 20000, and when the iteration count reaches that number, training stops and the advertisement conversion rate prediction model is obtained.
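The two stopping rules described above (loss threshold and round budget) can be combined as in this small sketch. The function name, the returned reason strings, and the sample loss values are hypothetical; the defaults 0.001 and 1000 are the example values given in the text.

```python
def run_until_converged(loss_per_round, threshold=0.001, max_rounds=1000):
    """Consume first-total-loss values round by round and report when to stop.

    Returns (rounds_run, reason), where reason is "threshold" if the loss fell
    to the preset threshold, "max_rounds" if the round budget was reached, or
    "exhausted" if the supplied losses ran out first.
    """
    rounds = 0
    for loss in loss_per_round:
        rounds += 1
        if loss <= threshold:
            return rounds, "threshold"
        if rounds >= max_rounds:
            return rounds, "max_rounds"
    return rounds, "exhausted"


losses = [0.5, 0.1, 0.01, 0.0005, 0.0001]
by_threshold = run_until_converged(losses)              # stops in round 4
by_budget = run_until_converged(losses, max_rounds=2)   # stops in round 2
```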
Step S30: upon receiving an information conversion rate prediction request, obtain the conversion rate prediction result of the information content corresponding to the request according to the information conversion rate prediction model.
In this embodiment, the advertisement conversion rate prediction request includes at least the user device number of the target user and the feature data of the advertisement to be placed. The target user portrait data is looked up in the preset user portrait database according to the target user device number. The target user portrait data and the target advertisement feature data are then input into the information conversion rate prediction model for conversion rate prediction, and the predicted value output by the model is taken as the predicted click conversion rate.
The information conversion rate prediction method proposed in this embodiment obtains the extended user data corresponding to the user device number of the original training data and reconstructs the original training data based on the extended user data to obtain enhanced training data. It then performs joint model training using the original training data and the enhanced training data to obtain an information conversion rate prediction model. Upon receiving an information conversion rate prediction request, it obtains the conversion rate prediction result of the corresponding information content according to that model. Building enhanced training data from extended user data and training jointly on the original and enhanced training data yields a high-quality information conversion rate prediction model, which improves the prediction accuracy of online information delivery and in turn the success rate of precise information delivery.
Based on the first embodiment, and referring to FIG. 3, a second embodiment of the information conversion rate prediction method of the present invention is proposed. In this embodiment, step S30 includes:
Step S31: look up the target user portrait data in a preset user portrait database according to the target user device number;
Step S32: input the target user portrait data and the target information feature data into the optimized information conversion rate prediction model for conversion rate prediction, and take the predicted value output by the information conversion rate prediction model as the predicted conversion rate.
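Steps S31 and S32 can be sketched as a lookup followed by a model call. The request layout, the in-memory dictionary standing in for the user portrait database, the logistic scoring function, and the weight values are all hypothetical placeholders rather than details from the patent.

```python
import math


def predict_conversion(request, portrait_db, model_weights):
    """Look up the target user's portrait by device number and score the ad.

    model_weights holds one weight per input feature plus a trailing bias
    (an assumed logistic model standing in for the trained predictor).
    """
    # Step S31: find the target user portrait data by device number.
    portrait = portrait_db.get(request["device_id"])
    if portrait is None:
        return None  # assumption: no prediction without portrait data
    # Step S32: feed portrait + ad features to the model and return its output.
    features = list(portrait) + list(request["ad_features"])
    logit = sum(w * v for w, v in zip(model_weights, features)) + model_weights[-1]
    return 1.0 / (1.0 + math.exp(-logit))


portrait_db = {"device-001": [1.0, 0.0]}   # hypothetical portrait features
weights = [0.5, -0.2, 0.3, 0.0]            # hypothetical trained weights, bias last
request = {"device_id": "device-001", "ad_features": [1.0]}
score = predict_conversion(request, portrait_db, weights)
```

Note that the extended user data is not needed here: only the portrait and advertisement features used by the main model are looked up at prediction time.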
In this embodiment, enhanced training data is built from extended user data, and joint model training is performed on the original training data and the enhanced training data to obtain a high-quality information conversion rate prediction model. That model is then used to determine the precise target audience for the advertisement to be placed.
The advertisement conversion rate prediction request includes at least the user device number of the target user and the feature data of the advertisement to be placed. The target user portrait data is looked up in the preset user portrait database according to the target user device number, and the target user portrait data and the target advertisement feature data are input into the information conversion rate prediction model for conversion rate prediction. The predicted value output by the model is the conversion rate. When the predicted value is 1, the user corresponding to the target user device number belongs to the precise target audience of the advertisement to be placed. When the predicted value is 0, the user corresponding to that device number is not interested in the advertisement and is very unlikely to purchase the advertised product. The prediction results of the information conversion rate prediction model can therefore guide online advertisements toward precise target users, improving the success rate of precise advertisement delivery.
In the information conversion rate prediction method proposed in this embodiment, the target user portrait data is looked up in the preset user portrait database according to the target user device number; the target user portrait data and the target information feature data are then input into the information conversion rate prediction model for conversion rate prediction, and the predicted value output by the model is taken as the predicted conversion rate. Joint model training on the original training data and the enhanced training data yields the information conversion rate prediction model used to predict the click conversion rate, which improves the prediction accuracy of online delivery and thus the success rate of precise information delivery.
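The present invention further provides an information conversion rate prediction apparatus. Referring to FIG. 4, FIG. 4 is a schematic diagram of the functional modules of an embodiment of the information conversion rate prediction apparatus of the present invention.
A construction module 10, configured to obtain the extended user data corresponding to the user device number of the original training data, and to reconstruct the original training data based on the extended user data to obtain enhanced training data;
A training module 20, configured to perform joint model training using the original training data and the enhanced training data to obtain an information conversion rate prediction model and an auxiliary model, and to optimize the information conversion rate prediction model based on the auxiliary model;
A prediction module 30, configured to, upon receiving an information conversion rate prediction request, obtain the conversion rate prediction result of the information content corresponding to the request according to the optimized information conversion rate prediction model.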
Further, the construction module 10 is also configured to:
look up the extended user data in a preset user auxiliary information database according to the user device number, and look up the user portrait data in a preset user portrait database according to the user device number;
based on the user device number, associate and save the extended user data, the user portrait data, the original information feature data, and the original conversion rate as the enhanced training data.
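The association performed by the construction module amounts to a join keyed on the device number, which might look like the sketch below. The dictionaries standing in for the two databases, the field names, and the decision to skip rows without a match are all assumptions made for illustration.

```python
def build_augmented_data(original_rows, aux_db, portrait_db):
    """Join extended user data and portrait data onto training rows by device number."""
    augmented = []
    for row in original_rows:
        device_id = row["device_id"]
        extended = aux_db.get(device_id)       # user auxiliary information database
        portrait = portrait_db.get(device_id)  # user portrait database
        if extended is None or portrait is None:
            continue  # assumption: rows without both lookups are skipped
        augmented.append({
            "device_id": device_id,
            "portrait": portrait,                 # part of X
            "ad_features": row["ad_features"],    # part of X
            "extended": extended,                 # Z
            "converted": row["converted"],        # Y
        })
    return augmented


original_rows = [
    {"device_id": "d1", "ad_features": [1.0], "converted": 1},
    {"device_id": "d2", "ad_features": [0.0], "converted": 0},
]
aux_db = {"d1": [0.7, 0.1]}                      # d2 has no extended data
portrait_db = {"d1": [1.0, 0.0], "d2": [0.0, 1.0]}
augmented_rows = build_augmented_data(original_rows, aux_db, portrait_db)
```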
Further, the training module 20 is also configured to:
input the original training data into the initial prediction model for model training, while inputting the enhanced training data into the auxiliary model for model training, and construct the first total loss function of the initial prediction model and the second total loss function of the auxiliary model based on the first predicted value of the initial prediction model and the second predicted value of the auxiliary model;
perform iterative model training based on the first total loss function and the second total loss function, using the second predicted value of the auxiliary model to optimize the initial prediction model;
when the initial prediction model converges, determine the current initial prediction model as the optimized information conversion rate prediction model.
Further, the training module 20 is also configured to:
determine a first loss function based on the first predicted value of the initial prediction model and the original conversion rate of the original training data;
determine a third loss function based on the first predicted value of the initial prediction model and the second predicted value of the auxiliary model;
compute a weighted sum of the first loss function and the third loss function to obtain the first total loss function.
Further, the training module 20 is also configured to:
determine whether the initial prediction model has converged based on the first total loss function and a preset threshold;
when the first total loss function is less than or equal to the preset threshold, determine that the initial prediction model has converged, thereby obtaining the optimized information conversion rate prediction model.
Further, the training module 20 is also configured to:
determine a second loss function based on the second predicted value of the auxiliary model and the original conversion rate of the original training data;
compute a weighted sum of the second loss function and the third loss function (the latter determined from the first predicted value of the initial prediction model and the second predicted value of the auxiliary model) to obtain the second total loss function.
Further, the prediction module 30 is also configured to:
look up the target user portrait data in a preset user portrait database according to the target user device number;
input the target user portrait data and the target information feature data into the information conversion rate prediction model for conversion rate prediction, and take the predicted value output by the information conversion rate prediction model as the predicted conversion rate.
In addition, an embodiment of the present invention further provides a readable storage medium on which an information conversion rate prediction program is stored. When executed by a processor, the information conversion rate prediction program implements the steps of the information conversion rate prediction method in each of the above embodiments.
It should be noted that, herein, the terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or system that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or system. Without further limitation, an element qualified by the phrase "including a ..." does not preclude the presence of additional identical elements in the process, method, article, or system that includes the element.
The serial numbers of the above embodiments of the present invention are for description only and do not represent the relative merits of the embodiments.
From the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus the necessary general-purpose hardware platform, or by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium as described above (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a system device (which may be a mobile phone, computer, server, air conditioner, network device, etc.) to execute the methods described in the embodiments of the present invention.
The above are only preferred embodiments of the present invention and do not thereby limit the patent scope of the present invention. Any equivalent structure or equivalent process transformation made using the contents of the description and drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of the present invention.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010943346.0A CN112070542B (en) | 2020-09-09 | 2020-09-09 | Information conversion rate prediction method, device, equipment and readable storage medium |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN112070542A (en) | 2020-12-11 |
| CN112070542B (en) | 2024-07-12 |
Family
ID=73663256
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010943346.0A Active CN112070542B (en) | 2020-09-09 | 2020-09-09 | Information conversion rate prediction method, device, equipment and readable storage medium |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN112070542B (en) |
Also Published As
| Publication number | Publication date |
|---|---|
| CN112070542B (en) | 2024-07-12 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| GR01 | Patent grant | ||