
CN118378076A - Information processing device, information processing method and computer readable storage medium - Google Patents

Info

Publication number: CN118378076A
Application number: CN202310050865.8A
Authority: CN (China)
Legal status: Pending
Original language: Chinese (zh)
Inventors: 冯成, 钟朝亮, 汪洁, 孙俊
Assignee / Applicant: Fujitsu Ltd
Prior art keywords: model, trained, sample set, task, information processing
Priority applications: CN202310050865.8A (publication CN118378076A); JP2024006142A (publication JP2024103463A)

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
Abstract

An information processing device, an information processing method, and a computer-readable storage medium are disclosed. The information processing device may include: a sample generation unit configured to generate a first simulated sample set for a first task using a sample generation network obtained based on a second training sample set for a second task; and a first model training unit configured to retrain, under a first predetermined condition, a first model pre-trained for the first task using the second training sample set, to obtain a retrained first model for predicting an object to be predicted. The first predetermined condition includes: the difference between the prediction result of the pre-trained first model on the first simulated sample set and the prediction result of the retrained first model on the first simulated sample set is within a predetermined range. The first task is different from the second task. The object to be predicted relates to the first task and/or the second task.

Description

Information processing device, information processing method, and computer-readable storage medium

Technical Field

The present disclosure relates to the field of information processing, and in particular to an information processing device, an information processing method, and a computer-readable storage medium.

Background Art

In the field of information processing, lifelong (continual) learning has important applications. However, when a previously trained model is further trained using a new training sample set, the resulting new model may lose the performance of the original model.

Summary of the Invention

A brief summary of the present disclosure is given below in order to provide a basic understanding of certain aspects of the disclosure. It should be understood, however, that this summary is not an exhaustive overview of the disclosure. It is not intended to identify key or critical parts of the disclosure, nor to limit its scope. Its sole purpose is to present certain concepts of the disclosure in a simplified form as a prelude to the more detailed description given later.

In view of the above problems, an object of the present disclosure is to provide an information processing device, an information processing method, and a computer-readable storage medium that can overcome one or more of the drawbacks of the prior art.

According to one aspect of the present disclosure, there is provided an information processing device including: a sample generation unit configured to generate a first simulated sample set for a first task using a sample generation network obtained based on a second training sample set for a second task; and a first model training unit configured to retrain, under a first predetermined condition, a first model pre-trained for the first task using the second training sample set, to obtain a retrained first model for predicting an object to be predicted. The first predetermined condition includes: the difference between the prediction result of the pre-trained first model on the first simulated sample set and the prediction result of the retrained first model on the first simulated sample set is within a predetermined range. The first task is different from the second task. The object to be predicted relates to the first task and/or the second task.

According to another aspect of the present disclosure, there is provided an information processing method including: generating a first simulated sample set for a first task using a sample generation network obtained based on a second training sample set for a second task; and retraining, under a first predetermined condition, a first model pre-trained for the first task using the second training sample set, to obtain a retrained first model for predicting an object to be predicted. The first predetermined condition includes: the difference between the prediction result of the pre-trained first model on the first simulated sample set and the prediction result of the retrained first model on the first simulated sample set is within a predetermined range. The first task is different from the second task. The object to be predicted relates to the first task and/or the second task.

According to other aspects of the present disclosure, computer program code and a computer program product for implementing the above method according to the present disclosure are also provided, as well as a computer-readable storage medium on which the computer program code for implementing the method is recorded.

Other aspects of embodiments of the present disclosure are given in the following description, in which the detailed description serves to fully disclose preferred embodiments without imposing limitations on them.

Brief Description of the Drawings

The present disclosure may be better understood by referring to the detailed description given below in conjunction with the accompanying drawings, in which the same or similar reference numerals are used throughout to denote the same or similar components. The accompanying drawings, together with the following detailed description, are included in and form a part of this specification, and serve to further illustrate preferred embodiments of the present disclosure and to explain its principles and advantages. In the drawings:

FIG. 1 is a block diagram showing a functional configuration example of an information processing device according to the first embodiment of the present disclosure.

FIG. 2 is a block diagram showing a functional configuration example of an information processing device according to the second embodiment of the present disclosure.

FIG. 3 is a schematic diagram showing an adjustment process performed by a sample generation network acquisition unit according to the second embodiment of the present disclosure.

FIG. 4 shows an example of a second training sample and examples of simulated samples.

FIG. 5 is a block diagram showing a functional configuration example of an information processing device according to the third embodiment of the present disclosure.

FIG. 6 is a schematic diagram showing a training process performed by a sample generation network acquisition unit according to the third embodiment of the present disclosure.

FIG. 7 shows an example of a second training sample and further examples of simulated samples.

FIG. 8 is a diagram showing a comparison between the technique according to the present disclosure and a related technique.

FIG. 9 is a flowchart showing an example of the flow of an information processing method according to the fourth embodiment of the present disclosure.

FIG. 10 is a block diagram showing an example structure of a personal computer that can be employed in embodiments of the present disclosure.

Detailed Description

Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings. For the sake of clarity and conciseness, not all features of an actual implementation are described in this specification. It should be understood, however, that many implementation-specific decisions must be made in developing any such actual embodiment in order to achieve the developer's specific goals, for example, compliance with system- and business-related constraints, which may vary from one implementation to another. Moreover, it should be understood that, although development work may be very complex and time-consuming, it is merely a routine task for those skilled in the art having the benefit of this disclosure.

It should also be noted that, in order to avoid obscuring the present disclosure with unnecessary detail, only the device structures and/or processing steps closely related to the solution according to the present disclosure are shown in the drawings, while other details of little relevance to the present disclosure are omitted.

Embodiments of the present disclosure are described in detail below with reference to the accompanying drawings.

<First Embodiment>

FIG. 1 is a block diagram showing a functional configuration example of an information processing device 100 according to the first embodiment of the present disclosure. As shown in FIG. 1, the information processing device 100 according to the first embodiment may include a sample generation unit 102 and a first model training unit 104.

The sample generation unit 102 may be configured to generate a first simulated sample set for the first task using a sample generation network obtained based on a second training sample set for the second task.

The first model training unit 104 may be configured to retrain, under a first predetermined condition, the first model pre-trained for the first task using the second training sample set, to obtain a retrained first model for predicting an object to be predicted. For example, the first predetermined condition may include: the difference between the prediction result of the pre-trained first model on the first simulated sample set and the prediction result of the retrained first model on the first simulated sample set is within a predetermined range. The predetermined range may be obtained through a limited number of experiments or set empirically.

For example, the second training sample set may be a set of training images, a set of training speech, or the like. Correspondingly, the first simulated sample set generated based on the second training sample set may be a set of images, speech, or the like, and the object to be predicted may be an image, speech, or the like. For instance, an image may include a facial image, an image of an animal, an image of a plant, an image of an inanimate object, and so on.

The first task may be different from the second task, and the object to be predicted may relate to the first task and/or the second task. For example, the first task may include predicting objects of a first type set, and the second task may include predicting objects of a second type set. Correspondingly, the object to be predicted may include objects of the first type set and/or objects of the second type set. The first type set may include one or more types, and the second type set may include one or more types that differ from every type in the first type set. For example, in the case of digit recognition, the objects of the first type set may include the digits "0" and "1", and the objects of the second type set may include the digits "2" and "3", but the disclosure is not limited thereto.

Herein, a "sample set for a specific task" may refer to a sample set used to train a predetermined model for that task. For example, where the specific task is image recognition of a certain type of object, the sample set for that task may be a collection of images containing objects of that type. A "simulated sample set" may refer to a sample set obtained by a predetermined algorithm rather than by direct sample collection. In addition, herein, "prediction" may include recognition, classification, and the like.

In the related art, when a pre-trained model is further trained using a new training sample set, the resulting retrained model may lose the performance of the pre-trained model. Moreover, since the original sample set used to obtain the pre-trained model contains a large number of samples, acquiring a large number of original samples is costly, so they cannot be used at low cost to constrain the prediction performance of the retrained model. In some cases, the original samples are not even accessible, and therefore cannot be used to constrain the prediction performance of the retrained model at all.

As described above, the information processing device 100 according to the first embodiment of the present disclosure generates a first simulated sample set for the first task using a sample generation network obtained based on the second training sample set for the second task. Therefore, during retraining of the pre-trained first model, the first simulated sample set can be used to constrain the prediction performance of the pre-trained first model, so that this performance is well preserved, yielding a retrained first model with good prediction performance on both the first task and the second task. In addition, since obtaining the first simulated sample set costs much less than acquiring a large number of original samples, cost can be reduced. Furthermore, the first simulated sample set can also be used to determine whether, and to what extent, the retrained first model has forgotten the first task.

For example, when the original sample set is accessible, a small number of original samples may also be acquired, and during retraining of the pre-trained first model, the acquired original samples together with the first simulated sample set may be used to constrain the prediction performance of the pre-trained first model.

As an example, the difference between the prediction result of the pre-trained first model on the first simulated sample set (hereinafter the "first prediction result") and the prediction result of the retrained first model on the first simulated sample set (hereinafter the "second prediction result") may be obtained by direct comparison. For example, where the first task is object recognition, the difference between the first and second prediction results may be computed as the ratio M1/N, where M1 (a natural number greater than 0) is the number of simulated samples in the first simulated sample set that the pre-trained first model and the retrained first model recognize as different types of objects, and N (a natural number greater than 1) is the total number of samples in the first simulated sample set.
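As a minimal illustrative sketch (not part of the original disclosure), the ratio M1/N can be computed from two arrays of predicted labels; the labels below are hypothetical:

```python
import numpy as np

def disagreement_ratio(preds_pretrained, preds_retrained):
    """M1/N: fraction of simulated samples that the pre-trained first model
    and the retrained first model recognize as different types of objects."""
    preds_pretrained = np.asarray(preds_pretrained)
    preds_retrained = np.asarray(preds_retrained)
    return float(np.mean(preds_pretrained != preds_retrained))

# Hypothetical predicted labels on a simulated sample set with N = 5 samples;
# the two models disagree on 2 of them, so M1/N = 0.4.
ratio = disagreement_ratio([0, 1, 0, 1, 1], [0, 1, 1, 1, 0])
```

If this ratio falls within the predetermined range, the first predetermined condition would be considered satisfied under this reading.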

As another example, the difference between the prediction result of the pre-trained first model on the first simulated sample set and that of the retrained first model may be characterized by the output vector of the loss layer included in the pre-trained first model. In this way, the difference between the first and second prediction results can be characterized more accurately, so that the prediction performance of the pre-trained first model on the first task can be better preserved.

For example, the difference between the first and second prediction results may be computed as the ratio M2/N, where M2 (a natural number greater than 0) is the number of simulated samples in the first simulated sample set for which the difference between the output vector for that sample obtained through the loss layer of the pre-trained first model (hereinafter the "first output vector") and the output vector for that sample obtained through the loss layer of the retrained first model (hereinafter the "second output vector") is greater than or equal to a first predetermined threshold, and N is the total number of samples in the first simulated sample set. In this case, the predetermined range may be 0, for example, but it is not limited thereto and may be set according to actual needs. The difference between the first and second output vectors may be expressed, for example, by the Euclidean distance between them. The first predetermined threshold may be obtained through a limited number of experiments or set empirically.
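The vector-based variant M2/N can be sketched similarly (illustrative Python only; the loss-layer output vectors and the threshold below are hypothetical):

```python
import numpy as np

def vector_disagreement_ratio(out_pretrained, out_retrained, threshold):
    """M2/N: fraction of simulated samples whose loss-layer output vectors
    from the pre-trained and retrained first models differ by at least the
    first predetermined threshold, measured by Euclidean distance."""
    diff = np.asarray(out_pretrained) - np.asarray(out_retrained)
    distances = np.linalg.norm(diff, axis=1)
    return float(np.mean(distances >= threshold))

# Hypothetical loss-layer outputs for 3 simulated samples and 2 classes;
# only the third sample's vectors differ by >= 0.3.
a = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
b = [[0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]
r = vector_disagreement_ratio(a, b, threshold=0.3)
```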

For example, where the difference between the first and second prediction results is characterized by the output vector of the loss layer of the pre-trained first model, the pre-trained first model may be retrained by adding a loss function LD for the above difference to the loss function L corresponding to the pre-trained first model, so as to obtain a retrained first model satisfying the first predetermined condition. For example, the total loss function L' with the added loss function LD may be expressed by the following formula (1):

L' = L + ω * λ * LD    (1)

In formula (1), ω indicates the type of the sample input to the first model during retraining: ω equals 0 when the input sample is a training sample from the second training sample set, and ω equals 1 when the input sample is a simulated sample from the first simulated sample set. λ > 0 is a weight; the larger λ is, the closer the prediction performance of the retrained first model on the first task is expected to be to that of the pre-trained first model.

For example, the loss function LD for the above difference may be expressed by the following formula (2):

LD = ||A1 - A2||    (2)

In formula (2), A1 denotes the output vector of the loss layer of the first model during retraining, A2 denotes the output vector of the loss layer of the pre-trained first model, and ||·|| denotes the Euclidean norm.
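The combined objective of formula (1) can be sketched as follows (illustrative Python only; the use of the Euclidean distance between the loss-layer outputs for LD follows the Euclidean-distance formulation mentioned above, and all inputs are hypothetical):

```python
import numpy as np

def distillation_term(a1, a2):
    """LD: Euclidean distance between the loss-layer output of the first
    model during retraining (A1) and of the pre-trained first model (A2)."""
    return float(np.linalg.norm(np.asarray(a1) - np.asarray(a2)))

def total_loss(task_loss, a1, a2, is_simulated, lam=1.0):
    """L' = L + omega * lambda * LD; omega is 1 only for simulated samples."""
    omega = 1.0 if is_simulated else 0.0
    return task_loss + omega * lam * distillation_term(a1, a2)

# For a training sample from the second training sample set, omega = 0, so L' = L:
l_plain = total_loss(0.7, [1.0, 0.0], [0.0, 1.0], is_simulated=False)
# For a simulated sample from the first simulated sample set, omega = 1:
l_sim = total_loss(0.5, [3.0, 0.0], [0.0, 4.0], is_simulated=True, lam=2.0)
```

In an actual training loop, LD would be evaluated per batch and backpropagated together with the task loss; the scalar form here is only meant to show how ω gates the constraint.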

For example, the pre-trained first model may be a neural network model such as a convolutional neural network (CNN) model, but it is not limited thereto; those skilled in the art may use a suitable model according to actual needs.

<Second Embodiment>

FIG. 2 is a block diagram showing a functional configuration example of an information processing device 200 according to the second embodiment of the present disclosure. As shown in FIG. 2, the information processing device 200 may include a sample generation unit 202, a first model training unit 204, and a sample generation network acquisition unit 206. The sample generation unit 202 and the first model training unit 204 may have functional configurations similar to those of the sample generation unit 102 and the first model training unit 104 in the first embodiment, so their description is omitted or simplified below.

The sample generation network acquisition unit 206 may be configured to train an initial Generative Adversarial Network (GAN) based on the second training sample set for the second task to obtain a trained GAN. For example, when a random vector is used as input, the corresponding discriminator cannot distinguish the simulated samples output by the trained GAN from the training samples in the second training sample set.

After the trained GAN is obtained, the sample generation network acquisition unit 206 may adjust the trained GAN so as to satisfy a second predetermined condition, thereby obtaining the adjusted GAN as the sample generation network.

The second predetermined condition may include: when a random vector is used as input to the adjusted GAN, the prediction results of the pre-trained first model on the second simulated sample set output by the adjusted GAN relate to the first task. For example, where the first task includes predicting objects of the first type set, this may mean that the pre-trained first model predicts each object in every simulated sample of the second simulated sample set as an object of the first type set. For instance, where the first task includes recognizing the digits "0" and "1", it may mean that the pre-trained first model predicts each object in every simulated sample of the second simulated sample set as either "0" or "1". Preferably, the number of objects predicted as "0" and the number predicted as "1" are close.
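A rough check of this condition can be sketched as follows (illustrative Python only; the balance tolerance is an assumption, since the text only says the class counts are preferably close):

```python
from collections import Counter

def satisfies_second_condition(predicted_labels, first_task_classes, balance_tol=0.2):
    """Check that every generated sample is predicted as a first-task class,
    and that the predicted class counts are roughly balanced.
    balance_tol is a hypothetical tolerance, as a fraction of the set size."""
    labels = list(predicted_labels)
    if any(lbl not in first_task_classes for lbl in labels):
        return False
    counts = Counter(labels)
    expected = len(labels) / len(first_task_classes)
    return all(abs(counts.get(c, 0) - expected) <= balance_tol * len(labels)
               for c in first_task_classes)

# Digits "0" and "1" form the first type set, as in the example above:
ok = satisfies_second_condition([0, 1, 0, 1, 1, 0], first_task_classes={0, 1})
bad = satisfies_second_condition([0, 1, 2, 1], first_task_classes={0, 1})
```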

As an example, an expected prediction result of the pre-trained first model may be set for each simulated sample in the second simulated sample set. In this case, the second predetermined condition may further include: for each simulated sample in the second simulated sample set, the prediction result of the pre-trained first model on that sample is the same as the expected prediction result. For example, where the first task includes recognizing the digits "0" and "1", the expected prediction results of some simulated samples in the second simulated sample set may be set in advance to "0", and those of the other simulated samples to "1".

As another example, an expected output vector of the loss layer may be set for each simulated sample in the second simulated sample set. In this case, the second predetermined condition may further include: for each simulated sample in the second simulated sample set, when that sample is used as input to the pre-trained first model, the difference between the output vector of the loss layer and the expected output vector of the loss layer is less than or equal to a second predetermined threshold. The second predetermined threshold may be obtained through a limited number of experiments or set empirically.

As shown in FIG. 3, a random vector may be input to the trained GAN to obtain a simulated sample, the obtained simulated sample may be input to the pre-trained first model to obtain its prediction result for that sample, and the trained GAN may be adjusted based on the prediction result.

The sample generation unit 202 may generate the first simulated sample set for the first task using the sample generation network obtained by the sample generation network acquisition unit 206. During retraining of the pre-trained first model, the first model training unit 204 may use the first simulated sample set generated by the sample generation unit 202 to constrain the prediction performance of the pre-trained first model, so that this performance can be better preserved. In addition, since a large number of original samples used to train the pre-trained first model need not be acquired, cost can be reduced.

FIG. 4 shows an example of a second training sample and examples of simulated samples obtained by the sample generation unit 202, for the case where the first task includes recognizing the digits "0" and "1" and the second task includes recognizing the digits "2" and "3". As can be seen from FIG. 4, the simulated samples include the digits "0" and "1". Thus, the simulated samples obtained by the sample generation unit 202 simulate the training samples for the first task well.

作为示例，通过如下方式来获得预先训练的第一模型对第二模拟样本集中的每个模拟样本的预测结果，并且将所获得的第二模拟样本集中的各个模拟样本的预测结果的集合用作预先训练的第一模型对第二模拟样本集的预测结果：对该模拟样本添加噪声(例如，高斯噪声)，并且获得预先训练的第一模型对添加有噪声的模拟样本的预测结果作为预先训练的第一模型对该模拟样本的预测结果。利用通过这种方式所获得的样本生成网络可以获得更好地对针对第一任务的训练样本进行模拟的模拟样本集，由此可以更好地保留针对第一任务的预测性能。As an example, the prediction result of the pre-trained first model for each simulated sample in the second simulated sample set is obtained in the following manner, and the set of the obtained prediction results for the individual simulated samples in the second simulated sample set is used as the prediction result of the pre-trained first model for the second simulated sample set: noise (for example, Gaussian noise) is added to the simulated sample, and the prediction result of the pre-trained first model for the noise-added simulated sample is obtained as the prediction result of the pre-trained first model for that simulated sample. With the sample generation network obtained in this way, a simulated sample set that better simulates the training samples for the first task can be obtained, thereby better retaining the prediction performance for the first task.

作为另外的示例，通过如下方式来获得预先训练的第一模型对第二模拟样本集中的每个模拟样本的预测结果，并且将所获得的第二模拟样本集中的各个模拟样本的预测结果的集合用作预先训练的第一模型对第二模拟样本集的预测结果：对利用预先训练的第一模型所包括的特征提取层提取的、该模拟样本的特征添加噪声(例如，高斯噪声)，并且获得基于添加有噪声的特征的预测结果作为预先训练的第一模型对该模拟样本的预测结果。利用通过这种方式所获得的样本生成网络可以获得更好地对针对第一任务的训练样本进行模拟的模拟样本集，由此可以更好地保留针对第一任务的预测性能。As another example, the prediction result of the pre-trained first model for each simulated sample in the second simulated sample set is obtained in the following manner, and the set of the obtained prediction results for the individual simulated samples in the second simulated sample set is used as the prediction result of the pre-trained first model for the second simulated sample set: noise (for example, Gaussian noise) is added to the features of the simulated sample extracted by the feature extraction layer included in the pre-trained first model, and the prediction result based on the noise-added features is obtained as the prediction result of the pre-trained first model for that simulated sample. With the sample generation network obtained in this way, a simulated sample set that better simulates the training samples for the first task can be obtained, thereby better retaining the prediction performance for the first task.
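The two noise-injection variants above (noise on the simulated sample itself, and noise on the extracted features) can be sketched as follows. The linear feature extraction layer and prediction layer are hypothetical stand-ins for the pre-trained first model; the dimensions and noise scale are illustrative.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical stand-in for the pre-trained first model, split into a
# feature extraction layer and a prediction (loss) layer, both linear here.
W_feat = rng.normal(size=(4, 3))
W_pred = rng.normal(size=(3, 2))

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

def predict_with_sample_noise(x, sigma=0.1):
    """Variant 1: Gaussian noise is added to the simulated sample itself."""
    x_noisy = x + rng.normal(scale=sigma, size=x.shape)
    return softmax((x_noisy @ W_feat) @ W_pred)

def predict_with_feature_noise(x, sigma=0.1):
    """Variant 2: Gaussian noise is added to the extracted features."""
    features = x @ W_feat  # output of the feature extraction layer
    features_noisy = features + rng.normal(scale=sigma, size=features.shape)
    return softmax(features_noisy @ W_pred)

# One simulated sample from the second simulated sample set (hypothetical data).
sample = rng.normal(size=4)
p_sample_noise = predict_with_sample_noise(sample)
p_feature_noise = predict_with_feature_noise(sample)
```

Either variant yields a valid probability vector for each simulated sample; collecting these vectors over the whole second simulated sample set gives the prediction result used when training or adjusting the sample generation network.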

<第三实施例><Third Embodiment>

图5是示出根据本公开的第三实施例的信息处理装置500的功能配置示例的框图。如图5所示，根据本公开的第三实施例的信息处理装置500可以包括样本生成单元502、第一模型训练单元504和样本生成网络获取单元506。例如，样本生成单元502和第一模型训练单元504可以与上述第一实施例中的样本生成单元102和第一模型训练单元104具有类似的功能配置，下面将省略或简要描述。FIG. 5 is a block diagram showing an example of the functional configuration of an information processing device 500 according to the third embodiment of the present disclosure. As shown in FIG. 5, the information processing device 500 according to the third embodiment of the present disclosure may include a sample generation unit 502, a first model training unit 504, and a sample generation network acquisition unit 506. For example, the sample generation unit 502 and the first model training unit 504 may have functional configurations similar to those of the sample generation unit 102 and the first model training unit 104 in the first embodiment described above, and their description will be omitted or only given briefly below.

样本生成网络获取单元506可以被配置成:基于针对第二任务的第二训练样本集对包括一个或更多个编码层的初始生成网络进行训练,以达第三预定条件,从而获得包括一个或更多个编码层的经训练的生成网络作为样本生成网络。第三预定条件可以包括:在利用第二训练样本集作为经训练的生成网络的输入的情况下,预先训练的第一模型对作为经训练的生成网络的输出的第二模拟样本集的预测结果涉及针对第一任务的预测结果。The sample generation network acquisition unit 506 may be configured to: train the initial generation network including one or more coding layers based on the second training sample set for the second task to achieve a third predetermined condition, thereby obtaining a trained generation network including one or more coding layers as the sample generation network. The third predetermined condition may include: when the second training sample set is used as the input of the trained generation network, the prediction result of the pre-trained first model on the second simulated sample set as the output of the trained generation network involves the prediction result for the first task.

样本生成单元502可以利用样本生成网络获取单元506所获得的样本生成网络,基于第二训练样本集而生成针对第一任务的第一模拟样本集。第一模型训练单元504在对预先训练的第一模型进行再训练的过程中可以利用样本生成单元502生成的第一模拟样本集对预先训练的第一模型的预测性能进行约束,从而可以更好地保留预先训练的第一模型的预测性能。另外,由于不需要获取大量的用于对预先训练的第一模型进行训练的原样本,因此可以降低成本。The sample generation unit 502 can use the sample generation network obtained by the sample generation network acquisition unit 506 to generate a first simulation sample set for the first task based on the second training sample set. The first model training unit 504 can use the first simulation sample set generated by the sample generation unit 502 to constrain the prediction performance of the pre-trained first model during the retraining of the pre-trained first model, so as to better retain the prediction performance of the pre-trained first model. In addition, since it is not necessary to obtain a large number of original samples for training the pre-trained first model, the cost can be reduced.

作为示例,可以设置期望的预先训练的第一模型对第二模拟样本集中的每个模拟样本的预测结果。在这种情况下,第三预定条件还可以包括:对于第二模拟样本集中的每个模拟样本,预先训练的第一模型对该模拟样本的预测结果与期望的预测结果相同。As an example, the expected prediction result of the pre-trained first model for each simulated sample in the second simulated sample set can be set. In this case, the third predetermined condition can also include: for each simulated sample in the second simulated sample set, the prediction result of the pre-trained first model for the simulated sample is the same as the expected prediction result.

作为另外的示例，可以针对第二模拟样本集中的每个模拟样本设置所期望的损失层的输出向量。在这种情况下，第三预定条件还可以包括：对于第二模拟样本集中的每个模拟样本，在该模拟样本用作预先训练的第一模型的输入的情况下，损失层的输出向量与所期望的损失层的输出向量之间的差异小于或等于第三预定阈值。第三预定阈值可以通过有限次实验获得或根据经验设置。As another example, a desired output vector of the loss layer may be set for each simulated sample in the second simulated sample set. In this case, the third predetermined condition may further include: for each simulated sample in the second simulated sample set, when the simulated sample is used as the input of the pre-trained first model, the difference between the actual output vector of the loss layer and the desired output vector of the loss layer is less than or equal to a third predetermined threshold. The third predetermined threshold may be obtained through a limited number of experiments or set empirically.

如图6所示,在生成网络的训练过程中,可以将第二训练样本输入生成网络以获得模拟样本,然后将所获得的模拟样本输入预先训练的第一模型以获得预先训练的第一模型对该模拟样本的实际预测结果(例如,损失层的输出向量),并且基于实际预测结果与期望预测结果(例如,所期望的损失层的输出向量)之间的差异进行梯度反向传播,以对生成网络进行训练。As shown in Figure 6, during the training process of the generative network, the second training sample can be input into the generative network to obtain a simulated sample, and then the obtained simulated sample can be input into the pre-trained first model to obtain the actual prediction result of the pre-trained first model for the simulated sample (for example, the output vector of the loss layer), and gradient back propagation is performed based on the difference between the actual prediction result and the expected prediction result (for example, the expected output vector of the loss layer) to train the generative network.
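The training loop just described (second training sample in, simulated sample out, gradient back-propagated from the difference between the actual and expected loss-layer outputs through the frozen first model) can be sketched with hypothetical linear stand-ins for both networks; the shapes, learning rate, and target vector below are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical frozen pre-trained first model: simulated sample -> loss-layer output.
W_model = rng.normal(size=(4, 2)) * 0.5

# Generation network to be trained: second training sample -> simulated sample.
A = rng.normal(size=(4, 4)) * 0.1

x = rng.normal(size=(1, 4))            # a second training sample
expected = np.array([[1.0, 0.0]])      # expected output vector of the loss layer

def residual(A):
    actual = (x @ A) @ W_model         # actual loss-layer output for the simulated sample
    return float(np.abs(actual - expected).max())

res_before = residual(A)
lr = 0.01
for _ in range(3000):
    diff = (x @ A) @ W_model - expected
    # Gradient back-propagation of ||actual - expected||^2 through the frozen
    # first model into the generation network's weights A.
    grad_A = 2.0 * x.T @ (diff @ W_model.T)
    A -= lr * grad_A
res_after = residual(A)  # the residual typically shrinks toward zero
```

Only `A` is updated; the first model's weights stay fixed, which is exactly why the trained generation network learns to emit samples that the first model maps to the expected loss-layer output.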

如图6所示,在一些示例中,初始生成网络还可以包括一个或更多个解码层。最后一个编码层的输出可以用作第一个解码层的输入,并且最后一个解码层的输出可以用作经训练的生成网络的输出。As shown in Figure 6, in some examples, the initial generation network may further include one or more decoding layers. The output of the last encoding layer may be used as the input of the first decoding layer, and the output of the last decoding layer may be used as the output of the trained generation network.

例如,编码层可以用于对输入进行降维,而解码层可以用于对输出进行升维。For example, the encoding layer can be used to reduce the dimensionality of the input, while the decoding layer can be used to increase the dimensionality of the output.

例如,每一个编码层可以是包括K1*K1个神经元的全连接层,每一个解码层可以是包括K2*K2个神经元的全连接层,其中,K1和K2各自是大于1的自然数。在一些示例中,K1可以等于K2。For example, each encoding layer may be a fully connected layer including K1*K1 neurons, and each decoding layer may be a fully connected layer including K2*K2 neurons, where K1 and K2 are each a natural number greater than 1. In some examples, K1 may be equal to K2.
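A forward pass through such a generation network can be sketched as below. This follows the K1*K1 = 28*28 example: two encoding layers and one decoding layer (their total is 3, as in the illustrated example), all fully connected; the weights are random placeholders, since an actual network would be trained as described above. Equal layer widths are used here for simplicity, even though in general the encoding layers may reduce and the decoding layers restore dimensionality.

```python
import numpy as np

rng = np.random.default_rng(7)

K1, K2 = 28, 28  # as in the illustrated example, K1 == K2

def fc_layer(n_in, n_out):
    """Weight matrix of a fully connected layer (random, untrained)."""
    return rng.normal(scale=0.05, size=(n_in, n_out))

enc1 = fc_layer(K1 * K1, K1 * K1)  # encoding layers of K1*K1 neurons each
enc2 = fc_layer(K1 * K1, K1 * K1)
dec1 = fc_layer(K1 * K1, K2 * K2)  # decoding layer of K2*K2 neurons

def generate(x):
    h = np.tanh(x @ enc1)
    h = np.tanh(h @ enc2)      # output of the last encoding layer ...
    return np.tanh(h @ dec1)   # ... feeds the decoding layer, whose output is the simulated sample

second_training_sample = rng.normal(size=K1 * K1)  # e.g. a flattened 28x28 digit image
simulated_sample = generate(second_training_sample)
```

The simulated sample has the same 784-dimensional shape as the flattened input, matching the digit-image setting of FIG. 7.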

图7示出在第一任务包括对数字“0”和“1”进行识别并且第二任务包括对数字“2”和“3”进行识别的情况下，第二训练样本的示例和样本生成单元502基于该第二训练样本所获得的模拟样本的示例。在图7所示的示例中，K1和K2均被设置为28，并且编码层的数目和解码层的数目之和为3。FIG. 7 shows an example of a second training sample and an example of a simulated sample obtained by the sample generation unit 502 based on that second training sample, in the case where the first task includes recognizing the digits "0" and "1" and the second task includes recognizing the digits "2" and "3". In the example shown in FIG. 7, K1 and K2 are both set to 28, and the sum of the number of encoding layers and the number of decoding layers is 3.

从图7可以看出,在对初始生成网络进行训练的过程中经过10000次及更多次迭代时,所获得的模拟样本可以很好地对针对第一任务的训练样本进行模拟。As can be seen from FIG7 , when the initial generation network is trained for 10,000 or more iterations, the obtained simulated samples can well simulate the training samples for the first task.

与第二实施例中的样本生成网络获取单元206类似地,样本生成网络获取单元506在对初始生成网络进行训练的过程中,在获取预先训练的第一模型对第二模拟样本集的预测结果时,可以对第二模拟样本集中的各个模拟样本或者各个模拟样本的特征添加噪声。通过这种方式,样本生成网络获取单元506所获得的样本生成网络可以获得更好地对针对第一任务的训练样本进行模拟的模拟样本集,由此可以更好地保留针对第一任务的预测性能。Similar to the sample generation network acquisition unit 206 in the second embodiment, the sample generation network acquisition unit 506 can add noise to each simulated sample or the features of each simulated sample in the second simulated sample set when acquiring the prediction result of the pre-trained first model for the second simulated sample set during the process of training the initial generation network. In this way, the sample generation network obtained by the sample generation network acquisition unit 506 can obtain a simulated sample set that better simulates the training samples for the first task, thereby better retaining the prediction performance for the first task.

图8是示出根据本公开的第三实施例的信息处理装置500与相关技术的比较的图。从图8可以看出，对于GEM（例如，参见Lopez-Paz D, Ranzato M A. Gradient episodic memory for continual learning [J]. Advances in Neural Information Processing Systems, 2017, 30）、A-GEM（例如，参见Arslan Chaudhry, Marc’Aurelio Ranzato, Marcus Rohrbach, and Mohamed Elhoseiny. Efficient lifelong learning with A-GEM. In International Conference on Learning Representations, 2019）和EWC（例如，参见Kirkpatrick J, Pascanu R, Rabinowitz N, et al. Overcoming catastrophic forgetting in neural networks [J]. Proceedings of the National Academy of Sciences, 2017, 114(13): 3521-3526），通过模拟样本集的引入，在类别递增的情况下的平均精度分别提高23.11%、25.60%和25.84%，在任务递增的情况下的平均精度分别提高1.08%、2.85%和18.62%。FIG. 8 is a diagram showing a comparison between the information processing device 500 according to the third embodiment of the present disclosure and related art. As can be seen from FIG. 8, for GEM (see, e.g., Lopez-Paz D, Ranzato M A. Gradient episodic memory for continual learning [J]. Advances in Neural Information Processing Systems, 2017, 30), A-GEM (see, e.g., Arslan Chaudhry, Marc’Aurelio Ranzato, Marcus Rohrbach, and Mohamed Elhoseiny. Efficient lifelong learning with A-GEM. In International Conference on Learning Representations, 2019) and EWC (see, e.g., Kirkpatrick J, Pascanu R, Rabinowitz N, et al. Overcoming catastrophic forgetting in neural networks [J]. Proceedings of the National Academy of Sciences, 2017, 114(13): 3521-3526), introducing the simulated sample set improves the average accuracy in the class-incremental case by 23.11%, 25.60% and 25.84%, respectively, and in the task-incremental case by 1.08%, 2.85% and 18.62%, respectively.

注意,虽然描述了根据本公开的实施例的信息处理装置的应用示例,但是该信息处理装置的应用不限于此,并且本领域技术人员可以根据实际需要将该信息处理装置应用于各种方面,这里将不再赘述。Note that although an application example of the information processing device according to an embodiment of the present disclosure is described, the application of the information processing device is not limited to this, and those skilled in the art can apply the information processing device to various aspects according to actual needs, which will not be repeated here.

<第四实施例><Fourth Embodiment>

与上述信息处理装置实施例相对应的,本公开还提供了以下信息处理方法的实施例。Corresponding to the above-mentioned information processing device embodiments, the present disclosure also provides the following information processing method embodiments.

图9是示出根据本公开的第四实施例的信息处理方法900的流程实例的流程图。如图9所示,根据本公开的第四实施例的信息处理方法900可以开始于开始步骤S902,结束于结束步骤S908,并且包括样本生成步骤S904和第一模型训练步骤S906。Fig. 9 is a flow chart showing a process example of an information processing method 900 according to the fourth embodiment of the present disclosure. As shown in Fig. 9, the information processing method 900 according to the fourth embodiment of the present disclosure may start at a start step S902, end at an end step S908, and include a sample generation step S904 and a first model training step S906.

在样本生成步骤S904中,可以利用基于针对第二任务的第二训练样本集而获得的样本生成网络,生成针对第一任务的第一模拟样本集。例如,样本生成步骤S904可以由上文描述的样本生成单元102、202和502来执行,因此具体细节可参见上文的描述,下面仅进行简要描述。In the sample generation step S904, a sample generation network obtained based on the second training sample set for the second task may be used to generate a first simulated sample set for the first task. For example, the sample generation step S904 may be performed by the sample generation units 102, 202, and 502 described above, so the specific details may refer to the above description, and only a brief description is given below.

在第一模型训练步骤S906中,可以在第一预定条件下,利用第二训练样本集对针对第一任务预先训练的第一模型进行再训练,以获得用于对待预测对象进行预测的再训练的第一模型。例如,第一预定条件可以包括:预先训练的第一模型对第一模拟样本集的预测结果与再训练的第一模型对第一模拟样本集的预测结果之间的差异在预定范围内。通过这种方式,可以在对预先训练的第一模型进行再训练的过程中、利用第一模拟样本集对预先训练的第一模型的预测性能进行约束,以很好地保留预先训练的第一模型的预测性能,从而获得对第一任务和第二任务均有良好的预测性能的再训练的第一模型。In the first model training step S906, the first model pre-trained for the first task can be retrained using the second training sample set under the first predetermined condition to obtain a retrained first model for predicting the object to be predicted. For example, the first predetermined condition may include: the difference between the prediction result of the pre-trained first model for the first simulation sample set and the prediction result of the retrained first model for the first simulation sample set is within a predetermined range. In this way, in the process of retraining the pre-trained first model, the prediction performance of the pre-trained first model can be constrained using the first simulation sample set to well retain the prediction performance of the pre-trained first model, thereby obtaining a retrained first model with good prediction performance for both the first task and the second task.
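The first predetermined condition can be sketched as a simple check over the first simulated sample set. The linear models and the 0.1 "predetermined range" below are hypothetical placeholders; in practice the retrained model results from the retraining of step S906 and the range is chosen per application.

```python
import numpy as np

rng = np.random.default_rng(3)

def softmax(v):
    e = np.exp(v - v.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical pre-trained and retrained first models (linear, for illustration);
# the retrained weights stay close to the pre-trained ones here.
W_pre = rng.normal(size=(4, 2))
W_re = W_pre + rng.normal(scale=0.01, size=(4, 2))

# Hypothetical first simulated sample set (5 samples, 4 features each).
first_simulated_sample_set = rng.normal(size=(5, 4))

def first_condition_met(W_pre, W_re, samples, predetermined_range=0.1):
    """First predetermined condition: the difference between the pre-trained
    and retrained models' predictions on the first simulated sample set
    stays within a predetermined range."""
    diff = softmax(samples @ W_pre) - softmax(samples @ W_re)
    return float(np.abs(diff).max()) <= predetermined_range

ok = first_condition_met(W_pre, W_re, first_simulated_sample_set)
```

In training, this check (or an equivalent penalty term on the prediction difference) constrains the retrained first model so that its behavior on the simulated first-task samples stays close to the pre-trained model's.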

例如,第一模型训练步骤S906可以由上文描述的第一模型训练单元104、204和504来执行,因此具体细节可参见上文的描述,下面仅进行简要描述。For example, the first model training step S906 can be performed by the first model training units 104, 204 and 504 described above, so the specific details can be found in the above description, and only a brief description is given below.

例如,预先训练的第一模型可以是神经网络模型,比如,卷积神经网络(CNN)模型,但不限于此,而是本领域技术人员可以根据实际需要使用合适的模型。For example, the pre-trained first model may be a neural network model, such as a convolutional neural network (CNN) model, but is not limited thereto, and those skilled in the art may use a suitable model according to actual needs.

例如,信息处理方法还可以包括样本生成网络获取步骤(未示出)。作为示例,在样本生成网络获取步骤中,可以基于第二训练样本集对初始的生成式对抗网络进行训练以获得经训练的生成式对抗网络,对经训练的生成式对抗网络进行调整,以达第二预定条件,从而获得经调整的生成式对抗网络作为样本生成网络。第二预定条件可以包括:在利用随机向量作为经调整的生成式对抗网络的输入的情况下,预先训练的第一模型对作为经调整的生成式对抗网络的输出的第二模拟样本集的预测结果涉及针对第一任务的预测结果。For example, the information processing method may further include a sample generation network acquisition step (not shown). As an example, in the sample generation network acquisition step, the initial generative adversarial network may be trained based on the second training sample set to obtain a trained generative adversarial network, and the trained generative adversarial network may be adjusted to reach a second predetermined condition, thereby obtaining an adjusted generative adversarial network as a sample generation network. The second predetermined condition may include: in the case of using a random vector as the input of the adjusted generative adversarial network, the prediction result of the pre-trained first model for the second simulated sample set as the output of the adjusted generative adversarial network involves the prediction result for the first task.

作为另外的示例,在样本生成网络获取步骤中,可以基于第二训练样本集对包括一个或更多个编码层的初始生成网络进行训练,以达第三预定条件,从而获得包括一个或更多个编码层的经训练的生成网络作为样本生成网络。第三预定条件可以包括:在利用第二训练样本集作为经训练的生成网络的输入的情况下,预先训练的第一模型对作为经训练的生成网络的输出的第二模拟样本集的预测结果涉及针对第一任务的预测结果。As another example, in the sample generation network acquisition step, the initial generation network including one or more coding layers can be trained based on the second training sample set to achieve a third predetermined condition, thereby obtaining a trained generation network including one or more coding layers as the sample generation network. The third predetermined condition may include: when the second training sample set is used as the input of the trained generation network, the prediction result of the pre-trained first model for the second simulated sample set as the output of the trained generation network involves the prediction result for the first task.

例如,样本生成网络获取步骤可以由上文描述的样本生成网络获取单元206和506来执行,因此具体细节可参见上文的描述,下面仅进行简要描述。For example, the sample generation network acquisition step may be performed by the sample generation network acquisition units 206 and 506 described above, so the specific details may refer to the above description, and only a brief description is given below.

作为示例,通过如下方式来获得预先训练的第一模型对第二模拟样本集的预测结果:对第二模拟样本集中的各个模拟样本添加噪声,并且获得所述预先训练的第一模型对添加有噪声的各个模拟样本的预测结果作为预先训练的第一模型对所述第二模拟样本集的预测结果。利用通过这种方式所获得的样本生成网络可以获得更好地对针对第一任务的训练样本进行模拟的模拟样本集,由此可以更好地保留预先训练的第一模型针对第一任务的预测性能。As an example, the prediction results of the pre-trained first model for the second simulated sample set are obtained in the following manner: noise is added to each simulated sample in the second simulated sample set, and the prediction results of the pre-trained first model for each simulated sample with added noise are obtained as the prediction results of the pre-trained first model for the second simulated sample set. The sample generation network obtained in this manner can be used to obtain a simulated sample set that better simulates the training samples for the first task, thereby better retaining the prediction performance of the pre-trained first model for the first task.

作为另外的示例,通过如下方式来获得预先训练的第一模型对所述第二模拟样本集的预测结果:对利用预先训练的第一模型所包括的特征提取层提取的、第二模拟样本集中的各个模拟样本的特征添加噪声,并且获得基于添加有噪声的特征的预测结果作为预先训练的第一模型对第二模拟样本集的预测结果。利用通过这种方式所获得的样本生成网络可以获得更好地对针对第一任务的训练样本进行模拟的模拟样本集,由此可以更好地保留预先训练的第一模型针对第一任务的预测性能。As another example, the prediction result of the pre-trained first model for the second simulated sample set is obtained in the following manner: adding noise to the features of each simulated sample in the second simulated sample set extracted by the feature extraction layer included in the pre-trained first model, and obtaining the prediction result based on the feature with added noise as the prediction result of the pre-trained first model for the second simulated sample set. The sample generation network obtained in this manner can be used to obtain a simulated sample set that better simulates the training samples for the first task, thereby better retaining the prediction performance of the pre-trained first model for the first task.

应指出,尽管以上描述了根据本公开的实施例的信息处理装置和信息处理方法的功能配置以及操作,但是这仅是示例而非限制,并且本领域技术人员可根据本公开的原理对以上实施例进行修改,例如可对各个实施例中的功能模块进行添加、删除或者组合等,并且这样的修改均落入本公开的范围内。It should be pointed out that although the functional configuration and operation of the information processing device and information processing method according to the embodiments of the present disclosure are described above, this is only an example and not a limitation, and those skilled in the art may modify the above embodiments according to the principles of the present disclosure, for example, the functional modules in each embodiment may be added, deleted or combined, and such modifications shall fall within the scope of the present disclosure.

此外,还应指出,这里的方法实施例是与上述装置实施例相对应的,因此在方法实施例中未详细描述的内容可参见装置实施例中相应部分的描述,在此不再重复描述。In addition, it should be pointed out that the method embodiment here corresponds to the above-mentioned device embodiment. Therefore, for the contents not described in detail in the method embodiment, please refer to the description of the corresponding parts in the device embodiment, and the description will not be repeated here.

应理解,根据本公开的实施例的存储介质和程序产品中的机器可执行的指令还可以被配置成执行上述信息处理方法,因此在此未详细描述的内容可参考先前相应部分的描述,在此不再重复进行描述。It should be understood that the machine-executable instructions in the storage medium and program products according to the embodiments of the present disclosure can also be configured to execute the above-mentioned information processing method. Therefore, the contents not described in detail here can refer to the description of the previous corresponding parts and will not be repeated here.

相应地,用于承载上述包括机器可执行的指令的程序产品的存储介质也包括在本发明的公开中。该存储介质包括但不限于软盘、光盘、磁光盘、存储卡、存储棒等等。Accordingly, the storage medium for carrying the program product including the machine executable instructions is also included in the disclosure of the present invention, including but not limited to a floppy disk, an optical disk, a magneto-optical disk, a memory card, a memory stick, and the like.

另外,还应该指出的是,上述系列处理和系统也可以通过软件和/或固件实现。在通过软件和/或固件实现的情况下,从存储介质或网络向具有专用硬件结构的计算机,例如图10所示的通用个人计算机1000安装构成该软件的程序,该计算机在安装有各种程序时,能够执行各种功能等等。In addition, it should be noted that the above series of processes and systems can also be implemented by software and/or firmware. In the case of implementation by software and/or firmware, the program constituting the software is installed from a storage medium or a network to a computer with a dedicated hardware structure, such as a general-purpose personal computer 1000 shown in FIG. 10 , and the computer can perform various functions, etc. when various programs are installed.

在图10中，中央处理单元(CPU)1001根据只读存储器(ROM)1002中存储的程序或从存储装置1008加载到随机存取存储器(RAM)1003的程序执行各种处理。在RAM 1003中，也根据需要存储当CPU 1001执行各种处理等时所需的数据。In FIG. 10, a central processing unit (CPU) 1001 executes various processes according to a program stored in a read-only memory (ROM) 1002 or a program loaded from a storage device 1008 into a random access memory (RAM) 1003. In the RAM 1003, data required when the CPU 1001 executes the various processes and the like is also stored as needed.

CPU 1001、ROM 1002和RAM 1003经由总线1004彼此连接。输入/输出接口1005也连接到总线1004。The CPU 1001, the ROM 1002, and the RAM 1003 are connected to each other via a bus 1004. To the bus 1004, an input/output interface 1005 is also connected.

下述部件连接到输入/输出接口1005:输入装置1006,包括键盘、鼠标等;输出装置1007,包括显示器,比如阴极射线管(CRT)、液晶显示器(LCD)等,和扬声器等;存储装置1008,包括硬盘等;和通信装置1009,包括网络接口卡比如LAN卡、调制解调器等。通信装置1009经由网络比如因特网执行通信处理。The following components are connected to the input/output interface 1005: an input device 1006 including a keyboard, a mouse, etc.; an output device 1007 including a display such as a cathode ray tube (CRT), a liquid crystal display (LCD), etc., and a speaker, etc.; a storage device 1008 including a hard disk, etc.; and a communication device 1009 including a network interface card such as a LAN card, a modem, etc. The communication device 1009 performs communication processing via a network such as the Internet.

根据需要,驱动器1010也连接到输入/输出接口1005。可拆卸介质1011比如磁盘、光盘、磁光盘、半导体存储器等等根据需要被安装在驱动器1010上,使得从中读出的计算机程序根据需要被安装到存储装置1008中。A drive 1010 is also connected to the input/output interface 1005 as needed. A removable medium 1011 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, etc. is mounted on the drive 1010 as needed so that a computer program read therefrom is installed into the storage device 1008 as needed.

在通过软件实现上述系列处理的情况下,从网络比如因特网或存储介质比如可拆卸介质1011安装构成软件的程序。In the case where the above-described series of processing is realized by software, a program constituting the software is installed from a network such as the Internet or a storage medium such as the removable medium 1011 .

本领域的技术人员应当理解，这种存储介质不局限于图10所示的其中存储有程序、与设备相分离地分发以向用户提供程序的可拆卸介质1011。可拆卸介质1011的例子包含磁盘(包含软盘(注册商标))、光盘(包含光盘只读存储器(CD-ROM)和数字通用盘(DVD))、磁光盘(包含迷你盘(MD)(注册商标))和半导体存储器。或者，存储介质可以是ROM 1002、存储装置1008中包含的硬盘等等，其中存有程序，并且与包含它们的设备一起被分发给用户。It should be understood by those skilled in the art that the storage medium is not limited to the removable medium 1011 shown in FIG. 10, in which the program is stored and which is distributed separately from the device in order to provide the program to the user. Examples of the removable medium 1011 include magnetic disks (including floppy disks (registered trademark)), optical disks (including compact disc read-only memory (CD-ROM) and digital versatile discs (DVD)), magneto-optical disks (including MiniDiscs (MD) (registered trademark)), and semiconductor memories. Alternatively, the storage medium may be the ROM 1002, a hard disk included in the storage device 1008, or the like, in which the program is stored and which is distributed to the user together with the device containing it.

以上参照附图描述了本公开的优选实施例,但是本公开当然不限于以上示例。本领域技术人员可在所附权利要求的范围内得到各种变更和修改,并且应理解这些变更和修改自然将落入本公开的技术范围内。The preferred embodiments of the present disclosure are described above with reference to the accompanying drawings, but the present disclosure is certainly not limited to the above examples. Those skilled in the art may obtain various changes and modifications within the scope of the appended claims, and it should be understood that these changes and modifications will naturally fall within the technical scope of the present disclosure.

例如,在以上实施例中包括在一个单元中的多个功能可以由分开的装置来实现。替选地,在以上实施例中由多个单元实现的多个功能可分别由分开的装置来实现。另外,以上功能之一可由多个单元来实现。无需说,这样的配置包括在本公开的技术范围内。For example, a plurality of functions included in one unit in the above embodiments may be implemented by separate devices. Alternatively, a plurality of functions implemented by a plurality of units in the above embodiments may be implemented by separate devices, respectively. In addition, one of the above functions may be implemented by a plurality of units. Needless to say, such a configuration is included in the technical scope of the present disclosure.

在该说明书中,流程图中所描述的步骤不仅包括以所述顺序按时间序列执行的处理,而且包括并行地或单独地而不是必须按时间序列执行的处理。此外,甚至在按时间序列处理的步骤中,无需说,也可以适当地改变该顺序。In this specification, the steps described in the flowchart include not only the processing performed in time series in the order described, but also the processing performed in parallel or individually rather than necessarily in time series. In addition, even in the steps processed in time series, it goes without saying that the order can be appropriately changed.

另外,根据本公开的技术还可以如下进行配置。Additionally, the technology according to the present disclosure may also be configured as follows.

附记1.一种信息处理装置,包括:Note 1. An information processing device, comprising:

样本生成单元,被配置成利用基于针对第二任务的第二训练样本集而获得的样本生成网络,生成针对第一任务的第一模拟样本集;以及a sample generating unit configured to generate a first simulated sample set for the first task using a sample generating network obtained based on a second training sample set for the second task; and

第一模型训练单元,被配置成在第一预定条件下,利用所述第二训练样本集对针对所述第一任务预先训练的第一模型进行再训练,以获得用于对待预测对象进行预测的再训练的第一模型,The first model training unit is configured to retrain the first model pre-trained for the first task using the second training sample set under a first predetermined condition to obtain a retrained first model for predicting the object to be predicted,

其中,所述第一预定条件包括:所述预先训练的第一模型对所述第一模拟样本集的预测结果与所述再训练的第一模型对所述第一模拟样本集的预测结果之间的差异在预定范围内,The first predetermined condition includes: the difference between the prediction result of the pre-trained first model on the first simulation sample set and the prediction result of the retrained first model on the first simulation sample set is within a predetermined range,

其中,所述第一任务不同于所述第二任务,wherein the first task is different from the second task,

其中,所述待预测对象涉及第一任务和/或第二任务。The object to be predicted involves the first task and/or the second task.

附记2.根据附记1所述的信息处理装置,还包括样本生成网络获取单元,被配置成:Note 2. The information processing device according to Note 1 further comprises a sample generation network acquisition unit configured to:

基于所述第二训练样本集对初始的生成式对抗网络进行训练以获得经训练的生成式对抗网络;以及Training an initial generative adversarial network based on the second training sample set to obtain a trained generative adversarial network; and

对所述经训练的生成式对抗网络进行调整,以达第二预定条件,从而获得经调整的生成式对抗网络作为所述样本生成网络,The trained generative adversarial network is adjusted to meet a second predetermined condition, thereby obtaining an adjusted generative adversarial network as the sample generation network,

其中,所述第二预定条件包括:在利用随机向量作为所述经调整的生成式对抗网络的输入的情况下,所述预先训练的第一模型对作为所述经调整的生成式对抗网络的输出的第二模拟样本集的预测结果涉及针对所述第一任务的预测结果。The second predetermined condition includes: when using a random vector as the input of the adjusted generative adversarial network, the prediction results of the pre-trained first model for the second simulated sample set as the output of the adjusted generative adversarial network involve the prediction results for the first task.

附记3.根据附记1所述的信息处理装置,还包括样本生成网络获取单元,被配置成:基于所述第二训练样本集对包括一个或更多个编码层的初始生成网络进行训练,以达第三预定条件,从而获得包括一个或更多个编码层的经训练的生成网络作为所述样本生成网络,Note 3. The information processing device according to Note 1 further comprises a sample generation network acquisition unit configured to: train an initial generation network including one or more coding layers based on the second training sample set to achieve a third predetermined condition, thereby obtaining a trained generation network including one or more coding layers as the sample generation network,

其中,所述第三预定条件包括:在利用所述第二训练样本集作为所述经训练的生成网络的输入的情况下,所述预先训练的第一模型对作为所述经训练的生成网络的输出的第二模拟样本集的预测结果涉及针对所述第一任务的预测结果。Among them, the third predetermined condition includes: when using the second training sample set as the input of the trained generation network, the prediction result of the pre-trained first model for the second simulated sample set as the output of the trained generation network involves the prediction result for the first task.

附记4.根据附记3所述的信息处理装置,其中,所述经训练的生成网络还包括一个或更多个解码层,所述一个或更多个编码层中的最后一个编码层的输出用作所述一个或更多个解码层的第一个解码层的输入,并且所述一个或更多个解码层的最后一个解码层的输出用作经训练的生成网络的输出。Note 4. An information processing device according to Note 3, wherein the trained generation network further includes one or more decoding layers, the output of the last encoding layer among the one or more encoding layers is used as the input of the first decoding layer of the one or more decoding layers, and the output of the last decoding layer of the one or more decoding layers is used as the output of the trained generation network.

附记5.根据附记2至4中任一项所述的信息处理装置,其中,通过如下方式来获得所述预先训练的第一模型对所述第二模拟样本集的预测结果:对所述第二模拟样本集中的各个模拟样本添加噪声,并且获得所述预先训练的第一模型对添加有噪声的各个模拟样本的预测结果作为所述预先训练的第一模型对所述第二模拟样本集的预测结果。Note 5. An information processing device according to any one of Notes 2 to 4, wherein the prediction result of the pre-trained first model for the second simulated sample set is obtained by adding noise to each simulated sample in the second simulated sample set, and obtaining the prediction result of the pre-trained first model for each simulated sample with added noise as the prediction result of the pre-trained first model for the second simulated sample set.

附记6.根据附记2至4中任一项所述的信息处理装置,其中,通过如下方式来获得所述预先训练的第一模型对所述第二模拟样本集的预测结果:对利用所述预先训练的第一模型所包括的特征提取层提取的、所述第二模拟样本集中的各个模拟样本的特征添加噪声,并且获得基于添加有噪声的特征的预测结果作为所述预先训练的第一模型对所述第二模拟样本集的预测结果。Note 6. An information processing device according to any one of Notes 2 to 4, wherein the prediction result of the pre-trained first model for the second simulated sample set is obtained by adding noise to the features of each simulated sample in the second simulated sample set extracted by the feature extraction layer included in the pre-trained first model, and obtaining the prediction result based on the feature with added noise as the prediction result of the pre-trained first model for the second simulated sample set.

附记7.根据附记1至4中任一项所述的信息处理装置,其中,通过所述预先训练的第一模型所包括的损失层的输出向量来表征所述预先训练的第一模型对所述第一模拟样本集的预测结果与所述再训练的第一模型对所述第一模拟样本集的预测结果之间的差异。Note 7. An information processing device according to any one of Notes 1 to 4, wherein the difference between the prediction results of the pre-trained first model for the first simulated sample set and the prediction results of the retrained first model for the first simulated sample set is represented by the output vector of the loss layer included in the pre-trained first model.

Note 8. The information processing device according to any one of Notes 1 to 4, wherein the pre-trained first model is a neural network model.

Note 9. An information processing method, comprising:

generating a first simulated sample set for a first task using a sample generation network obtained based on a second training sample set for a second task; and

under a first predetermined condition, retraining a first model pre-trained for the first task using the second training sample set, to obtain a retrained first model for predicting an object to be predicted,

wherein the first predetermined condition includes: a difference between prediction results of the pre-trained first model on the first simulated sample set and prediction results of the retrained first model on the first simulated sample set being within a predetermined range,

wherein the first task is different from the second task, and

wherein the object to be predicted relates to the first task and/or the second task.
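The first predetermined condition above (the retrained model's predictions on the generated first-task samples must stay within a predetermined range of the pre-trained model's predictions) can be sketched as a simple check. Everything here is an illustrative assumption rather than the patented implementation: the toy linear predictors, the max-absolute-difference metric, and the threshold value are all hypothetical.

```python
import numpy as np

def predictions_within_range(pretrained_predict, retrained_predict,
                             simulated_samples, max_diff):
    """Check whether the gap between the pre-trained and retrained models'
    predictions on the first simulated sample set stays within a
    predetermined range (the first predetermined condition of Note 9)."""
    p_old = pretrained_predict(simulated_samples)
    p_new = retrained_predict(simulated_samples)
    return float(np.max(np.abs(p_old - p_new))) <= max_diff

# Hypothetical stand-ins for the two models (simple linear predictors).
pretrained = lambda x: x @ np.array([0.5, 0.5])
retrained = lambda x: x @ np.array([0.52, 0.48])

# A tiny stand-in for the first simulated sample set.
samples = np.array([[1.0, 2.0], [3.0, 4.0]])
ok = predictions_within_range(pretrained, retrained, samples, max_diff=0.2)
```

With these toy weights the per-sample difference is 0.02, so the check passes for `max_diff=0.2` but would fail for a tighter range such as 0.01.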

Note 10. The information processing method according to Note 9, further comprising:

training an initial generative adversarial network based on the second training sample set to obtain a trained generative adversarial network; and

adjusting the trained generative adversarial network until a second predetermined condition is met, to obtain an adjusted generative adversarial network as the sample generation network,

wherein the second predetermined condition includes: with a random vector used as an input of the adjusted generative adversarial network, the prediction results of the pre-trained first model on a second simulated sample set output by the adjusted generative adversarial network relating to prediction results for the first task.
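The second predetermined condition of Note 10 can be read as: feed random vectors through the adjusted generator and verify that the pre-trained first model assigns the generated samples to first-task outputs. The sketch below is a minimal, non-authoritative illustration; the generator, the classifier, the label set, and the 0.9 acceptance threshold are all hypothetical stand-ins, not values from the claims.

```python
import numpy as np

rng = np.random.default_rng(0)

def first_task_fraction(generator, classifier, n=256, first_task_labels=(0,)):
    """Fraction of generated samples that the pre-trained first model
    assigns to first-task labels, with random vectors as generator input."""
    z = rng.standard_normal((n, 4))   # random input vectors
    fake = generator(z)               # second simulated sample set
    preds = classifier(fake)          # pre-trained first model's predictions
    return float(np.mean([p in first_task_labels for p in preds]))

# Hypothetical generator / classifier stand-ins.
generator = lambda z: np.tanh(z)                          # maps z to sample space
classifier = lambda x: (x.sum(axis=1) > 10).astype(int)   # label 0 = first task

frac = first_task_fraction(generator, classifier)
condition_met = frac >= 0.9   # illustrative threshold, not from the claims
```

Because `tanh` outputs lie in (-1, 1), every generated sample here sums to less than 10 and is labeled as the first task, so the toy condition is met.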

Note 11. The information processing method according to Note 9, further comprising: training, based on the second training sample set, an initial generation network including one or more encoding layers until a third predetermined condition is met, to obtain a trained generation network including the one or more encoding layers as the sample generation network,

wherein the third predetermined condition includes: with the second training sample set used as an input of the trained generation network, the prediction results of the pre-trained first model on a second simulated sample set output by the trained generation network relating to prediction results for the first task.

Note 12. The information processing method according to Note 11, wherein the trained generation network further includes one or more decoding layers, an output of the last encoding layer of the one or more encoding layers is used as an input of the first decoding layer of the one or more decoding layers, and an output of the last decoding layer of the one or more decoding layers is used as an output of the trained generation network.
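A generation network of the kind described in Notes 11 and 12 can be sketched as a tiny encoder-decoder: training samples pass through the encoding layer(s), the last encoder output feeds the first decoding layer, and the last decoder output is the generated sample. The layer count, sizes, weights, and activation below are hypothetical choices for illustration only.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical weights for a one-encoder / one-decoder generation network.
W_enc = rng.standard_normal((4, 2)) * 0.1   # encoding layer: 4 -> 2
W_dec = rng.standard_normal((2, 4)) * 0.1   # decoding layer: 2 -> 4

def generate(samples):
    """Encoder output feeds the decoder; decoder output is the network output."""
    code = np.tanh(samples @ W_enc)   # last encoding layer's output
    out = code @ W_dec                # last decoding layer's output
    return out

# Second training sample set as input, simulated samples as output.
second_training_samples = rng.standard_normal((5, 4))
simulated = generate(second_training_samples)
```

The generated set keeps the input's sample dimensionality, so the pre-trained first model could be applied to it directly when checking the third predetermined condition.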

Note 13. The information processing method according to any one of Notes 10 to 12, wherein the prediction results of the pre-trained first model on the second simulated sample set are obtained by: adding noise to each simulated sample in the second simulated sample set, and obtaining the prediction results of the pre-trained first model on each noise-added simulated sample as the prediction results of the pre-trained first model on the second simulated sample set.

Note 14. The information processing method according to any one of Notes 10 to 12, wherein the prediction results of the pre-trained first model on the second simulated sample set are obtained by: adding noise to features of each simulated sample in the second simulated sample set, the features being extracted by a feature extraction layer included in the pre-trained first model, and obtaining prediction results based on the noise-added features as the prediction results of the pre-trained first model on the second simulated sample set.
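Notes 13 and 14 describe two places where noise can be injected before prediction: at the input samples, or at the features produced by the model's feature extraction layer. The sketch below illustrates both under stated assumptions; the noise scale, the ReLU-style feature layer, and the summing prediction head are hypothetical and not taken from the patent.

```python
import numpy as np

rng = np.random.default_rng(42)

def predict_with_input_noise(model, samples, sigma=0.1):
    """Note 13: perturb each simulated sample, then predict."""
    noisy = samples + rng.normal(0.0, sigma, samples.shape)
    return model(noisy)

def predict_with_feature_noise(extract, head, samples, sigma=0.1):
    """Note 14: perturb the extracted features, then predict from them."""
    feats = extract(samples)
    noisy_feats = feats + rng.normal(0.0, sigma, feats.shape)
    return head(noisy_feats)

# Hypothetical two-stage model: feature extraction layer + prediction head.
extract = lambda x: np.maximum(x, 0.0)   # ReLU-like feature layer
head = lambda f: f.sum(axis=1)           # scalar prediction per sample
model = lambda x: head(extract(x))

samples = np.ones((3, 2))                # stand-in second simulated sample set
p1 = predict_with_input_noise(model, samples)
p2 = predict_with_feature_noise(extract, head, samples)
```

Both variants return one prediction per simulated sample; only the injection point differs.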

Note 15. The information processing method according to any one of Notes 9 to 12, wherein the difference between the prediction results of the pre-trained first model on the first simulated sample set and the prediction results of the retrained first model on the first simulated sample set is characterized by an output vector of a loss layer included in the pre-trained first model.
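One way to read Note 15 is that each model's loss layer emits an output vector, and the difference between the two models is characterized through those vectors. The sketch below is an assumption-laden illustration: the Euclidean norm and the "predetermined range" value are hypothetical choices, and the toy vectors stand in for real loss-layer outputs.

```python
import numpy as np

def loss_layer_difference(vec_pretrained, vec_retrained):
    """Characterize the gap between the pre-trained and retrained models
    via their loss-layer output vectors on the first simulated sample set.
    The Euclidean norm here is an illustrative metric choice."""
    return float(np.linalg.norm(vec_pretrained - vec_retrained))

v_old = np.array([0.1, 0.9, 0.0])   # stand-in loss-layer output, pre-trained
v_new = np.array([0.1, 0.6, 0.3])   # stand-in loss-layer output, retrained
d = loss_layer_difference(v_old, v_new)
within_range = d <= 0.5             # hypothetical predetermined range
```

For these toy vectors the difference is sqrt(0.18), about 0.42, which falls inside the illustrative range.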

Note 16. The information processing method according to any one of Notes 9 to 12, wherein the pre-trained first model is a neural network model.

Note 17. A computer-readable storage medium storing a program which, when executed by a computer, causes the computer to execute the information processing method according to any one of Notes 9 to 16.

Claims (10)

1. An information processing device, comprising: a sample generation unit configured to generate a first simulated sample set for a first task using a sample generation network obtained based on a second training sample set for a second task; and a first model training unit configured to, under a first predetermined condition, retrain a first model pre-trained for the first task using the second training sample set, to obtain a retrained first model for predicting an object to be predicted, wherein the first predetermined condition includes: a difference between prediction results of the pre-trained first model on the first simulated sample set and prediction results of the retrained first model on the first simulated sample set being within a predetermined range, wherein the first task is different from the second task, and wherein the object to be predicted relates to the first task and/or the second task.

2. The information processing device according to claim 1, further comprising a sample generation network acquisition unit configured to: train an initial generative adversarial network based on the second training sample set to obtain a trained generative adversarial network; and adjust the trained generative adversarial network until a second predetermined condition is met, to obtain an adjusted generative adversarial network as the sample generation network, wherein the second predetermined condition includes: with a random vector used as an input of the adjusted generative adversarial network, the prediction results of the pre-trained first model on a second simulated sample set output by the adjusted generative adversarial network relating to prediction results for the first task.

3. The information processing device according to claim 1, further comprising a sample generation network acquisition unit configured to: train, based on the second training sample set, an initial generation network including one or more encoding layers until a third predetermined condition is met, to obtain a trained generation network including the one or more encoding layers as the sample generation network, wherein the third predetermined condition includes: with the second training sample set used as an input of the trained generation network, the prediction results of the pre-trained first model on a second simulated sample set output by the trained generation network relating to prediction results for the first task.

4. The information processing device according to claim 3, wherein the trained generation network further includes one or more decoding layers, an output of the last encoding layer of the one or more encoding layers is used as an input of the first decoding layer of the one or more decoding layers, and an output of the last decoding layer of the one or more decoding layers is used as an output of the trained generation network.

5. The information processing device according to any one of claims 2 to 4, wherein the prediction results of the pre-trained first model on the second simulated sample set are obtained by: adding noise to each simulated sample in the second simulated sample set, and obtaining the prediction results of the pre-trained first model on each noise-added simulated sample as the prediction results of the pre-trained first model on the second simulated sample set.

6. The information processing device according to any one of claims 2 to 4, wherein the prediction results of the pre-trained first model on the second simulated sample set are obtained by: adding noise to features of each simulated sample in the second simulated sample set, the features being extracted by a feature extraction layer included in the pre-trained first model, and obtaining prediction results based on the noise-added features as the prediction results of the pre-trained first model on the second simulated sample set.

7. The information processing device according to any one of claims 1 to 4, wherein the difference between the prediction results of the pre-trained first model on the first simulated sample set and the prediction results of the retrained first model on the first simulated sample set is characterized by an output vector of a loss layer included in the pre-trained first model.

8. The information processing device according to any one of claims 1 to 4, wherein the pre-trained first model is a neural network model.

9. An information processing method, comprising: generating a first simulated sample set for a first task using a sample generation network obtained based on a second training sample set for a second task; and under a first predetermined condition, retraining a first model pre-trained for the first task using the second training sample set, to obtain a retrained first model for predicting an object to be predicted, wherein the first predetermined condition includes: a difference between prediction results of the pre-trained first model on the first simulated sample set and prediction results of the retrained first model on the first simulated sample set being within a predetermined range, wherein the first task is different from the second task, and wherein the object to be predicted relates to the first task and/or the second task.

10. A computer-readable storage medium storing a program which, when executed by a computer, causes the computer to execute the information processing method according to claim 9.
CN202310050865.8A 2023-01-20 2023-01-20 Information processing device, information processing method and computer readable storage medium Pending CN118378076A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202310050865.8A CN118378076A (en) 2023-01-20 2023-01-20 Information processing device, information processing method and computer readable storage medium
JP2024006142A JP2024103463A (en) 2023-01-20 2024-01-18 Information processing device, information processing method, and computer program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310050865.8A CN118378076A (en) 2023-01-20 2023-01-20 Information processing device, information processing method and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN118378076A true CN118378076A (en) 2024-07-23

Family

ID=91911256

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310050865.8A Pending CN118378076A (en) 2023-01-20 2023-01-20 Information processing device, information processing method and computer readable storage medium

Country Status (2)

Country Link
JP (1) JP2024103463A (en)
CN (1) CN118378076A (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN120122642A (en) * 2025-02-20 2025-06-10 酷睿程(北京)科技有限公司 Control method, training method, electronic device, chip, vehicle and medium

Also Published As

Publication number Publication date
JP2024103463A (en) 2024-08-01

Similar Documents

Publication Publication Date Title
CN113254599B (en) A Multi-label Microblogging Text Classification Method Based on Semi-Supervised Learning
US12423295B2 (en) Text to question-answer model system
CN111582348A (en) Method, device, equipment and storage medium for training condition generating type countermeasure network
CN110889487A (en) Neural network architecture search apparatus and method, and computer-readable recording medium
CN107220231A (en) Electronic equipment and method and training method for natural language processing
JP2025517085A (en) Contrasting Caption Neural Networks
CN111444967A (en) Training method, generation method, device, equipment and medium for generating confrontation network
CN108268629B (en) Image description method and device based on keywords, equipment and medium
CN113785314A (en) Semi-supervised training of machine learning models using label guessing
US20220147758A1 (en) Computer-readable recording medium storing inference program and method of inferring
CN114372475A (en) A network public opinion sentiment analysis method and system based on RoBERTa model
CN106354856A (en) Deep neural network enhanced search method and device based on artificial intelligence
US20230124177A1 (en) System and method for training a sparse neural network whilst maintaining sparsity
CN111027292B (en) A method and system for generating a limited sampling text sequence
CN111858878A (en) Method, system and storage medium for automatically extracting answer from natural language text
CN110968725A (en) Image content description information generation method, electronic device, and storage medium
CN116595953A (en) A Summary Generation Method Based on Knowledge and Semantic Information Enhancement
CN112446206A (en) Menu title generation method and device
CN107832298A (en) Method and apparatus for output information
JP2024174994A (en) Video generation and organization model acquisition method, apparatus, device and storage medium
CN118378076A (en) Information processing device, information processing method and computer readable storage medium
Kumar et al. Kullback–Leibler Divergence-Based Regularized Normalization for Low-Resource Tasks
CN118245602B (en) Training method, device, equipment and storage medium for emotion recognition model
CN118484384A (en) Intelligent system hard label robustness testing method and system based on improved genetic algorithm
JP7333490B1 (en) Method for determining content associated with audio signal, computer program stored on computer readable storage medium and computing device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination