
CN114049686A - Signature recognition model training method and device and electronic equipment

Publication number
CN114049686A
Authority
CN
China
Prior art keywords
sample
signature
utilized
picture
name
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111345986.2A
Other languages
Chinese (zh)
Inventor
王晓燕
黄聚
钦夏孟
范森
吕鹏原
章成全
姚锟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202111345986.2A
Publication of CN114049686A

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The present disclosure provides a signature recognition model training method, an apparatus, and an electronic device, and relates to the field of artificial intelligence, in particular to deep learning and computer vision. The scheme is as follows: a sample to be used and its corresponding name annotation information are obtained from a predetermined sample library, where the sample library includes first-type samples and their corresponding name annotation information, and a first-type sample is a signature picture whose recognition result was reported as incorrect when signature recognition was performed with the signature recognition model; a text line picture corresponding to the sample to be used is obtained, where the text line picture is the signature area of the sample to be used; and the signature recognition model is update-trained based on the text line picture and the name annotation information of the sample to be used. With this scheme, the signature recognition model can be iterated automatically, which greatly reduces the labor cost of iterating the model.

Description

Signature recognition model training method, apparatus, and electronic device

Technical Field

The present disclosure relates to the field of artificial intelligence, in particular to deep learning and computer vision, and specifically to a signature recognition model training method, an apparatus, and an electronic device.

Background

Digital enterprises have a strong need for intelligent recognition of handwritten signatures, that is, automatically recognizing the content of a handwritten signature.

A common approach is to use a signature recognition model to recognize signature pictures that contain handwritten signatures. To maintain the accuracy of the signature recognition model, the model needs to be iterated, that is, update-trained. In the related art, the model is update-trained manually: for example, new signature pictures are manually selected as samples, and the signature area and name information in each sample are manually annotated.

Summary

The present disclosure provides a signature recognition model training method, an apparatus, and an electronic device.

According to one aspect of the present disclosure, a signature recognition model training method is provided, the method comprising:

obtaining a sample to be used and its corresponding name annotation information from a predetermined sample library, wherein the sample library includes first-type samples and their corresponding name annotation information, and a first-type sample is a signature picture whose recognition result was reported as incorrect when signature recognition was performed based on the signature recognition model;

obtaining a text line picture corresponding to the sample to be used, wherein the text line picture is the signature area in the sample to be used; and

update-training the signature recognition model based on the text line picture and the name annotation information corresponding to the sample to be used.

According to another aspect of the present disclosure, an information processing system is provided, comprising a signature recognition subsystem and a model training subsystem;

the signature recognition subsystem is configured to, upon receiving user feedback that the recognition result for a picture to be recognized is incorrect, obtain the name for the picture to be recognized, and store the picture to be recognized as a first-type sample, with the name as its corresponding name annotation information, in a predetermined sample library, wherein the recognition result is the result obtained by recognizing the picture to be recognized with a pre-trained signature recognition model;

the model training subsystem is configured to obtain a sample to be used and its corresponding name annotation information from the predetermined sample library, obtain a text line picture corresponding to the sample to be used, and update-train the signature recognition model based on the text line picture and the name annotation information corresponding to the sample to be used.

According to another aspect of the present disclosure, a signature recognition model training apparatus is provided, the apparatus comprising:

a first obtaining module, configured to obtain a sample to be used and its corresponding name annotation information from a predetermined sample library, wherein the sample library includes at least first-type samples and their corresponding name annotation information, and a first-type sample is a signature picture whose recognition result was reported as incorrect when signature recognition was performed based on the signature recognition model;

a second obtaining module, configured to obtain a text line picture corresponding to the sample to be used, wherein the text line picture is the signature area in the sample to be used; and

a training module, configured to update-train the signature recognition model based on the text line picture and the name annotation information corresponding to the sample to be used.

According to another aspect of the present disclosure, an electronic device is provided, comprising:

at least one processor; and

a memory communicatively connected to the at least one processor; wherein

the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the above signature recognition model training method.

According to another aspect of the present disclosure, a non-transitory computer-readable storage medium storing computer instructions is provided, wherein the computer instructions cause a computer to perform the steps of the above signature recognition model training method.

According to another aspect of the present disclosure, a computer program product is provided, comprising a computer program that, when executed by a processor, implements the steps of the above signature recognition model training method.

It should be understood that the content described in this section is not intended to identify key or critical features of the embodiments of the present disclosure, nor to limit the scope of the present disclosure. Other features of the present disclosure will become readily understood from the following description.

Brief Description of the Drawings

The accompanying drawings are provided for a better understanding of the present solution and do not limit the present disclosure. In the drawings:

FIG. 1 is a flowchart of a signature recognition model training method provided by an embodiment of the present disclosure;

FIG. 2(a) shows a sample to be used in an embodiment of the present disclosure, and FIG. 2(b) shows the minimum bounding rectangle containing the signature text in an embodiment of the present disclosure;

FIG. 3 is a schematic structural diagram of an information processing system provided by an embodiment of the present disclosure;

FIG. 4 is a schematic structural diagram of a signature recognition model training apparatus provided by an embodiment of the present disclosure;

FIG. 5 is a block diagram of an electronic device used to implement the signature recognition model training method of an embodiment of the present disclosure.

Detailed Description

Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings. Various details of the embodiments are included to aid understanding and should be regarded as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present disclosure. Likewise, descriptions of well-known functions and structures are omitted below for clarity and conciseness.

At present, contracts and confirmation documents in finance, government affairs, office work, and other fields require a handwritten signature from the person handling the matter, after which the handwritten signature must be checked against that person's real name.

In the related art, signature recognition is performed by manual review, that is, a reviewer checks whether the handwritten signature matches the signer's real name. For scenarios that produce a large number of signatures, however, this requires extensive manual review and is inefficient. Manual review also involves frequent communication and waiting, which degrades the experience of the signing user.

Signature recognition services emerged in response. Such a service recognizes the text content of a handwritten signature and takes part in personal identity verification, and can save more than 90% of business processing time, for example contract signing time. Reducing manual guidance also lightens the workload of staff, reduces interruptions, and improves the user's signing experience.

For a signature recognition service, the signature recognition model is the key means of implementation. After signature pictures containing handwritten signatures are collected, each picture is annotated with a name, and the signature recognition model is then trained in a supervised manner so that the trained model can recognize the text content of a signature picture, that is, the handwritten signature. Because handwriting style differs from person to person, the features of signature pictures are widely dispersed, and recognition accuracy is limited by the amount of annotated data. Model training therefore requires a large number of signature pictures, each annotated individually, that together cover many types of handwritten data, that is, many text contents and handwriting styles.

However, a signature recognition model generally guarantees accuracy only for samples that are covered by, or similar to, its training set. To maintain accuracy, the model needs to be iterated, that is, update-trained. In the related art, the model is update-trained manually: for example, a large number of new signature pictures are manually selected as samples, and the signature area and name information in each sample are manually annotated to form a new training set with which the model is iterated repeatedly.

Updating the signature recognition model manually therefore incurs substantial labor cost.

To solve the above problem and update-train the signature recognition model automatically, embodiments of the present disclosure provide a signature recognition model training method, system, apparatus, device, and storage medium.

A signature recognition model training method provided by an embodiment of the present disclosure is introduced first.

The method is applied to an electronic device, which in practice may be a server or a terminal device. Scenarios to which the method applies include, but are not limited to, counter signature scenarios, onboarding signature scenarios, and OCR (Optical Character Recognition) scenarios.

A signature recognition model training method provided by an embodiment of the present disclosure may include the following steps:

obtaining a sample to be used and its corresponding name annotation information from a predetermined sample library, wherein the sample library includes at least first-type samples and their corresponding name annotation information, and a first-type sample is a signature picture whose recognition result was reported as incorrect when signature recognition was performed based on the signature recognition model;

obtaining a text line picture corresponding to the sample to be used, wherein the text line picture is the signature area in the sample to be used; and

update-training the signature recognition model based on the text line picture and the name annotation information corresponding to the sample to be used.

In this embodiment, the predetermined sample library includes first-type samples, that is, signature pictures that the signature recognition model recognized incorrectly, together with their corresponding name annotation information. A sample to be used and its name annotation information are obtained from the sample library; a text line picture, which is the signature area of the sample to be used, is then obtained; and finally the signature recognition model is update-trained based on the text line picture and the name annotation information. Because this scheme requires neither manual selection of a new training set nor manual annotation of signature areas, it enables automatic update training of the signature recognition model and greatly reduces the labor cost of iterating the model.
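The three steps can be pictured as a short pipeline. The following is only a sketch under stated assumptions: the `Sample` record, the `run_update` function, and the two callables passed into it are hypothetical names introduced for illustration, and the disclosure does not prescribe any particular data layout.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# Hypothetical representation: a grayscale image as rows of pixel values.
Image = List[List[int]]


@dataclass
class Sample:
    image: Image  # full signature picture from the sample library
    name: str     # name annotation information (the signer's real name)


def run_update(
    sample_library: Sequence[Sample],
    extract_text_line: Callable[[Image], Image],
    train_step: Callable[[Image, str], float],
) -> List[float]:
    """Run one update pass and return the loss reported for each sample."""
    losses = []
    for sample in sample_library:                          # obtain sample and name
        text_line = extract_text_line(sample.image)        # obtain text line picture
        losses.append(train_step(text_line, sample.name))  # update-train the model
    return losses
```

Passing the cropping and training logic in as callables keeps the sketch independent of any specific detector or model.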

The method is described below with reference to the accompanying drawings.

As shown in FIG. 1, a signature recognition model training method provided by the present disclosure may include the following steps:

S101: obtain a sample to be used and its corresponding name annotation information from a predetermined sample library, wherein the sample library includes at least first-type samples and their corresponding name annotation information, and a first-type sample is a signature picture whose recognition result was reported as incorrect when signature recognition was performed based on the signature recognition model.

The signature recognition model is an artificial intelligence model pre-trained on signature picture samples of a certain order of magnitude (for example, one million) and their name annotation information. It may be obtained by training any kind of deep learning model, whose structure may be set according to the actual situation.

After the signature recognition model has been trained, it may be iterated, that is, update-trained, to improve recognition accuracy. To this end, a model update condition may be preset for the signature recognition model, so that update training starts when the condition is met. For example, the model update condition may be that a predetermined time interval has elapsed, or that the number of collected samples has reached a predetermined number.
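As an illustration of such a model update condition, the sketch below combines the two example triggers with a logical OR. The `UpdatePolicy` class, its field names, and the choice to combine the triggers are assumptions made here; the disclosure lists the triggers only as examples.

```python
from dataclasses import dataclass


@dataclass
class UpdatePolicy:
    interval_seconds: float  # hypothetical: predetermined time interval
    min_new_samples: int     # hypothetical: predetermined number of collected samples

    def should_update(self, seconds_since_last: float, new_samples: int) -> bool:
        """Return True when either example trigger is satisfied."""
        return (
            seconds_since_last >= self.interval_seconds
            or new_samples >= self.min_new_samples
        )
```

For example, `UpdatePolicy(interval_seconds=86400, min_new_samples=1000)` would start update training once a day has passed or once a thousand new samples have been collected, whichever comes first.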

The sample library is a database, that is, an organized, uniformly managed collection of a large amount of data stored long-term in a computer. It stores the signature pictures used as samples together with their corresponding name annotation information, where a signature picture is a picture containing a written signature. In this embodiment, the predetermined sample library includes a plurality of first-type samples and their corresponding name annotation information, so the samples to be used that are obtained from it may include first-type samples. There may be multiple samples to be used, and each is processed in the same way.

The initially trained signature recognition model is deployed in a signature recognition subsystem that provides the signature recognition service, and the subsystem uses the trained model to recognize signature information. To build the predetermined sample library, the signature recognition subsystem also has a confirmation feedback function, which records the signature pictures that users report as correctly or incorrectly recognized by the model, along with their name annotation information. A signature picture whose recognition result is reported as incorrect is taken as a first-type sample, and the first-type sample and its corresponding name annotation information are automatically stored in the sample library. In one implementation, the signature recognition subsystem may also record a tag for each signature picture indicating whether it was recognized correctly or incorrectly, so that the signature pictures reported as incorrectly recognized, and their name annotation information, can be found by tag.
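A minimal sketch of the confirmation feedback path follows, assuming the sample library is a SQLite database. The table layout, the column names, the `sample_type` encoding, and the `open_library` and `record_feedback` helpers are all illustrative assumptions rather than details from the disclosure.

```python
import sqlite3


def open_library(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a sample library with a hypothetical single-table schema."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS samples ("
        " id INTEGER PRIMARY KEY,"
        " image BLOB NOT NULL,"           # encoded signature picture
        " name TEXT NOT NULL,"            # name annotation information
        " sample_type INTEGER NOT NULL)"  # 1 = first-type, 2 = second-type (assumed)
    )
    return conn


def record_feedback(
    conn: sqlite3.Connection, image_bytes: bytes, is_correct: bool, true_name: str
) -> bool:
    """Store the picture as a first-type sample when the user reports an error.

    Returns True if a row was written, False if the result was confirmed correct.
    """
    if is_correct:
        return False
    conn.execute(
        "INSERT INTO samples (image, name, sample_type) VALUES (?, ?, 1)",
        (image_bytes, true_name),
    )
    conn.commit()
    return True
```

Only incorrectly recognized pictures are written in this sketch; a system that also tags correctly recognized pictures, as the optional implementation describes, would need an additional column or table.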

Note that the name annotation information is the real name of the signer of the signature picture. The name annotation information of a first-type sample may be determined in several ways: for example, when a user reports that a signature picture was recognized incorrectly, the user may be prompted to enter the real name manually, or the user's name may be retrieved from a database in which it was recorded in advance.

Optionally, the sample library further includes second-type samples and their corresponding name annotation information, wherein a second-type sample is a signature picture of a predetermined signature scenario generated within a predetermined time period.

The predetermined signature scenario may include, but is not limited to, onboarding signatures and counter signatures. Such scenarios produce signature pictures on a very large scale and continuously, so the signature pictures generated in a predetermined signature scenario within a predetermined time period, together with their annotation information, can be collected and added to the predetermined sample library.

Thus, in addition to first-type samples reported by users as incorrectly recognized, the predetermined sample library may include second-type samples, that is, continuously collected new samples from predetermined signature scenarios.

Both sample types are continuously refreshed. Continuously storing first-type and second-type samples and their name annotation information in the sample library keeps the library updated and enriched, makes full use of the signature pictures produced in predetermined signature scenarios, avoids wasting data, and broadens the range of handwritten data types the library covers.

S102: obtain a text line picture corresponding to the sample to be used, wherein the text line picture is the signature area in the sample to be used.

A sample to be used may contain non-signature areas, such as blank areas, whereas training the signature recognition model requires only the text content within the signature area. Therefore, after the sample to be used is obtained, the text line picture of its signature area is extracted.

In one implementation, obtaining the text line picture corresponding to the sample to be used may include steps A1 and A2:

A1: perform signature area detection on the sample to be used to obtain a detection result.

The detection result may be coordinate information of the signature area. FIG. 2(a) shows a sample to be used, and FIG. 2(b) shows the minimum bounding rectangle containing the signature text. The coordinate information may be generated from the four corner points of the minimum bounding rectangle of the signature text, in which case it uniquely identifies the signature area. Signature area detection may be performed in several ways.

For example, performing signature area detection on the sample to be used to obtain a detection result may include:

binarizing the sample to be used and determining, in the binarized sample, the minimum bounding rectangle containing all text regions, to obtain the detection result for the signature area;

or,

detecting the signature area in the sample to be used with a pre-trained text detection model for detecting signature areas, to obtain the detection result.

Either approach, binarizing the sample to be used or applying a pre-trained text detection model, yields the detection result without manual annotation and thus saves annotation cost.

Binarization proceeds as follows: the color picture of the sample to be used is first converted to a grayscale image, and each pixel of the grayscale image is then assigned to one of two levels. For example, a suitable threshold may be set for the grayscale image to decide whether a pixel belongs to the target or the background: pixels above the threshold are set to 1 and pixels below it are set to 0, yielding a binarized image. The minimum bounding rectangle containing all text regions in the binarized sample is then determined, giving the detection result for the signature area.
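The binarization route can be sketched in plain Python as below. Several points are assumptions made for illustration: the image is a nested list of RGB tuples, the threshold defaults to 128, ink is dark on a light background (so ink pixels become 0), and the box is axis-aligned, which is a simplification of the possibly tilted minimum bounding rectangle. All function names are hypothetical.

```python
from typing import List, Optional, Tuple

RGB = Tuple[int, int, int]
Box = Tuple[int, int, int, int]  # (left, top, right, bottom), inclusive


def to_gray(pixel: RGB) -> float:
    """Convert one RGB pixel to a gray value using BT.601 luma weights."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


def binarize(image: List[List[RGB]], threshold: float = 128.0) -> List[List[int]]:
    """Set pixels whose gray value exceeds the threshold to 1, the rest to 0."""
    return [[1 if to_gray(p) > threshold else 0 for p in row] for row in image]


def ink_bounding_box(binary: List[List[int]], ink_value: int = 0) -> Optional[Box]:
    """Return the axis-aligned box around all ink pixels, or None if none exist."""
    xs = [x for row in binary for x, v in enumerate(row) if v == ink_value]
    ys = [y for y, row in enumerate(binary) if ink_value in row]
    if not xs:
        return None
    return min(xs), min(ys), max(xs), max(ys)
```

A fixed global threshold is the simplest choice; uneven lighting or colored paper would likely call for an adaptive threshold, which this sketch does not attempt.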

The text detection model is a model for locating text regions, pre-trained on samples and their text annotation information; the present disclosure does not limit its specific training process. The text annotation information for each sample may be obtained by manual annotation or by binarizing the sample. For example, in one implementation, a certain number of signature pictures (for example, five thousand) may be selected, each labeled with the coordinates of the four corner points of the bounding rectangle of its text, and the text detection model is trained on these signature pictures and their four-corner labels.

A2: based on the detection result, extract the signature area from the sample to be used as the text line picture corresponding to the sample to be used.

If the extracted signature area is the minimum bounding rectangle containing all text regions, that rectangle is often tilted. After the signature area is extracted, an affine transformation may therefore be applied to straighten the tilted rectangle, which makes the area easier for the signature recognition model to recognize and so improves training.
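The straightening step can be illustrated by computing the affine map that sends a tilted rectangle to an upright one anchored at the origin. This is a sketch of the coordinate transform only, assuming the top-left, top-right, and bottom-left corners are known; `deskew_matrix` and `apply_affine` are hypothetical helper names, and resampling the actual pixels would be left to an image library's warp routine.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Matrix = List[List[float]]  # 2x3 affine matrix [[a, b, c], [d, e, f]]


def deskew_matrix(tl: Point, tr: Point, bl: Point) -> Tuple[Matrix, float, float]:
    """Build the affine map from a tilted rectangle to an upright one.

    Returns the matrix together with the rectangle's width and height.
    """
    w = math.dist(tl, tr)
    h = math.dist(tl, bl)
    # Unit vectors along the top edge and the left edge of the rectangle.
    ex = ((tr[0] - tl[0]) / w, (tr[1] - tl[1]) / w)
    ey = ((bl[0] - tl[0]) / h, (bl[1] - tl[1]) / h)
    # New coordinates are projections of (p - tl) onto ex and ey.
    matrix = [
        [ex[0], ex[1], -(ex[0] * tl[0] + ex[1] * tl[1])],
        [ey[0], ey[1], -(ey[0] * tl[0] + ey[1] * tl[1])],
    ]
    return matrix, w, h


def apply_affine(m: Matrix, p: Point) -> Point:
    """Apply a 2x3 affine matrix to a point."""
    return (
        m[0][0] * p[0] + m[0][1] * p[1] + m[0][2],
        m[1][0] * p[0] + m[1][1] * p[1] + m[1][2],
    )
```

Because the edges of a rectangle are perpendicular, the map is a rotation plus a translation: the top-left corner lands at the origin and the opposite corner at (width, height).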

It can be understood that the above step of generating the text line picture may be performed after the sample to be utilized is obtained, once the model update condition for the signature recognition model is satisfied; in that case, the step of acquiring the text line picture corresponding to the sample to be utilized includes steps A1-A2. Alternatively, to make update training of the signature recognition model more efficient, the corresponding text line picture may be generated automatically when the sample to be utilized is stored in the sample library and saved in the sample library; when the model update condition is satisfied, the previously generated text line picture is then obtained directly from the sample library.

S103: performing update training on the signature recognition model based on the text line picture and the name annotation information corresponding to the sample to be utilized.

In one implementation, update training of the signature recognition model may include: inputting the text line picture corresponding to the sample to be utilized into the signature recognition model to obtain the model's output for that sample; determining a loss value based on the output and the name annotation information of the sample to be utilized; and repeatedly adjusting the model parameters of the signature recognition model under training and continuing training, so that the loss value keeps decreasing and the model is iteratively updated.
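The loop structure described above (forward pass, loss computation, parameter adjustment, repeat) can be sketched with a toy model. Everything concrete here is an assumption made for illustration: a linear model with a squared-error loss and plain gradient descent stands in for the signature recognition model, whose architecture, loss function, and optimizer the disclosure does not specify.

```python
def update_training(weights, samples, learning_rate=0.1, epochs=200):
    """Toy update-training loop.

    `samples` is a list of (features, target) pairs; the "model" is y = w . x.
    Returns the adjusted weights and the mean loss recorded at each epoch.
    """
    weights = list(weights)
    loss_history = []
    for _ in range(epochs):
        total_loss = 0.0
        gradients = [0.0] * len(weights)
        for features, target in samples:
            # Forward pass: model output for this sample.
            output = sum(w * x for w, x in zip(weights, features))
            # Loss between the output and the annotation (target).
            error = output - target
            total_loss += error * error
            for i, x in enumerate(features):
                gradients[i] += 2.0 * error * x
        count = len(samples)
        loss_history.append(total_loss / count)
        # Adjust the parameters in the direction that lowers the loss.
        weights = [w - learning_rate * g / count
                   for w, g in zip(weights, gradients)]
    return weights, loss_history


toy_samples = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 1.0)]
trained_weights, losses = update_training([0.0, 0.0], toy_samples)
```

For update training, the starting `weights` would be the parameters of the already deployed model rather than zeros, so that training continues from the current model instead of starting over.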

In addition, when training the initially constructed basic signature recognition model, signature area detection may likewise be performed on the collected signature picture samples, the signature areas extracted as basic text line pictures, and the basic text line pictures together with their name annotation information used as the training set of the initially constructed signature recognition model.

In a specific implementation, for training the basic signature recognition model, a certain number of signature pictures carrying the signer's name information may be drawn from a database, and the signature areas in these pictures extracted as basic text line pictures. Selecting signature pictures that already carry the signer's name information avoids manually annotating them with name information. Of course, if the database does not contain enough such pictures, a certain number of signature pictures without the signer's name information may also be selected and annotated manually; for example, a label is set for each such signature picture that records the text in the picture, that is, the signer's actual name.

In this embodiment, the predetermined sample library includes the first type of samples, namely signature pictures that the signature recognition model recognized incorrectly, together with the corresponding name annotation information. A sample to be utilized and the corresponding name annotation information are obtained from the sample library; the text line picture corresponding to the sample to be utilized is then acquired, the text line picture being the signature area in the sample to be utilized; finally, update training is performed on the signature recognition model based on the text line picture and the name annotation information corresponding to the sample to be utilized. This scheme requires neither manual selection of a new training set nor manual annotation of signature areas, so update training of the signature recognition model can be automated, which greatly reduces the labor cost of iterating the model manually.

Optionally, in another embodiment, there are multiple samples to be utilized, and before the update training of the signature recognition model based on the text line pictures and name annotation information corresponding to the samples to be utilized, the method further includes steps B1-B4:

B1: counting the occurrence frequency of each character in the name annotation information corresponding to the samples to be utilized;

Characters appear in signatures with different frequencies: common characters appear often and rare characters seldom, so the character frequencies follow a long-tailed distribution in which a large number of characters appear infrequently. The signature recognition model can reach high recognition accuracy only when there are enough samples of every kind of character. The occurrence frequency of each character in the name annotation information corresponding to the samples to be utilized can therefore be counted and the low-frequency characters handled accordingly, improving the model's recognition accuracy for low-frequency characters.

B2: detecting target characters whose occurrence frequency is lower than a predetermined frequency threshold;

Characters whose frequency in the statistical result is lower than the threshold are taken as the target characters.

B3: determining the samples to be utilized whose corresponding name annotation information contains a target character, as samples to be processed;

Any sample to be utilized that contains a target character, that is, any sample containing a character whose occurrence frequency is lower than the predetermined frequency threshold, can serve as a sample to be processed.
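Steps B1-B3 can be sketched as follows. The sketch assumes that "occurrence frequency" means a raw occurrence count and that each sample is represented by an identifier paired with its name annotation; the function name, the sample identifiers, and the threshold value are hypothetical.

```python
from collections import Counter


def select_samples_to_process(samples, frequency_threshold):
    """samples: list of (sample_id, name_annotation) pairs.

    Returns (counts, target_characters, samples_to_process).
    """
    # B1: count how often each character occurs across all name annotations.
    counts = Counter(ch for _, name in samples for ch in name)

    # B2: target characters are those below the predetermined threshold.
    target_characters = {ch for ch, n in counts.items()
                         if n < frequency_threshold}

    # B3: keep every sample whose annotation contains a target character.
    samples_to_process = [(sample_id, name) for sample_id, name in samples
                          if any(ch in target_characters for ch in name)]
    return counts, target_characters, samples_to_process


example_samples = [("s1", "王伟"), ("s2", "王芳"), ("s3", "李伟"), ("s4", "王龘")]
counts, targets, to_process = select_samples_to_process(example_samples, 2)
```

In this example 王 occurs three times and 伟 twice, so with a threshold of 2 only 芳, 李, and 龘 are target characters, and samples s2, s3, and s4 are selected for augmentation. A relative frequency (count divided by total characters) could be used instead of a raw count without changing the structure.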

B4: performing data augmentation on the text line picture corresponding to each sample to be processed to obtain a target picture;

For example, the data augmentation may resize the text line picture corresponding to the sample to be processed, rotate it, add pixels along each edge (padding), and so on. Alternatively, samples of the low-frequency characters may be collected manually, or the low-frequency characters may be written by hand in different handwriting styles to serve as target samples, and the text line pictures of these manually collected or written samples extracted.
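The three augmentation operations named above can be sketched on a picture stored as a list of pixel rows. This is an illustrative simplification under stated assumptions: nearest-neighbor sampling is used for resizing, rotation is limited to 90 degrees clockwise (a real pipeline would more likely use small arbitrary angles through an image library), and all function names are hypothetical.

```python
def pad(picture, margin, fill=0):
    """Add `margin` pixels of `fill` on every side."""
    width = len(picture[0]) + 2 * margin
    top = [[fill] * width for _ in range(margin)]
    bottom = [[fill] * width for _ in range(margin)]
    body = [[fill] * margin + list(row) + [fill] * margin for row in picture]
    return top + body + bottom


def resize_nearest(picture, new_height, new_width):
    """Resize by nearest-neighbor sampling."""
    height, width = len(picture), len(picture[0])
    return [[picture[y * height // new_height][x * width // new_width]
             for x in range(new_width)]
            for y in range(new_height)]


def rotate_90_clockwise(picture):
    """Rotate the picture by 90 degrees clockwise."""
    return [list(row) for row in zip(*picture[::-1])]


def augment(picture):
    """Produce several target pictures from one text line picture."""
    height, width = len(picture), len(picture[0])
    return [
        pad(picture, 1),
        resize_nearest(picture, height * 2, width * 2),
        rotate_90_clockwise(picture),
    ]


text_line = [[1, 2],
             [3, 4]]
target_pictures = augment(text_line)
```

Each target picture inherits the name annotation of the sample it was derived from, so no new labeling is needed for the augmented data.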

With steps B1-B4 included, the update training of the signature recognition model based on the text line picture and name annotation information corresponding to the sample to be utilized may correspondingly include:

Performing update training on the signature recognition model based on the text line pictures and name annotation information corresponding to the samples to be utilized, together with the target samples and their corresponding name annotation information; wherein the name annotation information corresponding to a target sample is the name annotation information corresponding to the sample to be processed.

In this case, the samples used for update training of the signature recognition model include the text line pictures and name annotation information corresponding to the samples to be utilized, as well as the target samples obtained by applying data augmentation to the samples to be processed that contain low-frequency characters, together with their name annotation information. The name annotation information of a target sample is that of the specified sample, that is, the sample to be processed from which the target sample was obtained.

In addition, the occurrence frequency of characters in the first type of samples, which users reported as incorrectly recognized, may also be counted and the above steps performed on the low-frequency characters among them, so that the signature recognition model is trained more precisely on the low-frequency characters that it misrecognized.

Similarly, when training the initially constructed basic signature recognition model, the low-frequency characters in the training set may also be counted and data augmentation applied to the samples to be processed that contain them, yielding basic target samples. The basic target samples and their name annotation information are then used as training data for the initially constructed signature recognition model, which mitigates the long-tailed distribution of character frequencies in the training set.

In this embodiment, the samples to be utilized that contain target characters, whose occurrence frequency is lower than the predetermined frequency threshold, are taken as samples to be processed; data augmentation is applied to them to obtain target samples; and update training is performed on the signature recognition model based on the text line pictures and name annotation information corresponding to the samples to be utilized, together with the target samples and their corresponding name annotation information. The solution of this embodiment can therefore improve the recognition accuracy of the signature recognition model.

It can be seen that, with the signature recognition model training method provided by the present disclosure, the first type of samples that the model recognized incorrectly can be continuously supplemented, new samples of the second type can be imported, and update training can be performed on the signature recognition model automatically based on the text line pictures and name annotation information corresponding to both types of samples. This reduces the cost of manually collecting training sets, manual annotation, and manual iteration, while ensuring that real data is fully utilized during training. As the data comes to cover more signature handwriting and character samples, the generalization and recognition accuracy of the signature recognition model keep improving until the model is suitable for online use.

In the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision, and disclosure of users' personal information all comply with the relevant laws and regulations and do not violate public order and good customs.

According to an embodiment of the present disclosure, the present disclosure also provides an information processing system. As shown in FIG. 3, the system includes a signature recognition subsystem 310 and a model training subsystem 320;

The signature recognition subsystem 310 is configured to, upon receiving feedback from a user that the recognition result for a picture to be recognized is incorrect, obtain the name corresponding to the picture to be recognized, and store the picture to be recognized as a first-type sample, with the name as the corresponding name annotation information, in a predetermined sample library; wherein the recognition result is the result obtained by recognizing the picture to be recognized with a pre-trained signature recognition model;

The model training subsystem 320 is configured to obtain a sample to be utilized and the corresponding name annotation information from the predetermined sample library; acquire the text line picture corresponding to the sample to be utilized; and perform update training on the signature recognition model based on the text line picture and the name annotation information corresponding to the sample to be utilized.

For the specific implementation of each function of the model training subsystem, reference may be made to the corresponding content in the foregoing method embodiments, which is not repeated here.

In this embodiment, when the signature recognition subsystem receives feedback from a user that the recognition result for a picture to be recognized is incorrect, it obtains the name corresponding to the picture to be recognized and stores the picture as a first-type sample, with the name as the corresponding name annotation information, in the predetermined sample library. The model training subsystem obtains the sample to be utilized and the corresponding name annotation information from the predetermined sample library and performs update training on the signature recognition model based on the text line picture and the name annotation information corresponding to the sample to be utilized. In this solution, error samples reported by users are stored in the sample library automatically, and the signature recognition model is update-trained automatically. The signature recognition model can therefore be iterated automatically, which greatly reduces the labor cost of iterating it.
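The interaction between the two subsystems can be sketched as follows. This is a hypothetical in-memory illustration only: the class names, method names, and record layout are assumptions, the sample library is a plain list, and the text line extraction and model update are passed in as callables because the disclosure leaves their implementation open.

```python
from dataclasses import dataclass, field


@dataclass
class SampleLibrary:
    """Hypothetical stand-in for the predetermined sample library."""
    records: list = field(default_factory=list)

    def store(self, picture, name, sample_type):
        self.records.append({"picture": picture, "name": name, "type": sample_type})

    def fetch_all(self):
        return list(self.records)


class SignatureRecognitionSubsystem:
    """Stores a picture as a first-type sample when the user reports an error."""

    def __init__(self, library):
        self.library = library

    def on_feedback(self, picture, result_is_wrong, actual_name=None):
        if result_is_wrong:
            self.library.store(picture, actual_name, "first")


class ModelTrainingSubsystem:
    """Pulls samples from the library and hands them to an update routine."""

    def __init__(self, library, extract_text_line, update_model):
        self.library = library
        self.extract_text_line = extract_text_line  # e.g. steps A1-A2
        self.update_model = update_model            # e.g. step S103

    def run_update(self):
        batch = [(self.extract_text_line(record["picture"]), record["name"])
                 for record in self.library.fetch_all()]
        if batch:
            self.update_model(batch)
        return len(batch)


library = SampleLibrary()
recognizer = SignatureRecognitionSubsystem(library)
recognizer.on_feedback("pic-1", result_is_wrong=True, actual_name="王伟")
recognizer.on_feedback("pic-2", result_is_wrong=False)

received = []
trainer = ModelTrainingSubsystem(library, lambda p: p + "-line", received.extend)
updated_count = trainer.run_update()
```

Only the picture whose recognition result was reported as wrong reaches the trainer; a picture recognized correctly never enters the library in this sketch.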

According to an embodiment of the present disclosure, the present disclosure also provides a signature recognition model training apparatus. As shown in FIG. 4, the apparatus includes:

A first acquisition module 410, configured to obtain a sample to be utilized and the corresponding name annotation information from a predetermined sample library; wherein the sample library includes at least a first type of sample and the corresponding name annotation information; the first type of sample is a signature picture for which the recognition result was reported as incorrect when signature recognition was performed based on the signature recognition model;

A second acquisition module 420, configured to acquire the text line picture corresponding to the sample to be utilized, wherein the text line picture is the signature area in the sample to be utilized;

A training module 430, configured to perform update training on the signature recognition model based on the text line picture and the name annotation information corresponding to the sample to be utilized.

Optionally, the sample library further includes a second type of sample and the corresponding name annotation information; wherein the second type of sample is a signature picture, relating to a predetermined signature scene, that was collected within a predetermined time period.

Optionally, the second acquisition module 420 includes:

A detection submodule, configured to perform signature area detection on the sample to be utilized to obtain a detection result;

An extraction submodule, configured to extract, based on the detection result, the signature area from the sample to be utilized as the text line picture corresponding to the sample to be utilized.

Optionally, the detection submodule performing signature area detection on the sample to be utilized to obtain a detection result includes:

Binarizing the sample to be utilized and determining, in the binarized sample, the minimum circumscribed rectangle containing each text area, to obtain the detection result for the signature area;

Or,

Detecting the signature area in the sample to be utilized based on a pre-trained text detection model for detecting signature areas, to obtain the detection result.

Optionally, there are multiple samples to be utilized; the apparatus further includes:

A statistics module, configured to count the occurrence frequency of each character in the name annotation information corresponding to the samples to be utilized;

A detection module, configured to detect target characters whose occurrence frequency is lower than a predetermined frequency threshold;

A determination module, configured to determine the samples to be utilized whose corresponding name annotation information contains a target character, as specified samples;

An augmentation module, configured to perform data augmentation on the text line picture corresponding to each specified sample to obtain a target picture;

The training module is specifically configured to:

Perform update training on the signature recognition model based on the text line pictures and name annotation information corresponding to the samples to be utilized, together with the target pictures and their corresponding name annotation information;

Wherein the name annotation information corresponding to a target picture is the name annotation information corresponding to the specified sample.

According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium, and a computer program product.

An electronic device provided by the present disclosure may include:

At least one processor; and

A memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the steps of the signature recognition model training method described above.

The present disclosure provides a computer-readable storage medium in which a computer program is stored; when executed by a processor, the computer program implements the steps of the signature recognition model training method described above.

In yet another embodiment provided by the present disclosure, a computer program product containing instructions is also provided which, when run on a computer, causes the computer to perform the steps of the signature recognition model training method of the above embodiments.

FIG. 5 shows a schematic block diagram of an example electronic device 500 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are examples only and are not intended to limit the implementations of the present disclosure described and/or claimed herein.

As shown in FIG. 5, the device 500 includes a computing unit 501 that can perform various appropriate actions and processes according to a computer program stored in a read-only memory (ROM) 502 or a computer program loaded from a storage unit 508 into a random access memory (RAM) 503. The RAM 503 can also store various programs and data required for the operation of the device 500. The computing unit 501, the ROM 502, and the RAM 503 are connected to each other through a bus 504. An input/output (I/O) interface 505 is also connected to the bus 504.

Multiple components in the device 500 are connected to the I/O interface 505, including: an input unit 506, such as a keyboard or a mouse; an output unit 507, such as various types of displays and speakers; a storage unit 508, such as a magnetic disk or an optical disc; and a communication unit 509, such as a network card, a modem, or a wireless communication transceiver. The communication unit 509 allows the device 500 to exchange information/data with other devices through a computer network such as the Internet and/or various telecommunication networks.

The computing unit 501 may be any of various general-purpose and/or special-purpose processing components with processing and computing capabilities. Some examples of the computing unit 501 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various computing units that run machine learning model algorithms, a digital signal processor (DSP), and any suitable processor, controller, or microcontroller. The computing unit 501 performs the methods and processes described above. For example, in some embodiments, the signature recognition model training method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 508. In some embodiments, part or all of the computer program may be loaded and/or installed on the device 500 via the ROM 502 and/or the communication unit 509. When the computer program is loaded into the RAM 503 and executed by the computing unit 501, one or more steps of the signature recognition model training method described above may be performed. Alternatively, in other embodiments, the computing unit 501 may be configured to perform the signature recognition model training method in any other suitable manner (for example, by means of firmware).

Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuit systems, field programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on chip (SoCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor. The programmable processor may be a special-purpose or general-purpose programmable processor, and may receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device.

Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus, so that, when executed by the processor or controller, the program code causes the functions/operations specified in the flowcharts and/or block diagrams to be carried out. The program code may execute entirely on a machine, partly on a machine, partly on a machine and partly on a remote machine as a stand-alone software package, or entirely on a remote machine or server.

In the context of the present disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by, or in connection with, an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of machine-readable storage media include an electrical connection based on one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

To provide interaction with a user, the systems and techniques described here may be implemented on a computer having: a display device (for example, a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user; and a keyboard and a pointing device (for example, a mouse or a trackball) through which the user can provide input to the computer. Other kinds of devices may also be used to provide interaction with the user; for example, the feedback provided to the user may be any form of sensory feedback (for example, visual, auditory, or tactile feedback), and input from the user may be received in any form (including acoustic, voice, or tactile input).

The systems and techniques described here may be implemented in a computing system that includes a back-end component (for example, a data server), or that includes a middleware component (for example, an application server), or that includes a front-end component (for example, a user computer with a graphical user interface or a web browser through which the user can interact with implementations of the systems and techniques described here), or in a computing system that includes any combination of such back-end, middleware, or front-end components. The components of the system may be interconnected by any form or medium of digital data communication (for example, a communication network). Examples of communication networks include a local area network (LAN), a wide area network (WAN), and the Internet.

A computer system may include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network. The client-server relationship arises from computer programs that run on the respective computers and have a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server combined with a blockchain.

It should be understood that the various forms of flow shown above may be used with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in a different order, as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved; no limitation is imposed herein.

The above specific embodiments do not limit the protection scope of the present disclosure. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions may be made depending on design requirements and other factors. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present disclosure shall fall within the protection scope of the present disclosure.

Claims (14)

1. A signature recognition model training method, the method comprising:
obtaining a sample to be utilized and corresponding name labeling information from a preset sample library, wherein the sample library comprises a first type of sample and corresponding name labeling information, and the first type of sample is a signature picture for which the recognition result obtained by signature recognition based on the signature recognition model was fed back as incorrect;
acquiring a text line picture corresponding to the sample to be utilized, wherein the text line picture is the signature area in the sample to be utilized;
and performing update training on the signature recognition model based on the text line picture and the name labeling information corresponding to the sample to be utilized.
2. The method of claim 1, wherein the sample library further comprises a second type of sample and corresponding name labeling information; wherein the second type of sample is a signature picture generated in a predetermined time period and related to a predetermined signature scene.
3. The method according to claim 1 or 2, wherein acquiring the text line picture corresponding to the sample to be utilized comprises:
performing signature area detection on the sample to be utilized to obtain a detection result;
and extracting, based on the detection result, the signature area from the sample to be utilized as the text line picture corresponding to the sample to be utilized.
4. The method of claim 3, wherein performing signature area detection on the sample to be utilized to obtain a detection result comprises:
binarizing the sample to be utilized, and determining the minimum circumscribed rectangle of each character area in the binarized sample to obtain the detection result of the signature area;
or,
detecting the signature area in the sample to be utilized based on a character detection model pre-trained for detecting signature areas, to obtain the detection result.
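The first alternative of claim 4 (binarization followed by a bounding rectangle per character area) can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the patent: the functions `binarize`, `character_boxes`, `signature_region`, and `crop`, the fixed threshold of 128, the use of 4-connected components as "character areas", and the choice of axis-aligned rather than rotated rectangles. Merging the per-character boxes into one signature area is likewise an assumption about how the detection result might feed the extraction step of claim 3.

```python
from collections import deque


def binarize(gray, threshold=128):
    """Mark pixels darker than `threshold` as ink (1), the rest as background (0)."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def character_boxes(binary):
    """Return one inclusive (top, left, bottom, right) box per 4-connected ink
    region, in row-major order of discovery."""
    height = len(binary)
    width = len(binary[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for r in range(height):
        for c in range(width):
            if not binary[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill over one connected ink region.
            top, left, bottom, right = r, c, r, c
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def signature_region(boxes):
    """Merge per-character boxes into one enclosing box; None if there is no ink."""
    if not boxes:
        return None
    return (min(b[0] for b in boxes), min(b[1] for b in boxes),
            max(b[2] for b in boxes), max(b[3] for b in boxes))


def crop(gray, box):
    """Cut the inclusive box out of the grayscale picture (the text line picture)."""
    top, left, bottom, right = box
    return [row[left:right + 1] for row in gray[top:bottom + 1]]
```

A real pipeline would more likely use an image library and an adaptive threshold; the pure-Python version is only meant to make the data flow from sample to text line picture concrete.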
5. The method according to claim 1 or 2, wherein there are a plurality of samples to be utilized; before performing update training on the signature recognition model based on the text line picture and the name labeling information corresponding to the sample to be utilized, the method further comprises:
counting the occurrence frequency of each character in the name labeling information corresponding to the samples to be utilized;
detecting target characters whose occurrence frequency is lower than a preset frequency threshold;
determining each sample to be utilized whose corresponding name labeling information contains a target character as a sample to be processed;
performing data enhancement processing on the text line picture corresponding to the sample to be processed to obtain a target sample;
wherein performing update training on the signature recognition model based on the text line picture and the name labeling information corresponding to the sample to be utilized comprises:
performing update training on the signature recognition model based on the text line picture and the name labeling information corresponding to the sample to be utilized, together with the target sample and its corresponding name labeling information;
wherein the name labeling information corresponding to the target sample is the name labeling information corresponding to the sample to be processed.
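The rare-character selection in claim 5 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: "occurrence frequency" is interpreted as a raw count across all labels, pictures are plain lists of pixel rows, and the augmentation (`pad_with_background`, which adds a white border) is a hypothetical stand-in, since the claim does not name a specific data enhancement. All function and parameter names are invented for the example.

```python
from collections import Counter


def rare_characters(labels, min_count):
    """Characters that occur fewer than `min_count` times across all name labels."""
    counts = Counter(ch for name in labels for ch in name)
    return {ch for ch, n in counts.items() if n < min_count}


def select_samples_to_process(samples, min_count):
    """Pick the (picture, name) pairs whose name contains at least one rare character."""
    rare = rare_characters([name for _, name in samples], min_count)
    return [(pic, name) for pic, name in samples if rare & set(name)]


def pad_with_background(picture, margin=1, background=255):
    """Hypothetical augmentation: surround the picture with a background border."""
    width = len(picture[0]) + 2 * margin
    side = [background] * margin
    body = [side + list(row) + side for row in picture]
    border = [[background] * width for _ in range(margin)]
    return border + body + [[background] * width for _ in range(margin)]


def build_training_set(samples, min_count, augment=pad_with_background):
    """Original samples plus augmented copies that reuse the original name label."""
    targets = [(augment(pic), name)
               for pic, name in select_samples_to_process(samples, min_count)]
    return list(samples) + targets
```

The augmented copy keeps the label of the sample it was derived from, which mirrors the last limitation of the claim.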
6. An information processing system, comprising a signature recognition subsystem and a model training subsystem;
the signature recognition subsystem is configured to, upon receiving feedback from a user that the recognition result for a picture to be recognized is incorrect, acquire the name in the picture to be recognized, take the picture to be recognized as a first type of sample, and store it in a preset sample library with the name as the corresponding name labeling information; wherein the recognition result is obtained by recognizing the picture to be recognized based on a pre-trained signature recognition model;
the model training subsystem is configured to obtain a sample to be utilized and corresponding name labeling information from the preset sample library; acquire a text line picture corresponding to the sample to be utilized; and perform update training on the signature recognition model based on the text line picture and the name labeling information corresponding to the sample to be utilized.
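The division of labor in claim 6 amounts to a feedback loop around a shared sample library. The sketch below shows only that data flow; `SampleLibrary`, `handle_feedback`, and their fields are hypothetical names, the storage is an in-memory list, and how the correct name is obtained (here passed in as `true_name`) is an assumption, since the claim only says the name is acquired.

```python
from dataclasses import dataclass, field


@dataclass
class SampleLibrary:
    """In-memory stand-in for the preset sample library."""
    entries: list = field(default_factory=list)

    def add(self, picture, name, sample_type="first"):
        self.entries.append({"picture": picture, "name": name, "type": sample_type})

    def fetch(self):
        """Samples to be utilized, as (picture, name labeling information) pairs."""
        return [(e["picture"], e["name"]) for e in self.entries]


def handle_feedback(library, picture, result_is_correct, true_name=None):
    """Recognition-side hook: store the picture as a first-type sample only when
    the user reports a wrong result and a corrected name is available."""
    if result_is_correct or not true_name:
        return False
    library.add(picture, true_name)
    return True
```

The training side would then call `library.fetch()`, extract text line pictures, and run update training on the returned pairs.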
7. A signature recognition model training apparatus, the apparatus comprising:
a first obtaining module, configured to obtain a sample to be utilized and corresponding name labeling information from a preset sample library, wherein the sample library comprises at least a first type of sample and corresponding name labeling information, and the first type of sample is a signature picture for which the recognition result obtained by signature recognition based on the signature recognition model was fed back as incorrect;
a second obtaining module, configured to acquire a text line picture corresponding to the sample to be utilized, wherein the text line picture is the signature area in the sample to be utilized;
and a training module, configured to perform update training on the signature recognition model based on the text line picture and the name labeling information corresponding to the sample to be utilized.
8. The apparatus of claim 7, wherein the sample library further comprises a second type of sample and corresponding name labeling information; wherein the second type of sample is a signature picture generated in a predetermined time period and related to a predetermined signature scene.
9. The apparatus of claim 7 or 8, wherein the second obtaining module comprises:
a detection submodule, configured to perform signature area detection on the sample to be utilized to obtain a detection result;
and an extraction submodule, configured to extract, based on the detection result, the signature area from the sample to be utilized as the text line picture corresponding to the sample to be utilized.
10. The apparatus of claim 9, wherein the detection submodule is specifically configured to:
binarize the sample to be utilized, and determine the minimum circumscribed rectangle of each character area in the binarized sample to obtain the detection result of the signature area;
or,
detect the signature area in the sample to be utilized based on a character detection model pre-trained for detecting signature areas, to obtain the detection result.
11. The apparatus according to claim 7 or 8, wherein there are a plurality of samples to be utilized; the apparatus further comprises:
a statistical module, configured to count the occurrence frequency of each character in the name labeling information corresponding to the samples to be utilized;
a detection module, configured to detect target characters whose occurrence frequency is lower than a preset frequency threshold;
a determining module, configured to determine each sample to be utilized whose corresponding name labeling information contains a target character as a specified sample;
an enhancement module, configured to perform data enhancement processing on the text line picture corresponding to the specified sample to obtain a target picture;
the training module is specifically configured to:
perform update training on the signature recognition model based on the text line picture and the name labeling information corresponding to the sample to be utilized, together with the target picture and its corresponding name labeling information;
wherein the name labeling information corresponding to the target picture is the name labeling information corresponding to the specified sample.
12. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-5.
13. A non-transitory computer-readable storage medium having stored thereon computer instructions for causing a computer to perform the method of any one of claims 1-5.
14. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-5.
CN202111345986.2A 2021-11-15 2021-11-15 Signature recognition model training method and device and electronic equipment Pending CN114049686A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111345986.2A CN114049686A (en) 2021-11-15 2021-11-15 Signature recognition model training method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111345986.2A CN114049686A (en) 2021-11-15 2021-11-15 Signature recognition model training method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN114049686A true CN114049686A (en) 2022-02-15

Family

ID=80208948

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111345986.2A Pending CN114049686A (en) 2021-11-15 2021-11-15 Signature recognition model training method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN114049686A (en)

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180060704A1 (en) * 2016-08-30 2018-03-01 Baidu Online Network Technology (Beijing) Co., Ltd. Method And Apparatus For Image Character Recognition Model Generation, And Vertically-Oriented Character Image Recognition
JP2019528520A (en) * 2016-08-31 2019-10-10 富士通株式会社 Classification network training apparatus, character recognition apparatus and method for character recognition
CN110135414A (en) * 2019-05-16 2019-08-16 京北方信息技术股份有限公司 Corpus update method, device, storage medium and terminal
CN111626279A (en) * 2019-10-15 2020-09-04 西安网算数据科技有限公司 Negative sample labeling training method and highly-automated bill identification method
CN111444906A (en) * 2020-03-24 2020-07-24 腾讯科技(深圳)有限公司 Image recognition method based on artificial intelligence and related device
CN111461238A (en) * 2020-04-03 2020-07-28 讯飞智元信息科技有限公司 Model training method, character recognition method, device, equipment and storage medium
CN112200312A (en) * 2020-09-10 2021-01-08 北京达佳互联信息技术有限公司 Method and device for training character recognition model and storage medium
CN112418304A (en) * 2020-11-19 2021-02-26 北京云从科技有限公司 OCR (optical character recognition) model training method, system and device
CN113313022A (en) * 2021-05-27 2021-08-27 北京百度网讯科技有限公司 Training method of character recognition model and method for recognizing characters in image
CN113378833A (en) * 2021-06-25 2021-09-10 北京百度网讯科技有限公司 Image recognition model training method, image recognition device and electronic equipment
CN113569916A (en) * 2021-06-30 2021-10-29 佛山喀视科技有限公司 Tile label recognition model training method and tile label recognition method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
贾建忠;: "基于小波变换和CPN网络的手写签名鉴别", 计算机与现代化, no. 07, 15 July 2020 (2020-07-15) *
贾真;冶忠林;尹红风;何大可;: "基于Tri-training与噪声过滤的弱监督关系抽取", 中文信息学报, no. 04, 15 July 2016 (2016-07-15) *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115860667A (en) * 2022-11-28 2023-03-28 中国银行股份有限公司 Order signing method and device
CN116305076A (en) * 2023-03-30 2023-06-23 重庆傲雄在线信息技术有限公司 Signature-based identity information registration sample online updating method, system and storage medium
CN116305076B (en) * 2023-03-30 2024-03-08 重庆亲笔签数字科技有限公司 Signature-based identity information registration sample online updating method, system and storage medium
CN117132996A (en) * 2023-08-28 2023-11-28 中国移动紫金(江苏)创新研究院有限公司 Work order signing behavior detection method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
CN113705554A (en) Training method, device and equipment of image recognition model and storage medium
CN114049686A (en) Signature recognition model training method and device and electronic equipment
CN113780098B (en) Character recognition method, character recognition device, electronic equipment and storage medium
CN113627439A (en) Text structuring method, processing device, electronic device and storage medium
CN113239807B (en) Methods and devices for training bill recognition models and bill recognition
CN112861841B (en) Training method, device, electronic equipment and storage medium of bill confidence value model
CN114187448A (en) Document image recognition method and apparatus, electronic device, computer readable medium
CN114092948B (en) A bill identification method, device, equipment and storage medium
CN112560754A (en) Bill information acquisition method, device, equipment and storage medium
CN113033431A (en) Optical character recognition model training and recognition method, device, equipment and medium
CN113610809A (en) Fracture detection method, fracture detection device, electronic device, and storage medium
CN113420174B (en) Difficult sample mining method, device, equipment and storage medium
CN114494751A (en) License information identification method, device, equipment and medium
CN110782390A (en) Image correction processing method and device, electronic equipment
CN114596188A (en) Watermark detection method, model training method, device and electronic equipment
CN114445825A (en) Character detection method, device, electronic device and storage medium
CN112528610A (en) Data labeling method and device, electronic equipment and storage medium
CN116311296A (en) A picture recognition method, device, equipment and storage medium
CN115934928A (en) Information extraction method, device, equipment and storage medium
CN114924959A (en) Page testing method and device, electronic equipment and medium
CN114419636A (en) Text recognition method, device, equipment and storage medium
CN113762109A (en) Training method of character positioning model and character positioning method
CN113887441B (en) Table data processing method, device, equipment and storage medium
CN114863455B (en) Method and apparatus for extracting information
CN116543396A (en) Document processing method, device, equipment and storage medium

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
AD01 Patent right deemed abandoned (effective date of abandoning: 20250207)